Lab41 / soft-boiled

Library for Geo-Inferencing in Twitter Data

Refactor gmm.predict_user to improve performance #29

Closed: ymt123 closed this 9 years ago

ymt123 commented 9 years ago

Reorder the operations in gmm.predict_user to improve performance. Specifically, moving the tweet tokenization ahead of the first group-by dramatically reduces the amount of data that must be shuffled.
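To illustrate why the ordering matters, here is a minimal pure-Python sketch, not the actual gmm.predict_user code. It assumes that tokenization also discards words the model has no estimate for, so each record is smaller after tokenizing. `MODEL_VOCAB`, `tokenize`, `group_by_key`, and the sample tweets are all hypothetical names and data invented for the illustration; `group_by_key` is only a stand-in for a distributed group-by and counts the characters that would cross the shuffle.

```python
from collections import defaultdict

# Hypothetical vocabulary: words the fitted model is assumed to know about.
MODEL_VOCAB = {"boston", "fenway", "seattle"}


def tokenize(text):
    """Lowercase, split on whitespace, and keep only words in MODEL_VOCAB."""
    return [w for w in text.lower().split() if w in MODEL_VOCAB]


def payload_size(value):
    """Characters in a record: a raw string, or a list of token strings."""
    if isinstance(value, str):
        return len(value)
    return sum(len(token) for token in value)


def group_by_key(pairs):
    """Stand-in for a distributed group-by.

    Returns the grouped values and the total payload that would be shuffled.
    """
    groups = defaultdict(list)
    moved = 0
    for key, value in pairs:
        moved += payload_size(value)
        groups[key].append(value)
    return dict(groups), moved


def group_then_tokenize(tweets):
    """Original ordering: the full tweet text crosses the shuffle."""
    groups, moved = group_by_key(tweets)
    tokens = {
        user: [t for text in texts for t in tokenize(text)]
        for user, texts in groups.items()
    }
    return tokens, moved


def tokenize_then_group(tweets):
    """Reordered: only the retained tokens cross the shuffle."""
    pairs = [(user, tokenize(text)) for user, text in tweets]
    groups, moved = group_by_key(pairs)
    tokens = {
        user: [t for toks in token_lists for t in toks]
        for user, token_lists in groups.items()
    }
    return tokens, moved


# Made-up sample data for the illustration.
TWEETS = [
    ("u1", "Watching the game at Fenway tonight"),
    ("u1", "I love Boston in the fall"),
    ("u2", "Rainy day in Seattle again"),
]

if __name__ == "__main__":
    before_tokens, before_moved = group_then_tokenize(TWEETS)
    after_tokens, after_moved = tokenize_then_group(TWEETS)
    print("same result:", before_tokens == after_tokens)
    print("payload shuffled (group first):", before_moved)
    print("payload shuffled (tokenize first):", after_moved)
```

Both orderings yield the same per-user tokens; only the size of the records passing through the group-by differs. In a real Spark job the saving would depend on how much of each tweet the tokenizer actually discards.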