fajifr / recontent

Content recommendation API
MIT License

Building a subsampled corpus from the arXiv bulk PDFs #2

Open jiabin-liu opened 8 years ago

jiabin-liu commented 8 years ago

Have a subsampled toy corpus up by Monday. After that, run a map-reduce job over the whole arXiv database to produce the complete corpus.
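
A rough sketch of how the two steps might look, not the project's actual pipeline. It assumes the bulk PDFs have already been converted to plain text. All names here (`subsample`, `map_doc`, `reduce_counts`, `corpus_counts`) are hypothetical. Reservoir sampling is one possible way to draw the toy corpus in a single pass, and the token count stands in for whatever per-document work the real map step would do.

```python
import random
from collections import Counter
from functools import reduce


def subsample(paths, k, seed=0):
    """Reservoir-sample up to k items from an iterable of paths in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, path in enumerate(paths):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(path)
        else:
            # Keep item i with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = path
    return reservoir


def map_doc(text):
    """Map step (placeholder): token counts for one document."""
    return Counter(text.lower().split())


def reduce_counts(a, b):
    """Reduce step: merge two partial counts."""
    return a + b


def corpus_counts(texts):
    """Run the map and reduce steps serially over an iterable of texts."""
    return reduce(reduce_counts, map(map_doc, texts), Counter())
```

Because the reduce step is associative, the same `map_doc` / `reduce_counts` pair could in principle be handed to a real map-reduce framework once the toy corpus works end to end.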