HazyResearch / bazaar


Auto-calculate the batch size? #7

Closed ajratner closed 9 years ago

ajratner commented 9 years ago

@raphaelhoffmann Especially since CoreNLP's load time is (relatively) long, a better default than batch_size=1000 would be to divide the number of lines by the number of cores (or cores*nodes for distributed mode). For example, I just got a 2x speedup on an EC2 node really easily here (had forgotten to set the batch size the first time around...). Thoughts?
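
For concreteness, here is a minimal sketch of what that default could look like; the function name, signature, and node/core parameters are hypothetical, not the actual fabfile code:

```python
import multiprocessing


def auto_batch_size(num_lines, num_nodes=1, cores_per_node=None):
    """Hypothetical helper: split the input evenly across all workers so
    each core processes one batch, amortizing the CoreNLP startup cost."""
    if cores_per_node is None:
        cores_per_node = multiprocessing.cpu_count()
    workers = max(1, num_nodes * cores_per_node)
    # Ceiling division so every line lands in some batch.
    return max(1, (num_lines + workers - 1) // workers)


# Example: 1,000,000 lines on a single 16-core node -> batches of 62,500
print(auto_batch_size(1_000_000, num_nodes=1, cores_per_node=16))
```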

raphaelhoffmann commented 9 years ago

That's a great idea! Please check this in.

ajratner commented 9 years ago

Sure, will do tomorrow


ajratner commented 9 years ago

Will push my new wrapper functions (in the fabfile) when done processing everything...