The idea here is to pose a challenge text analysis task, distributed as a framework .Rmd or .ipynb file with an associated dataset. (I propose the Large Movie Review Dataset from http://ai.stanford.edu/~amaas/data/sentiment/.)
The purpose is to gauge how complex the task is to carry out with a given tool, as well as how the tool performs.
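As a rough sketch of the starting point (assuming the dataset has been downloaded and unpacked into an `aclImdb/` directory with `train/pos` and `train/neg` subfolders, as in the Stanford distribution), a submission might load the reviews along these lines:

```python
from pathlib import Path

def load_reviews(split_dir):
    """Read all reviews under a split directory (e.g. aclImdb/train)
    and return parallel lists of texts and labels (1 = pos, 0 = neg)."""
    texts, labels = [], []
    for label_name, label in (("pos", 1), ("neg", 0)):
        for path in sorted(Path(split_dir, label_name).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels

train_texts, train_labels = load_reviews("aclImdb/train")
print(len(train_texts), "training reviews loaded")
```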
The files would be named
taskname_tool_submittername.Rmd/ipynb
and would report the timing of each stage as output, plus write a table of the task's timings to a common file.
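As a minimal sketch of that bookkeeping (the stage names, the `timings.csv` filename, and the column layout are placeholders rather than a fixed specification), each submission could wrap its stages in a small timer that prints the elapsed time and appends a row to the common file:

```python
import csv
import time
from contextlib import contextmanager
from pathlib import Path

TIMINGS_FILE = Path("timings.csv")  # hypothetical common results file
TASK, TOOL, SUBMITTER = "imdb-sentiment", "some-tool", "someone"  # placeholders

@contextmanager
def timed(stage):
    """Time one stage, print the elapsed seconds, and append a row to the common file."""
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"{stage}: {elapsed:.2f}s")
    new_file = not TIMINGS_FILE.exists()
    with TIMINGS_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["task", "tool", "submitter", "stage", "seconds"])
        writer.writerow([TASK, TOOL, SUBMITTER, stage, f"{elapsed:.4f}"])

with timed("load"):
    pass  # e.g. read the reviews from disk
with timed("train"):
    pass  # e.g. fit a sentiment classifier
```

A real submission would replace the `pass` placeholders with the actual stages of its analysis.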
Each participant could submit any number of trials based on any number of different tools, and would knit each submission into a .md file or, in the Jupyter case, simply execute all the code in the notebook.
The week before the contest, someone would re-run all of the files on a single machine and push the results back to the repo, so that the performance comparisons would take place on the same hardware.
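One possible shape for that re-run step (the `submissions/` directory name is an assumption; the commands are the standard `rmarkdown::render` and `jupyter nbconvert` invocations) is a small driver script like this:

```python
import subprocess
from pathlib import Path

SUBMISSIONS = Path("submissions")  # hypothetical location of all submitted files

for rmd in sorted(SUBMISSIONS.glob("*.Rmd")):
    # Knit each R Markdown submission to Markdown.
    subprocess.run(
        ["Rscript", "-e", f'rmarkdown::render("{rmd}", output_format = "md_document")'],
        check=True,
    )

for nb in sorted(SUBMISSIONS.glob("*.ipynb")):
    # Execute each Jupyter submission in place.
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)],
        check=True,
    )
```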
We could consider options for parallelization - say, single-thread versus all available threads - and perhaps run each trial multiple times and average the timings.
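For the repeated-trials idea, a sketch along these lines might work; the three-repeat count is arbitrary, and `OMP_NUM_THREADS` only constrains libraries that honor it, so each tool's own threading controls would need checking:

```python
import os
import statistics
import time

os.environ["OMP_NUM_THREADS"] = "1"  # illustrative: pin libraries that honor it to one thread

def average_timing(run_stage, repeats=3):
    """Run a stage several times and return the mean elapsed seconds."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_stage()
        elapsed.append(time.perf_counter() - start)
    return statistics.mean(elapsed)

# Toy stand-in for a real stage, just to show the harness in action.
mean_seconds = average_timing(lambda: sum(i * i for i in range(1_000_000)))
print(f"mean over 3 runs: {mean_seconds:.3f}s")
```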