
needs to comment on the need to make things faster #2

Closed timm closed 6 years ago

timm commented 6 years ago

you should have an invite from https://www.sharelatex.com/project/587fea8b93554fb1518c4b79

that paper contains text like the following. you can't use it verbatim, but something along these lines would better motivate the work

and that better motivation has to be in the abstract, intro, section 2, and conclusion

"While deep learning is an exciting new technique, the benefits ofthis method need to be assessed with respect to its computationalcost. This is particularly important for deep learning since theselearners need hours (to weeks) to train the model. Such long trainingtime limits the ability of (a) a researcher to test the stability of theirconclusion via repeated runs with different random seeds; and(b) other researchers to repeat, improve, or even refute that originalwork."

"This paper debates what methods should be recommended tothose wishing to repeat the analysis of XU. We focus on whetherusing simple and faster methods can achieve the results that are cur-rently achievable by the state-of-art deep learning method. Specifi-cally, we ..."

"We offer these results as a cautionary tale to the software analyt-ics community. While deep learning is an exciting new technique,the benefits of this method need to be carefully assessed with re-spect to its computational cost. More generally, if researchers deploysome new and expensive process (like deep learning), that workshould be baselined against some simpler and faster alternatives"

This section argues that avoiding slow methods for software analytics is an open and urgent issue.

Researchers and industrial practitioners now routinely make extensive use of software analytics to discover (e.g.) how long it will take to integrate the new code [17], where bugs are most likely to occur [54], who should fix the bug [2], or how long it will take to develop their code [34, 35, 50]. Large organizations like Microsoft routinely practice data-driven policy development where organizational policies are learned from an extensive analysis of large data sets collected from developers [7, 65].

But the more complex the method, the harder it is to apply the analysis. Fisher et al. [20] characterize software analytics as a workflow that distills large quantities of low-value data down to smaller sets of higher value data. Due to the complexities and computational cost of SE analytics, "the luxuries of interactivity, direct manipulation, and fast system response are gone" [20]. They characterize modern cloud-based analytics as a throwback to the 1960s batch-processing mainframes where jobs are submitted and then analysts wait, wait, and wait for results with "little insight into what is really going on behind the scenes, how long it will take, or how much it is going to cost" [20]. Fisher et al. [20] document the issues seen by 16 industrial data scientists, one of whom remarks "Fast iteration is key, but incompatible with the way jobs are submitted and processed in the cloud. It is frustrating to wait for hours, only to realize you need a slight tweak to your feature set".

Methods for improving the quality of modern software analytics have made this issue even more serious. There has been continuous development of new feature selection [25] and feature discovering [28] techniques for software analytics, with the most recent ones focused on deep learning methods. These are all exciting innovations with the potential to dramatically improve the quality of our software analytics tools. Yet these are all CPU/GPU-intensive methods. For instance:

• Learning control settings for learners can take days to weeks to years of CPU time [22, 64, 69].
• Lam et al. needed weeks of CPU time to combine deep learning and text mining to localize buggy files from bug reports [39].
• Gu et al. spent 240 hours of GPU time to train a deep learning based method to generate API usage sequences for a given natural language query [24].

Note that the above problem is not solvable by waiting for faster CPUs/GPUs. We can no longer rely on Moore's Law [51] to double our computational power every 18 months. Power consumption and heat dissipation issues effectively block further exponential increases to CPU clock frequencies [38]. Cloud computing environments are extensively monetized, so the total financial cost of training models can be prohibitive, particularly for long-running tasks. For example, it would take 15 years of CPU time to learn the tuning parameters of the software clone detectors proposed in [69]. Much of that CPU time can be saved if there is a faster way.

Suvodeep90 commented 6 years ago

Added a subsection named "Why we need faster models" to section 2.