codota / codota-plugin-intellij

Hub for all issues related to the Codota plugin for the JetBrains platforms: IntelliJ, Android Studio, WebStorm & PhpStorm

Let's massively improve codota accuracy #3

Open LifeIsStrange opened 4 years ago

LifeIsStrange commented 4 years ago

I didn't know where to open an issue about improving the algorithm™ used by Codota/TabNine, so I chose this repo.

First of all, I find your product very promising, but it must produce a lot of false positives and false negatives. Luckily, state-of-the-art accuracy on AI tasks improves significantly every year!

I believe the main AI task you need to solve (correct me if I'm wrong) is language modeling, i.e. predicting the next tokens: https://paperswithcode.com/task/language-modelling In 2020, a new neural network called SMIM achieved a revolutionary accuracy improvement, and that is no exaggeration!
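To make "predicting the next tokens" concrete, here is a minimal sketch of a bigram next-token predictor over code tokens. It is only a toy illustration of the task, not how Codota, TabNine, GPT or SMIM work; the function names `train_bigram` and `predict_next` and the sample token stream are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for every token, which tokens follow it and how often."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev, k=3):
    """Return up to k most frequent followers of `prev`, best first."""
    followers = counts.get(prev, Counter())
    return [tok for tok, _ in followers.most_common(k)]


# Hypothetical, pre-tokenized snippet of Java-like code.
tokens = "list . add ( x ) ; list . add ( y ) ; list . size ( ) ;".split()
model = train_bigram(tokens)

# Suggestions after the "." token, ranked by how often they were seen.
print(predict_next(model, "."))
```

A real completion model conditions on far more context than the single previous token, but the interface is the same: given what came before, rank candidate next tokens.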

(at least on this reference benchmark: https://paperswithcode.com/sota/language-modelling-on-penn-treebank-word) That is, text perplexity (lower is better) drops from 35.76 to 4.6!!! And with an order of magnitude fewer parameters than GPT-2!!
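For readers unfamiliar with the metric: perplexity is the exponential of the average negative log-probability the model assigns to each actual next token, so lower means the model is less "surprised". A minimal sketch, assuming we already have the per-token probabilities (the `perplexity` helper and the sample probabilities below are illustrative, not taken from the benchmark):

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the true tokens)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical model that always gives the true token probability 1/4:
# it is as uncertain as a uniform choice among 4 options.
uniform_over_4 = perplexity([0.25] * 8)

# Hypothetical model that is always certain and always right.
perfect = perplexity([1.0] * 8)

print(uniform_over_4, perfect)
```

Read this way, a perplexity of 35.76 roughly corresponds to hesitating among about 36 equally likely words at each step, and 4.6 to hesitating among fewer than 5.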

I believe that TabNine (maybe even Codota) uses GPT (1 or 2?). With your expertise and experimental trials, you can determine how much of those improvements carries over to code completion.

BTW, AI would be so much better off if paperswithcode.com (the reference for state-of-the-art leaderboards on AI tasks) were more widely known!

Let me know what you think :}

avichay77 commented 4 years ago

Thanks for these ideas. We are constantly working on improving the model and the plugin. Any specific examples, with screenshots etc., are welcome.