EleutherAI / gpt-neo

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
https://www.eleuther.ai
MIT License

Anyone want to collaborate on tabnine-style auto completion? #209

Closed · traverseda closed this 2 years ago

traverseda commented 3 years ago

TabNine was the best autocomplete I ever used, but it's closed source and increasingly expensive. Creating something similar would be a very large project. I'm available to help out a little bit, particularly with data preparation, and probably with langserver development.

This seems like a good place to coordinate and find other people interested in doing that, so hopefully my opening an issue here isn't a problem.

wolfgangmeyers commented 3 years ago

This would probably need to be a cloud-based service; I think the latency of running it locally on a CPU would be too high to come close to TabNine's response time.

traverseda commented 3 years ago

That's unfortunate. I know TabNine manages to run completely locally, although it of course has a smaller model. Latency from a cloud service would probably also be pretty bad.

> Tabnine learns common code idioms and patterns by training powerful ML models on code. Our most powerful models use over 380M parameters and are an extension of GPT-2 specialized for code (combining syntactic and semantic information).
>
> Tabnine defaults to a local configuration. Models are downloaded to your local development machine only - the code never leaves your machine.

Well, maybe the performance will improve at some point in the future. Would using a similarly sized model help with latency?

wolfgangmeyers commented 3 years ago

> That's unfortunate. I know TabNine manages to run completely locally, although it of course has a smaller model. Latency from a cloud service would probably also be pretty bad.
>
> > Tabnine learns common code idioms and patterns by training powerful ML models on code. Our most powerful models use over 380M parameters and are an extension of GPT-2 specialized for code (combining syntactic and semantic information).
> >
> > Tabnine defaults to a local configuration. Models are downloaded to your local development machine only - the code never leaves your machine.
>
> Well, maybe the performance will improve at some point in the future. Would using a similarly sized model help with latency?

It would be worth testing. Based on my own experiments with the Hugging Face API, CPU-based inference time seems to grow with the number of tokens returned. It would be worth trying the smaller model on a local CPU to see what the latency is.
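For anyone who wants to try that, here's a minimal timing sketch, assuming the Hugging Face `transformers` library; the checkpoint (`EleutherAI/gpt-neo-125M`) and the prompt are just illustrative choices for a CPU test:

```python
# Minimal sketch: measure how CPU generation latency scales with the
# number of tokens returned. Checkpoint and prompt are illustrative.
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-125M"  # assumed small checkpoint for CPU tests
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Time generation for increasing completion lengths.
for new_tokens in (8, 16, 32, 64):
    start = time.perf_counter()
    model.generate(
        **inputs,
        max_new_tokens=new_tokens,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    elapsed = time.perf_counter() - start
    print(f"{new_tokens:3d} new tokens: {elapsed:.2f}s on CPU")
```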

bentrevett commented 3 years ago

I would be interested in this. The main challenge, as already mentioned, would be reducing latency - via pruning? model distillation? - so that performant models can run locally.

As a relevant paper I would suggest Fast and Memory-Efficient Neural Code Completion, a relatively recent paper that focuses on the latency and model-size constraints of deploying a code completion model.
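For reference, a minimal sketch of the standard distillation objective mentioned above, assuming PyTorch; the temperature and mixing weight are illustrative:

```python
# Sketch of a knowledge-distillation loss for training a smaller
# completion model against a larger teacher. Assumes PyTorch;
# temperature and alpha are illustrative hyperparameters.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: match the teacher's softened output distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard next-token cross-entropy.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * soft_loss + (1 - alpha) * hard_loss
```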

saahiluppal commented 3 years ago

It's worth mentioning that TabNine considers not only the code in the current file but the entire working folder, and manages to provide completions with a model smaller than 20 MB.

Simple one-shot pruning won't work at all, but iterative pruning with retraining between rounds (as in the lottery ticket hypothesis) might. Distillation might not bring any benefit.
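Roughly the prune-retrain loop meant here, as a sketch assuming PyTorch's built-in pruning utilities; `train_one_epoch` is a hypothetical training routine and the rates and round count are illustrative:

```python
# Sketch of iterative magnitude pruning with retraining between rounds,
# in the spirit of the lottery ticket hypothesis. Assumes PyTorch;
# train_one_epoch is a hypothetical training routine.
import torch.nn as nn
import torch.nn.utils.prune as prune

def iterative_prune(model, train_one_epoch, rounds=5, amount=0.2):
    linear_layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
    for _ in range(rounds):
        # Prune 20% of the smallest-magnitude remaining weights per layer.
        for layer in linear_layers:
            prune.l1_unstructured(layer, name="weight", amount=amount)
        # Retrain so the surviving weights can recover accuracy.
        train_one_epoch(model)
    # Make the pruning permanent by removing the reparameterization.
    for layer in linear_layers:
        prune.remove(layer, "weight")
    return model
```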

If you are going to target online code editors, then a cloud-based solution will work.

bitnom commented 3 years ago

It's going to need to be remote for a desirably large model. No matter, though: with a properly scaled service, the latency can be acceptable, even barely noticeable, under normal network conditions. I've considered doing it. I'm down for some brainstorming.

bentrevett commented 3 years ago

The issue with cloud-based solutions is dealing with security/privacy. Do people want to use a system where their entire codebase is uploaded to the cloud?

I know TabNine's cloud-based models are opt-in, though I don't know how many people actually opt in.

traverseda commented 3 years ago

TabNine definitely does something to take advantage of the entire folder; even the first release (which had no cloud offering and didn't seem to make any network calls except for updates) was giving useful tab-completion based on information that only existed in other folders.

According to their blog the model was trained on open source projects. There's definitely some secret sauce going on though.

bentrevett commented 3 years ago

> TabNine definitely does something to take advantage of the entire folder; even the first release (which had no cloud offering and didn't seem to make any network calls except for updates) was giving useful tab-completion based on information that only existed in other folders.
>
> According to their blog the model was trained on open source projects. There's definitely some secret sauce going on though.

This shouldn't be too hard to replicate, right? Have a "generic" model trained on as many open-source projects as possible, and then an "initialization" phase where the model is fine-tuned on all the code in the current folder, found via a recursive search.
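A rough sketch of that initialization phase, assuming the Hugging Face `transformers` library; the base checkpoint, file glob, and hyperparameters are all illustrative:

```python
# Sketch of the "initialization" phase: recursively collect source files
# from the current workspace and fine-tune a generic base model on them.
# Checkpoint name, file glob, and hyperparameters are illustrative.
from pathlib import Path

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "EleutherAI/gpt-neo-125M"  # assumed generic base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Recursively gather all Python files in the current workspace.
sources = [p.read_text(errors="ignore") for p in Path(".").rglob("*.py")]

# Tokenize into truncated blocks for causal LM fine-tuning.
encodings = tokenizer(sources, truncation=True, max_length=512)

class CodeDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.input_ids = encodings["input_ids"]

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, idx):
        return {"input_ids": self.input_ids[idx]}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="workspace-model", num_train_epochs=1),
    train_dataset=CodeDataset(encodings),
    # mlm=False gives standard causal-LM labels with padding handled.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```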