iedmrc / galois-autocompleter

Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2.
https://usegalois.com
MIT License
95 stars · 26 forks

train code complete from zero #2

Closed yuandaxing closed 4 years ago

yuandaxing commented 4 years ago

Hi, I tried to train code completion from scratch. Below are two settings, and the performance of both is bad:

  1. train from UTF-8 encoded text with the default BPE encoding, batch size 1, 3 GPU cards, 500K iterations
  2. train from ASCII-encoded text (method 2 simply maps each ASCII character to an ID from 1 to N, so the vocabulary is much smaller), batch size 1, 3 GPU cards, 500K iterations

The performance of both settings is bad. Did you train by fine-tuning the released GPT-2 model? Can you share your training settings?
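For reference, the character-level encoding described in method 2 can be sketched roughly as below. This is only an illustration of the idea (each printable ASCII character mapped to a small integer ID, yielding a far smaller vocabulary than GPT-2's ~50K-token BPE); the reserved padding ID and the exact character set are assumptions, not the poster's actual code.

```python
# Sketch of an ASCII-to-ID encoding (method 2): each character gets
# an integer ID from 1 to N. ID 0 is reserved for padding here (an
# assumption for this sketch).

def build_ascii_vocab():
    # Printable ASCII plus newline and tab; N is ~97 instead of ~50K.
    chars = [chr(c) for c in range(32, 127)] + ["\n", "\t"]
    stoi = {ch: i + 1 for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    # Characters outside the vocabulary are silently dropped.
    return [stoi[ch] for ch in text if ch in stoi]

def decode(ids, itos):
    return "".join(itos[i] for i in ids)

stoi, itos = build_ascii_vocab()
ids = encode("def f(x):", stoi)
assert decode(ids, itos) == "def f(x):"
```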
iedmrc commented 4 years ago

I should say that Galois is not currently at its best. I have only done a little fine-tuning work so far, but still, the results are not bad. These are the configs/parameters that I used:

Nowadays, I suggest that anyone who wants to train a generative model use huggingface's transformers library.

yuandaxing commented 4 years ago

Great, thanks for your detailed help.