Open alakhalil opened 1 year ago

Hello,
I am trying to evaluate the model by testing the trained model on the java-small dataset:
wget https://s3.amazonaws.com/code2vec/model/java14m_model.tar.gz
tar -xvzf java14m_model.tar.gz
wget https://s3.amazonaws.com/code2vec/data/java-small_data.tar.gz
I tried to test the trained model via
python3 code2vec.py --load models/java14_model/saved_model_iter8.release --test data/java-small/java-small.test.c2v --framework keras
but I got the error:
ValueError: There is no entire model to load at path models/java14_model/saved_model_iter8.release__entire-model, and there is no model weights file to load at path models/java14_model/saved_model_iter8.release__only-weights.
How can the entire-model and the weights files be generated?
Kind regards,
Hi @alakhalil, thank you for your interest in our work!
Can you try without the --framework keras flag?
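That is, the same command with the flag dropped:
python3 code2vec.py --load models/java14_model/saved_model_iter8.release --test data/java-small/java-small.test.c2v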
Best, Uri
Thank you @urialon for the quick reply! Yes, it works now. So with the current implementation, to further train the model one needs to use a TensorFlow model instance, right? Any suggestions on where to start to enable loading the weights into a Keras model instance too?
Regards, Alaa
Hi Alaa,
Yes, you can further train the model using the TensorFlow pipeline; I do not recommend using the Keras version.
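As a rough sketch (following the repository README; the dataset and model paths below are placeholders to replace with your own), further training with the TensorFlow pipeline looks like:
python3 code2vec.py --data data/java14m/java14m --test data/java14m/java14m.val.c2v --save models/java14m/saved_model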
However, note that this code was written a long time ago, and we have released newer models since then. If you are interested in code classification, we have several BERT models for multiple programming languages here: https://github.com/neulab/code-bert-score#huggingface--models. These are as easy to load as:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("neulab/codebert-python")
model = AutoModelForMaskedLM.from_pretrained("neulab/codebert-python")
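For example, a minimal masked-prediction sketch (the code snippet and mask placement here are made up for illustration):
import torch
# Hypothetical example: ask the model to fill in the masked operator.
code = f"def add(a, b): return a {tokenizer.mask_token} b"
inputs = tokenizer(code, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Position of the mask token in the input sequence.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_index].argmax(-1).item()
print(tokenizer.decode([predicted_id]))  # e.g. '+'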
If you are interested in code completion or NL->Code, check out PolyCoder at: https://github.com/VHellendoorn/Code-LMs#october-2022---polycoder-is-available-on-huggingface. It can be loaded using:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
from packaging import version
# PolyCoder (GPT-NeoX architecture) requires a recent transformers release.
assert version.parse(transformers.__version__) >= version.parse("4.23.0")
tokenizer = AutoTokenizer.from_pretrained("NinedayWang/PolyCoder-2.7B")
model = AutoModelForCausalLM.from_pretrained("NinedayWang/PolyCoder-2.7B")
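A minimal generation sketch (the prompt is just an illustrative example; decoding settings are left at their defaults):
prompt = "def binarySearch(arr, left, right, x):"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate a short greedy continuation; tune max_new_tokens as needed.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))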
There are also smaller models in this series here: https://huggingface.co/NinedayWang.
Best, Uri
Hi Uri,
I see. Thank you for the clarification.
Regards, Alaa