Open WQR53 opened 2 years ago
Hi @WQR53, thank you for your interest in our work!
I don't know the reason, since astminer and neuro-vectorizer are not mine.
However, please check out this PolyCoder paper: https://arxiv.org/pdf/2202.13169.pdf and code: https://github.com/VHellendoorn/Code-LMs where we release a larger model that works for many languages. Specifically, for C, PolyCoder achieves better results than OpenAI's Codex.
Best, Uri
I use astminer to generate C data to feed to code2vec. The dataset comes from neuro-vectorizer (https://github.com/intel/neuro-vectorizer). However, PRECISION, RECALL, and F1 are always zero during training. I use the command
source train.sh
to run train.sh, and the following output was obtained (partially).
I used astminer to get the path_contexts.c2s file and divided it into three files: train.c2s, test.c2s, and val.c2s. Next, I modified the file preprocess.sh and got 7 c2v files: xxxx.dict.c2v, xxxx.histo.ori.c2v, xxxx.histo.path.c2v, xxxx.histo.tgt.c2v, xxxx.test.c2v, xxxx.train.c2v, and xxxx.val.c2v. I then used the command
source train.sh
to run train.sh but found that PRECISION, RECALL, and F1 were all 0.
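For reference, the split step described above could be sketched roughly like this in Python. This is only an illustration, not the script actually used. The helper names (split_lines, target_label, label_coverage), the 80/10/10 ratio, and the fixed seed are all hypothetical. It also assumes that each line of path_contexts.c2s holds one example, with the target label as the first space-separated field. The second helper reports how many validation examples carry a label that also occurs in the training split.

```python
import random


def split_lines(lines, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle examples and split them into (train, val, test) lists."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


def target_label(line):
    """Return the first space-separated field (ASSUMED to be the target label)."""
    return line.strip().split(" ", 1)[0]


def label_coverage(train_lines, eval_lines):
    """Fraction of eval examples whose label also occurs in the training split."""
    if not eval_lines:
        return 0.0
    train_labels = {target_label(line) for line in train_lines}
    hits = sum(1 for line in eval_lines if target_label(line) in train_labels)
    return hits / len(eval_lines)
```

To use it, read path_contexts.c2s with readlines(), pass the list to split_lines, and write the three returned lists to train.c2s, val.c2s, and test.c2s. A label_coverage(train, val) value close to 0 would mean that the validation labels never appear in training. That is one situation in which all-zero scores could arise, though nothing in this thread confirms it is the cause here.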