Open messiGao opened 1 year ago
Hi @messiGao , Thank you for your interest in our work.
I think there is some confusion here: the exception is raised by TensorFlow, while the java command that you mentioned does not involve TensorFlow at all.
May I also ask what kinds of tasks you are looking into? Maybe I can recommend a newer model.
Best, Uri
I want to use the “--test” command to export
Additionally, my aim is to store a Java codebase in a vector database so that I can run similarity searches and retrieve the code files relevant to my query.
Hi @messiGao ,
Please see https://github.com/neulab/code-bert-score
You don't need the approach itself, but it contains Huggingface models, including one specifically for Java called neulab/codebert-java.
This will allow you to use the Huggingface library with that model and a BERT-like framework.
Best, Uri
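For readers with the same vector-database goal, here is a minimal sketch of how the suggested model could be turned into one embedding per Java snippet. Only the model name neulab/codebert-java comes from this thread. The helper names (`mean_pool`, `cosine_similarity`, `embed_java`) are hypothetical, and mean pooling over the last hidden state is an assumption of this sketch, not a method recommended by the model's authors.

```python
import math
from typing import List, Sequence

MODEL_NAME = "neulab/codebert-java"  # model named in this thread


def mean_pool(token_vectors: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed_java(source: str) -> List[float]:
    """Embed one Java snippet (hypothetical helper).

    Needs `pip install torch transformers` and downloads the model on
    first use. Mean pooling is an assumption made for this sketch.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()
    inputs = tokenizer(source, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (tokens, hidden_dim)
    return mean_pool(hidden.tolist(), inputs["attention_mask"][0].tolist())
```

The vectors returned by `embed_java` could then be stored in a vector database and compared with `cosine_similarity` (or the database's own metric) at query time. Files longer than 512 tokens are truncated in this sketch, so splitting a file into methods before embedding may work better.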
I have a similar dilemma with regard to creating embeddings of C# code using a code2vec model I have trained. As @messiGao mentioned, I want to use the "--test" command to create
```
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 2 in record
[[node IteratorGetNext (defined at /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/ops.py:1751) ]]
```
Hi @asyed79gatech , Thank you for your interest in our work.
I believe that you haven't run the preprocess.sh script on the data.
However, in general, I recommend using the newer https://github.com/neulab/code-bert-score project. It is based on Huggingface, which is actively maintained.
Best, Uri
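One quick way to confirm this diagnosis is to count the space-separated fields on each line of the file passed to `--test`. The sketch below assumes that the 201 in the error message means one label plus 200 contexts per record and that fields are separated by single spaces. The function name `find_bad_records` is hypothetical and is not part of code2vec.

```python
from typing import Iterable, List, Tuple

EXPECTED_FIELDS = 201  # taken from the error message (assumed: 1 label + 200 contexts)


def find_bad_records(lines: Iterable[str],
                     expected: int = EXPECTED_FIELDS) -> List[Tuple[int, int]]:
    """Return (line_number, field_count) for every record with the wrong width.

    Splitting on a single space keeps trailing empty fields, so padded
    records are counted the same way a space-delimited CSV reader would.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        record = line.rstrip("\n")
        if not record:
            continue  # ignore blank lines
        count = len(record.split(" "))
        if count != expected:
            bad.append((line_number, count))
    return bad


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as handle:
        for line_number, count in find_bad_records(handle):
            print(f"line {line_number}: {count} fields, expected {EXPECTED_FIELDS}")
```

If every line is reported with a small field count, the file is most likely raw extractor output that has not gone through preprocessing.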
Hi @urialon
Thanks for your prompt response. I thought we only needed to run the preprocess.sh script when training the code2vec model. Right now, I already have a trained and released model and want to use it to generate embeddings for a vector store.
I used a command like “{java -cp JavaExtractor-0.0.1-SNAPSHOT.jar JavaExtractor.App --max_path_length 8 --max_path_width 2 --dir test.java > file.txt}”, then ran “{python3 code2vec.py --load models/java14_model/saved_model_iter8.release --test file.txt}”, but got the error “{return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict, tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 4 in record [[{{node IteratorGetNext}}]]}”.
Hello, have you resolved your issue? How can Java source code be converted into the input format required by code2vec?
Hello, I encountered the same issue. Have you resolved it?
I used a command like “{java -cp JavaExtractor-0.0.1-SNAPSHOT.jar JavaExtractor.App --max_path_length 8 --max_path_width 2 --dir test.java > file.txt}”, then ran “{python3 code2vec.py --load models/java14_model/saved_model_iter8.release --test file.txt}”, but got the error “{return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict, tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 4 in record [[{{node IteratorGetNext}}]]}”.
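The error in the commands above is consistent with the raw JavaExtractor output being passed straight to `--test`: each extractor line carries a label plus however many contexts the method has, while the reader expects a fixed width of 201 fields. The following is a rough sketch of the fixed-width step only, under the assumption that a record is one label followed by 200 space-separated contexts padded with empty fields. The function name `to_fixed_width` is hypothetical, the simple truncation is an illustration, and running the repository's preprocess.sh, as suggested earlier in the thread, remains the supported route.

```python
MAX_CONTEXTS = 200  # assumption: 201 expected fields = 1 label + 200 contexts


def to_fixed_width(raw_line: str, max_contexts: int = MAX_CONTEXTS) -> str:
    """Pad or truncate one extractor line to exactly max_contexts + 1 fields.

    The first token is treated as the label, the rest as path contexts.
    Extra contexts are dropped from the end; missing ones become empty
    fields, so the space-delimited record always has the same width.
    """
    parts = raw_line.split()
    if not parts:
        raise ValueError("empty record")
    label, contexts = parts[0], parts[1:][:max_contexts]
    padding = [""] * (max_contexts - len(contexts))
    return " ".join([label] + contexts + padding)


if __name__ == "__main__":
    import sys

    # Usage: python pad_contexts.py file.txt > file.padded.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                print(to_fixed_width(line))
```

A four-field line such as the one behind "have 4 in record" would come out with its label, three contexts, and 197 empty fields.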