-
When testing with lang=php, the process gets killed:
- INFO - __main__ - Saving features into cached file ./train_valid/php/cached_train_train_codebert-base_200_codesearch
Killed
Also, the php cached_train file found at that path is 0 KB in size.
I read online that this is a memory issue, and I have already…
-
Hi there team code2vec,
I am working on a personal project. My aim is to store a Java codebase in a vector database to run similarity searches and retrieve code files from the db relevant to my que…
-
Hi @NougatCA,
I'm trying to understand where you got the models/tokenizers from. Here is the breakdown of all models evaluated in the empirical study and listed in the [pre-print](https://ar…
-
As stated in the UniXcoder paper, "the model takes comment and flattened AST as the input". Please, I have the following questions:
1- Is the implementation of the AST parser of UniXcoder available…
-
- look for open source
- test the graph encoder
-
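As a beginner, I would like to know: given a description and a set of code snippets, how do I find the snippet that best matches the description? I have searched all over the web without finding an answer. Could you provide an example?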
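One common way to approach this kind of question is to turn the description and each snippet into a vector, then rank the snippets by cosine similarity to the description. The following is only a minimal sketch of that ranking step. The `embed` function here is a hypothetical stand-in (a simple bag-of-tokens counter) so that the example runs without any model; in practice it would be replaced by the embeddings of a code-aware encoder. The names `embed`, `cosine`, and `rank_snippets` are illustrative and do not come from any of the repositories discussed.

```python
import math
import re
from collections import Counter


def embed(text):
    """Hypothetical stand-in for a real encoder: a bag-of-tokens count vector."""
    # Lowercase the text and keep alphabetic runs, so "read_file" -> "read", "file".
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_snippets(description, snippets):
    """Return (score, snippet) pairs sorted from best to worst match."""
    query_vec = embed(description)
    scored = [(cosine(query_vec, embed(s)), s) for s in snippets]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


snippets = [
    "def read_file(path):\n    with open(path) as f:\n        return f.read()",
    "def add_numbers(a, b):\n    return a + b",
    "def sort_list(items):\n    return sorted(items)",
]
ranked = rank_snippets("read the contents of the file at this path", snippets)
best_score, best_snippet = ranked[0]
print(best_snippet)
```

The ranking logic stays the same if `embed` is swapped for dense vectors from a neural model; only `cosine` would then operate on lists or arrays of floats instead of token counts.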
-
I would like to get token embeddings for Python and Java code.
I would be curious to know what the authors think about this.
Will the token embeddings for Python code be meaningful?
-
Hi, we are developing new metrics and comparing metrics on the outputs of the translation task using GraphCodeBERT. Can anyone provide me with `GraphCodeBERT`'s outputs and the reference files on…
-
Hi
Could you provide example code showing how to prepare data to be passed to the CodeReviewer model?
Is the marker for "old_file"? So the text provided to the tokenizer should look something li…
-
Hi, I have a few questions about the CoSQA dataset and how to use it:
1. Table 4 in the paper shows:
There are 20604 queries and 6276 codes. Why is the number of codes and queries inconsiste…