JetBrains-Research / astminer

A library for mining of path-based representations of code (and more)
MIT License
284 stars 80 forks

"No such file or directory" error while parsing C++ code #217

Open Ytz-Ichi opened 2 years ago

Ytz-Ichi commented 2 years ago

Hi, I am not fluent in English and am using DeepL, so please forgive any strange expressions. I extracted some C++ source code from IBM/Project_CodeNet and am trying to parse it, but the following error messages appear frequently, and the run eventually stops with an error.

$ ./cli.sh fuzzy.yaml
Docker image not found, will use build/shadow/astminer.jar
Working in 4 thread(s)
Parsing Cpp
420898 file(s) found
rm: __tmp_code.cpp: No such file or directory
rm: __tmp_include.cpp: No such file or directory
rm: __tmp_preprocessed.cpp: No such file or directory
rm: __tmp_code.cpp: No such file or directory
rm: __tmp_include.cpp: No such file or directory
rm: __tmp_preprocessed.cpp: No such file or directory
rm: __tmp_code.cpp: No such file or directory
rm: __tmp_include.cpp: No such file or directory
rm: __tmp_preprocessed.cpp: No such file or directory

The curious thing about this error is that it does not occur on the first run; it only started after I tried to abort a run with Ctrl+C for convenience. Once it has occurred, it keeps occurring on the same machine, even after reinstalling astminer or setting up a completely new environment by unzipping the dataset again.
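
As a side note, the message text itself is what plain `rm` prints when asked to delete a file that is not there. The following is only a minimal sketch of that behavior, assuming a POSIX shell and a throwaway scratch directory; the file name is copied from the log above, and the script does not involve astminer or show where astminer calls `rm`:

```shell
#!/bin/sh
# Reproduce the "No such file or directory" message outside astminer.
# Assumption: run in a fresh scratch directory so nothing real is deleted.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Plain rm on a missing file writes a message to stderr and fails.
plain_status=0
rm __tmp_code.cpp 2>rm_err.txt || plain_status=$?

# rm -f ignores missing files: no message and exit status 0.
force_status=0
rm -f __tmp_code.cpp 2>rmf_err.txt || force_status=$?

echo "plain rm status: $plain_status"
echo "rm -f status: $force_status"
```

If the first status is non-zero and the second is zero, the `rm:` lines on their own only show that a cleanup step ran on files that did not exist at that moment; whether they are related to the final error is something I cannot tell from this.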

I have searched Google and the issues here as thoroughly as I can, but have been unable to find a solution. Any advice on how to solve this problem would be greatly appreciated!

Ytz-Ichi commented 2 years ago

Config file.

# input directory (path to project)
inputDir: ../CodeNetExt
# output directory
outputDir: output

parser:
  name: fuzzy
  languages: [cpp]

filters:
  - name: by function name length
    maxWordsNumber: 10
  - name: by words number
    maxTokenWordsNumber: 50

label:
  name: function name

storage:
  name: code2vec
  maxPathLength: 9
  maxPathWidth: 2

numOfThreads: 4