Closed: ztwater closed this issue 2 years ago
Yes. Use this tokenizer instead: https://github.com/Mondego/SourcererCC/tree/master/tokenizers/file-level It doesn't parse anything; it's a text tokenizer that considers each file as a unit to be compared, so you can compare files in any language. It's highly customizable.
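To illustrate what "a text tokenizer that treats each file as a unit" could look like, here is a minimal sketch in Python. It is not the code of the linked tokenizer: the function names, the token regex, and the similarity measure are all assumptions made for illustration. It only shows the general idea of turning raw file text into a bag of tokens without parsing it, which is why the approach does not depend on the language.

```python
import re
from collections import Counter

# Hypothetical token pattern (an assumption, not SourcererCC's rule):
# identifiers and integer literals are tokens, everything else separates them.
TOKEN_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+")


def tokenize_file_text(text):
    """Return a bag of tokens (token -> count) for one file's raw text.

    No parsing is done, so the same function works for C++, Java,
    Python, or any other text file.
    """
    return Counter(TOKEN_PATTERN.findall(text))


def overlap_similarity(bag_a, bag_b):
    """Share of tokens two bags have in common, relative to the larger bag."""
    shared = sum((bag_a & bag_b).values())  # multiset intersection
    largest = max(sum(bag_a.values()), sum(bag_b.values()))
    return shared / largest if largest else 0.0


if __name__ == "__main__":
    file_a = "int x = x + 1;"
    file_b = "int y = x + 1;"
    print(overlap_similarity(tokenize_file_text(file_a),
                             tokenize_file_text(file_b)))
```

Two files would then be reported as clone candidates when their similarity exceeds a chosen threshold; the threshold value is something you would set yourself.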
@crista Thanks for your reply! I will try this tokenizer out.
SourcererCC is a great tool for clone mapping! However, I still have several questions about it and DejaVu.
On your website you say, 'we have created a mapping between file clones in four languages: Java, C++, JavaScript and Python.' I am interested in finding code clones among C++ source files. However, during tokenizing, I could not find the file extractCFunction.py that is needed to finish parsing.
By the way, can I find block-level clones that do not follow the method structure? For instance, clones of just a few lines of statements?
Thank you very much!