Mondego / SourcererCC

Sourcerer's Code Clone project
GNU General Public License v3.0

How to create the clone mapping in C or C++? #50

Closed: ztwater closed this issue 2 years ago

ztwater commented 2 years ago

SourcererCC is a great tool for clone mapping! However, I still have several questions about it and DejaVu.

Your website says, 'we have created a mapping between file clones in four languages: Java, C++, JavaScript and Python.' I am interested in finding code clones among C++ source files. However, at the tokenizing step, I could not find a file named extractCFunction.py, which appears to be needed to finish parsing.

By the way, can I find block-level clones that are smaller than a whole method? For instance, clones of just a few statements?

Thank you very much!

crista commented 2 years ago

Yes. Use the file-level tokenizer instead (https://github.com/Mondego/SourcererCC/tree/master/tokenizers/file-level). It doesn't parse anything; it's a text tokenizer that treats each file as a unit to be compared, so you can compare files in any language. It's highly customizable.
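
To illustrate the idea of a parser-free, file-level text tokenizer, here is a minimal Python sketch. It is not the code from the repository linked above: the names `TOKEN_PATTERN`, `tokenize_text`, and `overlap_similarity` are hypothetical, and the separator rule and similarity measure are assumptions chosen for illustration. It only shows why treating a file as a bag of text tokens works for any language, C++ included.

```python
import re
from collections import Counter

# Assumption: a token is any run of letters, digits, or underscores.
# Everything else (whitespace, operators, brackets) acts as a separator.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def tokenize_text(source: str) -> Counter:
    """Return a bag of tokens (token -> frequency) for one file's text."""
    return Counter(TOKEN_PATTERN.findall(source))


def overlap_similarity(a: Counter, b: Counter) -> float:
    """Shared token occurrences divided by the size of the larger bag."""
    shared = sum((a & b).values())  # multiset intersection
    larger = max(sum(a.values()), sum(b.values()))
    return shared / larger if larger else 0.0


# Two tiny C++-like snippets standing in for whole files.
file_a = "int add(int a, int b) { return a + b; }"
file_b = "int add(int x, int y) { return x + y; }"

bag_a = tokenize_text(file_a)
bag_b = tokenize_text(file_b)
similarity = overlap_similarity(bag_a, bag_b)
```

Because nothing here depends on the grammar of the language, the same functions could be applied to the text of a few statements as easily as to a whole file, though whether that gives useful block-level results is a separate question.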

ztwater commented 2 years ago

@crista Thanks for your reply! I will try this tokenizer out.