scanoss / engine

SCANOSS Open Source Inventory Engine
GNU General Public License v2.0

Is it possible to ignore comments? #43

Closed leoliaolei closed 1 year ago

leoliaolei commented 1 year ago

It seems that comments in Java and C are included in the similarity detection. Is it possible to ignore all comments when doing the comparison?

mscasso-scanoss commented 1 year ago

While we don't have a ready-made solution to ignore comments, there are multiple ways to address this, depending on your actual need:

leoliaolei commented 1 year ago

@mscasso-scanoss Thanks a lot for the details.

In my case, the user does not care about comment similarity. I tried approach one (ignoring all comments). First I downloaded the source code to local files, then I removed the comments with a shell script and ran minr on the local files. This keeps comments out of the fingerprints, but it also removes them from the source code stored in the .mz archives.

Is there some way to keep comments in .mz source archives, but remove comments for fingerprinting?
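
For illustration, the comment-removal step described above might look like the following Python sketch. It is only an assumption about how such a script could be written, not part of SCANOSS or minr: `strip_comments` and `strip_tree` are hypothetical names. It removes `//` and `/* */` comments from C and Java sources, leaves string and character literals alone, and keeps every newline so that line numbers in the stripped copy still line up with the original. The stripped files are written to a separate directory and the originals are not modified. Whether minr can fingerprint one tree while archiving another is not settled in this thread.

```python
import re
from pathlib import Path

# Matches either a comment or a string/char literal. Literals are matched
# so that comment markers inside them are skipped rather than stripped.
_TOKEN = re.compile(
    r"//[^\n]*"                 # line comment
    r"|/\*.*?\*/"               # block comment, may span several lines
    r'|"(?:\\.|[^"\\\n])*"'     # double-quoted string literal
    r"|'(?:\\.|[^'\\\n])*'",    # character literal
    re.DOTALL,
)


def strip_comments(source: str) -> str:
    """Remove C/Java comments while preserving the number of lines."""

    def replace(match: re.Match) -> str:
        text = match.group(0)
        if text.startswith("/"):
            # A comment: keep its newlines so later lines do not shift.
            newlines = text.count("\n")
            return "\n" * newlines if newlines else " "
        # A literal: return it unchanged.
        return text

    return _TOKEN.sub(replace, source)


def strip_tree(src_dir: str, dst_dir: str, suffixes=(".c", ".h", ".java")) -> int:
    """Write comment-free copies of matching files under dst_dir.

    dst_dir should be outside src_dir. Returns the number of files written.
    """
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    written = 0
    for path in src_root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            target = dst_root / path.relative_to(src_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            text = path.read_text(encoding="utf-8", errors="surrogateescape")
            target.write_text(
                strip_comments(text), encoding="utf-8", errors="surrogateescape"
            )
            written += 1
    return written


sample = 'int a = 1; // one\n/* two\n three */ int b = 2;\nchar *s = "// kept";\n'
stripped = strip_comments(sample)
```

Because the line count is preserved, a line range reported against the stripped copy refers to the same lines in the original file. Known gaps in this sketch: Java text blocks (`"""`) and preprocessor line continuations inside `//` comments are not handled.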

mscasso-scanoss commented 1 year ago

Hi @leoliaolei, sorry for the delay. You may want to try HPSM (High Precision Snippet Matching). You can use HPSM as a shared library with the scanoss command to get an accurate line range. In your application, when the scanned files are compared with the source file cleaned of comments, you will get a matching line range that excludes the comments.