boyter / searchcode-server

The official home of searchcode-server, where you can run searchcode locally. Note that master is generally unstable in the sense that it is not a release. Check releases for release versions: https://github.com/boyter/searchcode-server/releases
https://searchcodeserver.com/

Duplicates detection #130

Open quasarea opened 7 years ago

quasarea commented 7 years ago

I think you are already collecting hashes of files. It would be useful to have duplicates in the results compacted into a single result with multiple paths.
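As a rough illustration of the request above, here is a minimal sketch that collapses search results with identical content into one entry listing every path. The function name `collapse_duplicates`, the `(path, content)` input shape, and the use of SHA-1 are assumptions made for this example; they are not taken from searchcode-server's code.

```python
import hashlib
from collections import defaultdict


def collapse_duplicates(results):
    """Group (path, content) pairs by a hash of their content.

    Hypothetical helper: returns one entry per distinct content hash,
    in first-seen order, each carrying all paths that share that content.
    """
    groups = defaultdict(list)
    for path, content in results:
        # SHA-1 is an assumption here; any stable content hash would do.
        digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
        groups[digest].append(path)
    return [{"hash": digest, "paths": paths} for digest, paths in groups.items()]


results = [
    ("vendor/a/jquery.js", "function $() {}"),
    ("vendor/b/jquery.js", "function $() {}"),
    ("src/app.js", "console.log('hi');"),
]
collapsed = collapse_duplicates(results)
```

With the sample input, the two `jquery.js` copies end up in a single entry with two paths, and `src/app.js` stays on its own.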

boyter commented 7 years ago

This is something that I need in order to port searchcode.com over as well. I am still investigating adding a Simhash calculation so that similar files can also be marked as duplicates. An example of how it is implemented there:

https://searchcode.com/?q=jquery

Note that after a pause, "Show 100 matches" pops in, which when clicked displays the duplicates.
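For readers unfamiliar with Simhash, the idea mentioned above can be sketched roughly as follows. This is a generic textbook-style version, not the implementation used on searchcode.com: the word tokenizer, the MD5-based token hash, the 64-bit fingerprint size, and the names `simhash` and `hamming_distance` are all assumptions made for illustration.

```python
import hashlib
import re

BITS = 64  # fingerprint width; an assumption for this sketch


def simhash(text):
    """Return a 64-bit Simhash fingerprint of the text's word tokens."""
    counts = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Take the first 8 bytes of MD5 as a 64-bit token hash.
        token_hash = int.from_bytes(
            hashlib.md5(token.encode("utf-8")).digest()[:8], "big"
        )
        for i in range(BITS):
            # Each token votes +1 or -1 on every bit position.
            counts[i] += 1 if (token_hash >> i) & 1 else -1
    # A bit is set in the fingerprint when its votes are net positive.
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two files would then be treated as near-duplicates when the Hamming distance between their fingerprints falls below some threshold. The right threshold depends on the corpus and would need tuning; none is suggested in this thread.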

quasarea commented 7 years ago

Yep, that looks good, you had so much great stuff there ;p

boyter commented 7 years ago

Glad you think so. It certainly is a lot to port across!

boyter commented 7 years ago

Pushing this one out to 1.3.12