JohnDoee / autotorrent

Matches torrents with files and gets them seeded
MIT License
269 stars, 34 forks

Scan mode: Hash by brute force #4

Closed by Joshfindit 9 years ago

Joshfindit commented 9 years ago

The scan modes now separately support incomplete files with an exact filename, and complete files with a different filename but the same size. That still leaves a gap: files that are close to 100% complete and have a different filename get bypassed.

As an advanced search option, enabled by a command-line argument, autotorrent could be told to scan all chunks, regardless of filename or file size.

Presumably, this would be run on very specific subfolders containing data for a single torrent.

JohnDoee commented 9 years ago

True brute force does not seem feasible, as it would require too much hash-checking (~140TB of hash-checking with a piece size of 4MB).
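The thread does not show how the ~140TB figure was derived. As a rough illustration of why the cost grows so quickly, here is a minimal sketch that assumes a piece-sized window is hashed at every possible byte offset of the data. That model and the function name `brute_force_bytes` are assumptions made for illustration, not something autotorrent does.

```python
def brute_force_bytes(data_size: int, piece_size: int) -> int:
    """Total bytes hashed if a piece-sized window is tried at every byte offset.

    Assumed cost model: one full piece is read and hashed per candidate offset.
    """
    if data_size < piece_size:
        return 0  # no complete window fits in the data
    offsets = data_size - piece_size + 1  # number of window positions
    return offsets * piece_size


if __name__ == "__main__":
    MIB = 2**20
    # Hypothetical example: 35 MiB of data checked with 4 MiB pieces.
    total = brute_force_bytes(35 * MIB, 4 * MIB)
    print(f"{total / 1e12:.1f} TB hashed")
```

Under this model even a few tens of megabytes of data lead to more than 100 TB of hashing, because the cost is roughly the data size multiplied by the piece size.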

The best I can offer is to expect that the file either starts or ends in the correct place. The start and end data do not HAVE to be correct, they just have to have the correct length. It is a bit of an alignment issue.

Joshfindit commented 9 years ago

Hmm, I see your point. What I had in mind was to load only a single torrent, bypass the _autotorrent.db, load the hashes from the torrent (which, if memory serves, also stores the piece size), and then check all pieces in the folder for matching hashes, regardless of filename or file size.
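A minimal sketch of that idea, not autotorrent's implementation: it assumes the torrent has already been decoded, so that the raw `pieces` string (concatenated 20-byte SHA-1 digests, as in BitTorrent v1) and the `piece length` value are available. The helper names `split_piece_hashes` and `find_aligned_pieces` are hypothetical. It only tests blocks that start at a multiple of the piece length inside each file, so it will miss pieces that span two files or sit at any other offset, which is exactly the alignment issue mentioned above. A shorter final piece is also skipped.

```python
import hashlib
from pathlib import Path


def split_piece_hashes(pieces: bytes) -> list[bytes]:
    """Split the torrent's concatenated `pieces` field into 20-byte SHA-1 digests."""
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def find_aligned_pieces(folder, piece_hashes, piece_length):
    """Yield (path, offset, piece_index) for each piece-sized block whose
    SHA-1 matches one of the torrent's piece hashes.

    Filenames and file sizes are ignored; only block contents are compared.
    """
    index_by_hash = {h: i for i, h in enumerate(piece_hashes)}
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as f:
            offset = 0
            while True:
                block = f.read(piece_length)
                if len(block) < piece_length:
                    break  # trailing partial block: cannot be a full piece
                idx = index_by_hash.get(hashlib.sha1(block).digest())
                if idx is not None:
                    yield path, offset, idx
                offset += piece_length
```

Each file is read exactly once, so the cost is one pass over the folder, which is why this is only practical on a small subfolder such as a single album.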

Understandably, one should never run this against even 100GB of data, let alone 100TB. I assume it would be more for individual albums.

But, with the piece alignment issue, this may be a blue-sky request.

JohnDoee commented 9 years ago

The result is to try hash-checking from the start or the end of the file and then just rewrite the file so it aligns.
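One possible reading of the start-or-end approach, sketched under assumptions: the torrent says where a file begins in the concatenated torrent data (`file_start`) and how long it should be (`file_length`). A candidate file on disk with a different size is then assumed to line up either at its start or at its end, and the expected piece positions are shifted accordingly. The function names and the `anchor` parameter are hypothetical and are not taken from autotorrent.

```python
def contained_pieces(file_start, file_length, piece_length):
    """Return (piece_index, offset_in_file) for pieces lying wholly inside
    a torrent file that occupies [file_start, file_start + file_length)."""
    first = -(-file_start // piece_length)             # ceil division
    last = (file_start + file_length) // piece_length  # exclusive bound
    return [(i, i * piece_length - file_start) for i in range(first, last)]


def candidate_offsets(file_start, file_length, piece_length,
                      candidate_size, anchor="start"):
    """Map expected piece offsets onto a candidate file of a different size.

    anchor="start": assume the candidate's first byte is the file's first byte.
    anchor="end":   assume the candidate's last byte is the file's last byte.
    Pieces that would fall outside the candidate are dropped.
    """
    if anchor not in ("start", "end"):
        raise ValueError("anchor must be 'start' or 'end'")
    shift = 0 if anchor == "start" else candidate_size - file_length
    result = []
    for idx, off in contained_pieces(file_start, file_length, piece_length):
        pos = off + shift
        if pos >= 0 and pos + piece_length <= candidate_size:
            result.append((idx, pos))
    return result
```

Hash-checking the candidate at the returned positions for both anchors shows which alignment, if any, produces matching pieces. The file could then be rewritten with padding or truncation so that its length matches `file_length`.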