Open barnslig opened 9 years ago
I implemented this feature in my FTP crawler written in Ruby. It might help you: https://github.com/digineo/media_crawler/blob/v1/app/models/resource/chunk.rb https://github.com/digineo/media_crawler/blob/v1/app/models/resource/metadata.rb
Load only a chunk of each file (50 kB, 1 MB, or whatever makes sense), extract metadata from it with ffmpeg, and compute a checksum for duplicate detection. This could give us interesting data for the search, especially when searching for music.
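To make the steps above concrete, here is a minimal sketch in Python (not the Ruby code from the linked crawler). The names `read_chunk`, `chunk_checksum`, `probe_metadata`, `group_duplicates` and the 50 kB default are my own assumptions for illustration. It also assumes `ffprobe` (shipped with ffmpeg) is on the PATH, and that a truncated file still carries enough header data to be probed, which will not hold for every format.

```python
import hashlib
import json
import subprocess

CHUNK_SIZE = 50 * 1024  # hypothetical default: 50 kB


def read_chunk(stream, size=CHUNK_SIZE):
    """Read up to `size` bytes from a binary stream, stopping early at EOF."""
    parts = []
    remaining = size
    while remaining > 0:
        data = stream.read(remaining)
        if not data:  # EOF: the file is shorter than the chunk size
            break
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)


def chunk_checksum(chunk):
    """Checksum of the chunk only, not of the whole file."""
    return hashlib.sha256(chunk).hexdigest()


def probe_metadata(chunk):
    """Pipe the chunk into ffprobe and return its format tags, or {} on failure."""
    cmd = ["ffprobe", "-v", "quiet", "-print_format", "json",
           "-show_format", "-i", "pipe:0"]
    try:
        result = subprocess.run(cmd, input=chunk, capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return {}  # ffprobe missing or hung
    if result.returncode != 0:
        return {}  # chunk was not parseable
    try:
        return json.loads(result.stdout).get("format", {}).get("tags", {})
    except ValueError:
        return {}


def group_duplicates(checksums_by_path):
    """Map each checksum to the paths sharing it; keep only real duplicates."""
    groups = {}
    for path, checksum in checksums_by_path.items():
        groups.setdefault(checksum, []).append(path)
    return {c: sorted(p) for c, p in groups.items() if len(p) > 1}
```

`read_chunk` works on any binary file-like object. For FTP, one option would be `ftplib.FTP.transfercmd("RETR " + path)`, which returns a socket that can be wrapped with `.makefile("rb")`; the transfer then has to be aborted after the chunk is read. Since the checksum covers only the first chunk, two files with the same checksum are candidates for duplicates, not proven duplicates.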
However, because this feature is heavy on network traffic and generates a lot of load, we should implement it last; the filename and path should give us enough for a good search most of the time.