cvmfs_search doesn't work on chunks

DrDaveD commented 6 months ago

I wanted to use cvmfs_search on a hash file whose name ends in 'P'. That's a "partial" chunk of a larger file, which cvmfs generates when files are larger than about 8MB. It failed with:

Traceback (most recent call last):
  File "/usr/kerberos/bin/cvmfs_search", line 142, in <module>
    names = search_cvmfs_repo(args.url, args.db_index)
  File "/usr/kerberos/bin/cvmfs_search", line 109, in search_cvmfs_repo
    content_hash = bytes.fromhex(data_hash)
ValueError: non-hexadecimal number found in fromhex() arg at position 40

because it couldn't convert that 'P'. I added code to remove the 'P', but then it didn't find the hash. I believe that's because those hashes are in a different table called chunks. You can see an example in catalog.py. I'm thinking that the hashes from the chunks table could also be added to the locally-created catalog table, since we only care about doing the reverse lookup. Could you look into it @Vbitz?
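
To make the idea concrete, here is a rough sketch of the reverse lookup I have in mind. It is not the actual cvmfs_search code. The helper names (split_hash_suffix, find_names_by_hash) are made up for illustration, and the column names (hash, name, md5path_1, md5path_2) are my assumption about the catalog schema and should be checked against the real catalogs. It only handles a single trailing suffix character such as 'P'.

```python
import sqlite3
import string


def split_hash_suffix(data_hash):
    """Split one trailing non-hex character (e.g. 'P') off a hash string.

    Hypothetical helper: returns (hex_part, suffix), where suffix is ''
    when the string is already pure hex.
    """
    if data_hash and data_hash[-1] not in string.hexdigits:
        return data_hash[:-1], data_hash[-1]
    return data_hash, ""


def find_names_by_hash(conn, data_hash):
    """Return the file names whose whole-file hash or chunk hash matches.

    Assumes `conn` is a SQLite connection with a `catalog` table
    (md5path_1, md5path_2, hash, name) and a `chunks` table
    (md5path_1, md5path_2, hash). These column names are assumptions.
    """
    hex_part, _suffix = split_hash_suffix(data_hash)
    content_hash = bytes.fromhex(hex_part)
    rows = conn.execute(
        """
        SELECT name FROM catalog WHERE hash = ?
        UNION
        SELECT c.name
        FROM chunks AS k
        JOIN catalog AS c
          ON c.md5path_1 = k.md5path_1 AND c.md5path_2 = k.md5path_2
        WHERE k.hash = ?
        """,
        (content_hash, content_hash),
    ).fetchall()
    return sorted(row[0] for row in rows)
```

The join maps a chunk back to its owning file through the path hash. Copying the chunk hashes into the locally-created table would achieve the same reverse lookup without needing the join at query time.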

Vbitz commented 6 months ago

Sure, happy to look into that.

DrDaveD commented 5 months ago

Have you been able to figure out how to do this?

cvmfs-contrib / python-cvmfsutils

cvmfs_search doesn't work on chunks #35