poeml / mirrorbrain

MirrorBrain
http://mirrorbrain.org/

invalid utf8 byte sequence error during makehashes #170

Closed rworkman closed 7 years ago

rworkman commented 7 years ago

This only happens with this one file. If I skip that file using a regex match (-i flag), the makehashes operation completes successfully. I can of course skip this file without much consequence, but I'd like to find out why it's happening and how to prevent it in case other files trigger it later.

The "0" printed at the end of the line is from me uncommenting the "print len(self.zsums)" line in hashes.py

File 'slackware-14.0/source/a/lilo/lilo-23.2.tar.gz' not in database. Not on mirrors yet? Will be inserted.
Hashing '/home/ftp/pub/slackware/slackware-14.0/source/a/lilo/lilo-23.2.tar.gz'... 0 done.
Traceback (most recent call last):
  File "/usr/bin/mb", line 1677, in <module>
    r = mirrordoctor.main()
  File "/usr/lib64/python2.7/site-packages/cmdln.py", line 262, in main
    return self.cmd(args)
  File "/usr/lib64/python2.7/site-packages/cmdln.py", line 285, in cmd
    retval = self.onecmd(argv)
  File "/usr/lib64/python2.7/site-packages/cmdln.py", line 423, in onecmd
    return self._dispatch_cmd(handler, argv)
  File "/usr/lib64/python2.7/site-packages/cmdln.py", line 1124, in _dispatch_cmd
    return handler(argv[0], opts, *args)
  File "/usr/bin/mb", line 1134, in do_makehashes
    force=opts.force)
  File "/usr/lib64/python2.7/site-packages/mb/hashes.py", line 171, in check_db
    binascii.hexlify(''.join(self.hb.zsums))]
psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x8b
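For anyone wondering why the database rejects 0x8b specifically: that byte can only appear as a continuation byte in UTF-8, so it can never start a character. A minimal sketch of locating the first offending byte in a blob before it reaches the database is below. The helper name `first_invalid_utf8_offset` is hypothetical and not part of MirrorBrain, and the sample bytes are just the standard gzip header, chosen for illustration.

```python
def first_invalid_utf8_offset(data):
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the decoder could not accept
        return exc.start
    return None


# Illustrative input only: the first four bytes of a typical gzip stream.
sample = b"\x1f\x8b\x08\x00"
offset = first_invalid_utf8_offset(sample)
if offset is not None:
    print("invalid UTF-8 at offset %d: 0x%02x" % (offset, bytearray(sample)[offset]))
```

Plain ASCII text, such as an armored PGP signature, decodes cleanly and returns None, so a check like this would separate text files from binary ones.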

rworkman commented 7 years ago

Well, I think this is a false alarm due to a corrupted file on the master mirror. That initial "8B" certainly looks suspicious given the error message thrown by mb...

$ pwd
/mirrors/ftp.slackware.com/pub/slackware/slackware-14.0/source/a/lilo
$ less lilo-23.2.tar.gz.asc
"lilo-23.2.tar.gz.asc" may be a binary file. See it anyway?
^8B>^H^@^@^@^@^@^@^Cmι^N<820^@80><E1><BD>O<D1>Qc^@94j<E2PVPESCR܈T, $^\<8A><?<9F>tLL mbCD7>$^Ty<BE><83><BF>[^@^A<ABj^^V^KH<8A>&!<8B><9A>(<8F><80>PDA><F1><A2><ED><86>^@p^\b<A4c^Cp<8D>2%oKd$<96>^?֥[m6<81><8D>^Lf8:$W/N~<9A>u`dN&e^Rw>RR5<9C>9Km"^L<99>^B~"LW<9C>o<87>!^@^@^@
lilo-23.2.tar.gz.asc lines 1-2/2 (END)
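In case other files trigger this later, a pre-scan of the tree for .asc files that are not ASCII-armored text might catch them before running makehashes. The sketch below is not part of MirrorBrain, and the names `classify_signature` and `find_suspect_signatures` are made up for illustration. It assumes the bad file is gzip data, which the 8B and ^H bytes at the start of the dump suggest but do not prove. It also assumes a healthy signature file begins with a "-----BEGIN PGP" armor line.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"          # first two bytes of any gzip stream
ARMOR_PREFIX = b"-----BEGIN PGP"  # start of an ASCII-armored OpenPGP block


def classify_signature(path):
    """Classify a file as 'gzip', 'armored' or 'unknown' from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.lstrip().startswith(ARMOR_PREFIX):
        return "armored"
    return "unknown"


def find_suspect_signatures(root):
    """Walk root and return (path, kind) for every .asc file that is not armored text."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".asc"):
                continue
            path = os.path.join(dirpath, name)
            kind = classify_signature(path)
            if kind != "armored":
                suspects.append((path, kind))
    return suspects
```

Calling `find_suspect_signatures` on the top of the mirror tree would list the files worth inspecting by hand. Anything reported as "gzip" or "unknown" is a candidate for re-fetching from the master mirror.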

rworkman commented 7 years ago

Yep, definitely the culprit. Closing this bug.