KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

UTF error when submitting multiple 23andMe files #144

Closed RachelKarchin closed 1 year ago

RachelKarchin commented 1 year ago

My 23andMe files run fine when I submit only one, but when I submit several I get this error:

2023/03/31 21:45:58 cravat An unexpected exception occurred.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/cravat/cravat_class.py", line 532, in main
    self.run_converter()
  File "/usr/local/lib/python3.8/site-packages/cravat/cravat_class.py", line 1273, in run_converter
    self.numinput, self.converter_format = converter.run()
  File "/usr/local/lib/python3.8/site-packages/cravat/cravat_convert.py", line 398, in run
    self.setup()
  File "/usr/local/lib/python3.8/site-packages/cravat/cravat_convert.py", line 229, in setup
    self._select_primary_converter()
  File "/usr/local/lib/python3.8/site-packages/cravat/cravat_convert.py", line 320, in _select_primary_converter
    if not self.primary_converter.check_format(f):
  File "/usr/local/lib/python3.8/site-packages/cravat/modules/converters/23andme-converter/23andme-converter.py", line 24, in check_format
    return '23andMe' in f.readline()
  File "/usr/local/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Example: https://run.opencravat.org/submit/jobs/230331-213732/log

kmoad commented 1 year ago

It's likely that one of the input files is gzipped. Gzip files always start with the bytes 0x1f 0x8b, so the second byte (position 1) is always 0x8b, which is not a valid UTF-8 start byte. OC doesn't currently support gzipped 23andMe files.
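To illustrate the diagnosis, here is a minimal Python sketch that decodes the two gzip magic bytes as UTF-8 and reports where decoding fails. The helper name `first_decode_error` is hypothetical and is not part of OpenCRAVAT.

```python
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def first_decode_error(data, encoding="utf-8"):
    """Return (position, offending byte, reason) for the first decode
    failure, or None if the bytes decode cleanly.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        return err.start, data[err.start], err.reason
    return None


if __name__ == "__main__":
    position, byte, reason = first_decode_error(GZIP_MAGIC)
    # Prints: byte 0x8b at position 1: invalid start byte
    print(f"byte {byte:#04x} at position {position}: {reason}")
```

The first byte, 0x1f, is a valid single-byte UTF-8 character, which is why the error in the log points at position 1 rather than position 0.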

RachelKarchin commented 1 year ago

That is what is going on! One of the input files was gzipped but did not have a .gz extension.
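Since the file extension was misleading here, one way to catch this case before submitting is to check the magic bytes rather than the name. The following is a rough sketch, not OpenCRAVAT code: the helpers `is_gzipped`, `open_text`, and `looks_like_23andme` are hypothetical names, and the header check only mirrors the `'23andMe' in f.readline()` line shown in the traceback.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes,
    regardless of its extension."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def open_text(path, encoding="utf-8"):
    """Open a plain or gzipped file in text mode (hypothetical helper)."""
    if is_gzipped(path):
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)


def looks_like_23andme(path):
    """Check the first line for '23andMe', as the converter in the
    traceback does, but tolerate gzipped input."""
    with open_text(path) as f:
        return "23andMe" in f.readline()
```

Running `is_gzipped` over each input file before submission would flag the compressed one, which can then be decompressed by hand.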