J35P312 / SVDB

structural variant database software
MIT License

UnicodeDecodeError #66

Closed yangyxt closed 1 year ago

yangyxt commented 1 year ago

I tried to query AF and AN from an SVDB database built in-house.

The run returned a UnicodeError listed below:

    Traceback (most recent call last):
      File "/venv/bin/svdb", line 33, in <module>
        sys.exit(load_entry_point('svdb', 'console_scripts', 'svdb')())
      File "/home/worker/app/svdb/__main__.py", line 99, in main
        make_query_calls(args, queries, "db")
      File "/home/worker/app/svdb/__main__.py", line 46, in make_query_calls
        query_module.main(args)
      File "svdb/query_module.py", line 73, in svdb.query_module.main
      File "svdb/query_module.py", line 74, in svdb.query_module.main
      File "/usr/local/lib/python3.8/codecs.py", line 322, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf9 in position 31: invalid start byte

I tried different query VCFs and all of them returned this error. Is the invalid start byte in the DB file or in the query VCF file? If it is in the DB file, what could have gone wrong when building it?

Much appreciated if you could take a look at this issue.

yangyxt commented 1 year ago

To be clearer, I built the DB from plain-text VCF files and queried it with a plain-text VCF file, but the Unicode error keeps showing up. The query VCF and the build VCFs were all preprocessed with bcftools.

I don't know what went wrong, so please tell me if something I did caused this.

J35P312 commented 1 year ago

Hello! Sorry for the slow reply, I was away over New Year. It seems you have some unusual character in your VCF! The byte 0xf9 can never appear in valid UTF-8 text, which is why you get this error.

Could it be that you have added some annotation containing this character? Or maybe it is in a sample name?
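
To check which file actually contains the offending byte, one option is to read each candidate file in binary mode and try the decode by hand. The following is only a minimal sketch; `first_invalid_utf8` is a hypothetical helper name, not part of SVDB.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int, str]]:
    """Return (offset, byte value, reason) for the first byte sequence that
    is not valid UTF-8, or None if the data decodes cleanly.

    Hypothetical helper for illustration; it is not part of SVDB.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first undecodable byte.
        return err.start, data[err.start], err.reason
    return None


if __name__ == "__main__":
    # 0xf9 is not a legal lead byte in UTF-8, so decoding stops at offset 3.
    print(first_invalid_utf8(b"abc\xf9def"))
```

For a real file, the bytes would come from `open(path, "rb").read()`; running this on both the query VCF and the file passed as the database shows which one is not plain text.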

yangyxt commented 1 year ago

@J35P312, thank you for your reply! I just found out that the issue was caused by a silly mistake on my part. I used the argument --db to specify the path to a SQLite database file built from svdb build. I looked through the manual again and found another argument, --sqdb, which works fine with the SQLite DB file. Sorry for the trouble here.
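
For anyone who hits the same error: a SQLite file starts with the 16-byte string "SQLite format 3\0" followed by binary header fields, so a text decoder will usually fail within the first few dozen bytes, which is consistent with the position reported in the traceback. Below is a rough sketch of a pre-flight check that inspects those first bytes before choosing between --db and --sqdb. The function names are hypothetical, and treating every non-SQLite file as a --db input is an assumption.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 file


def looks_like_sqlite(path: str) -> bool:
    """Return True if the file starts with the SQLite 3 magic header."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


def suggested_flag(path: str) -> str:
    """Suggest which of the two arguments mentioned in this thread to use.

    Assumption: anything that is not a SQLite file is a text database
    suitable for --db. This helper is illustrative, not part of SVDB.
    """
    return "--sqdb" if looks_like_sqlite(path) else "--db"
```

Calling `suggested_flag("my_database.db")` (a hypothetical path) before running the query would have flagged the mismatch here.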