broadinstitute / variant-curation-portal

Web application for curating loss of function variants
https://lof.curation.broadinstitute.org
MIT License

TypeError: can't concat str to bytes #181

Closed imrannibmg closed 4 years ago

imrannibmg commented 4 years ago

I tried converting a VCF file to JSON using the script, and it failed with the following error.

Traceback (most recent call last):
  File "../convert_vcf_to_json.py", line 218, in <module>
    main()
  File "../convert_vcf_to_json.py", line 213, in main
    tag_fields=tag_fields,
  File "../convert_vcf_to_json.py", line 69, in convert_vcf_to_json
    reader = vcf.Reader(vcf_file)
  File "/usr/local/lib/python3.7/site-packages/vcf/parser.py", line 300, in __init__
    self._parse_metainfo()
  File "/usr/local/lib/python3.7/site-packages/vcf/parser.py", line 317, in _parse_metainfo
    line = next(self.reader)
  File "/usr/local/lib/python3.7/site-packages/vcf/parser.py", line 280, in <genexpr>
    self.reader = (line.strip() for line in self._reader if line.strip())
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codecs.py", line 645, in __next__
    line = self.readline()
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codecs.py", line 558, in readline
    data = self.read(readsize, firstline=True)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/codecs.py", line 498, in read
    newdata = self.stream.read(size)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/gzip.py", line 276, in read
    return self._buffer.read(size)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/gzip.py", line 406, in _read_gzip_header
    magic = self._fp.read(2)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/gzip.py", line 91, in read
    self.file.read(size-self._length+read)
TypeError: can't concat str to bytes

Please help me out....

nawatts commented 4 years ago

Hi @imrannibmg,

It looks like this was happening for VCF files with a .gz extension, but not .bgz. This is because PyVCF constructs a GzipFile for file objects opened from a .gz file.

https://github.com/jamescasbon/PyVCF/blob/476169cd457ba0caa6b998b301a4d91e975251d9/vcf/parser.py#L256-L270

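However, this is not necessary because convert_vcf_to_json.py handles compression outside of the PyVCF reader. The file object passed to the VCF Reader is created by gzip.open, so wrapping it in a second GzipFile makes PyVCF try to decompress data that has already been decompressed, which fails with the TypeError above.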
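For anyone who wants to reproduce this without PyVCF installed, here is a minimal standard-library sketch of what I believe is going on. It assumes the script opens the file with gzip.open in text mode (the str in the error message suggests so) and uses a second gzip.GzipFile as a stand-in for the wrapping PyVCF does. The sample header, the file name example.vcf.gz, and the function names are made up for illustration.

```python
import gzip
import os
import tempfile

# Hypothetical sample data; any small VCF header would do.
VCF_TEXT = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"


def read_with_double_gzip(path):
    """Mimic the failing path: a text-mode gzip handle wrapped in a second GzipFile."""
    with gzip.open(path, "rt") as text_handle:
        # Stand-in for the wrapping PyVCF applies when the handle's name ends in .gz.
        # The inner handle yields str, but the gzip header parser expects bytes.
        with gzip.GzipFile(fileobj=text_handle) as rewrapped:
            return rewrapped.read()


def read_once(path):
    """Decompress exactly once and read the text directly."""
    with gzip.open(path, "rt") as text_handle:
        return text_handle.read()


def demo():
    """Return (error from the double-wrapped read, text from the single read)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.vcf.gz")
        with gzip.open(path, "wt") as out:
            out.write(VCF_TEXT)
        try:
            read_with_double_gzip(path)
            double_error = None
        except TypeError as exc:
            double_error = exc
        return double_error, read_once(path)


if __name__ == "__main__":
    error, text = demo()
    print("double wrap failed with:", repr(error))
    print("single read returned", len(text), "characters")
```

If I am reading the linked parser.py lines correctly, vcf.Reader also accepts a compressed keyword, so one possible workaround on the caller's side would be vcf.Reader(vcf_file, compressed=False). Treat that as an assumption to check against your installed PyVCF version rather than as the fix that was applied here.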