marschall-lab / gaftools

General purpose utility related to GAF files
https://gaftools.readthedocs.io/
MIT License

Issue running gaftools stat on gzip compressed file #13

Closed by eblerjana 10 months ago

eblerjana commented 1 year ago

Hi,

When running gaftools stat (commit 2cacdd4) on a gzip-compressed file, I get the following error:

Traceback (most recent call last):
  File "/home/jana/miniconda3/bin/gaftools", line 33, in <module>
    sys.exit(load_entry_point('gaftools', 'console_scripts', 'gaftools')())
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/__main__.py", line 88, in main
    module.main(args)
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/cli/stat.py", line 118, in main
    run_stat(**vars(args))
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/cli/stat.py", line 50, in run_stat
    for alignment_count, mapping in enumerate(parse_gaf(gaf_path), 1):
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/gaf.py", line 40, in parse_gaf
    gaf_file = gzip.open(filename,"r")
NameError: name 'gzip' is not defined

Maybe this is simply because of a missing import gzip?
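
The traceback is consistent with a missing import: Python looks up a module-level name such as gzip only when the line using it runs, so a module without the import loads fine and fails only once the .gz branch is taken. A minimal sketch of that behavior follows; `open_gaf` and the file name are hypothetical and not the gaftools code.

```python
def open_gaf(filename):
    # Hypothetical stand-in for a parser's file-opening step.
    # Note: this module deliberately contains no "import gzip".
    if filename.endswith(".gz"):
        return gzip.open(filename, "rt")  # NameError is raised here, at call time
    return open(filename)


# Defining the function succeeds; the missing name is only noticed
# when the .gz branch actually executes.
try:
    open_gaf("alignments.gaf.gz")
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

This would also explain why the problem went unnoticed with uncompressed input: that code path never touches the undefined name.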

asylvz commented 1 year ago

Thank you, I fixed it.

Can you check it and close the issue if it works fine?

Thanks, Arda

eblerjana commented 1 year ago

Thanks! I just tried the new version. The initial error is gone, but I now get a new one:

Traceback (most recent call last):
  File "/home/jana/miniconda3/bin/gaftools", line 33, in <module>
    sys.exit(load_entry_point('gaftools', 'console_scripts', 'gaftools')())
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/__main__.py", line 88, in main
    module.main(args)
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/cli/stat.py", line 118, in main
    run_stat(**vars(args))
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/cli/stat.py", line 50, in run_stat
    for alignment_count, mapping in enumerate(parse_gaf(gaf_path), 1):
  File "/home/jana/Downloads/test-gaftools/gaftools/gaftools/gaf.py", line 52, in parse_gaf
    for line in open(filename):
  File "/home/jana/miniconda3/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
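
The byte 0x8b at position 1 is the second byte of the gzip magic number (0x1f 0x8b), which suggests the compressed file is still being read through the plain-text open(filename) branch. A small sketch that reproduces the same decode failure from in-memory data (the GAF-like record is made up for illustration):

```python
import gzip

# Compress a made-up, GAF-like record in memory.
raw = gzip.compress(b"read1\t100\t0\t100\t+\t>s1\n")

# Every gzip stream starts with the two magic bytes 0x1f 0x8b.
magic = raw[:2]

# Decoding the compressed bytes as UTF-8 fails on the second byte:
# 0x1f is a valid one-byte character, 0x8b cannot start a character.
try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(magic, decode_error)
```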

tobiasmarschall commented 1 year ago

Sounds like an encoding issue. @eblerjana, do you have a small example file? @asylvz Would be nice to turn this into a test case.

asylvz commented 1 year ago

It was a bug in the GAF parsing that affected all .gz files. I fixed it, so it should work now. @tobiasmarschall, yes, I agree.
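
The thread does not show the actual change. One common way to handle both plain and gzip-compressed input is to check the magic bytes and open compressed files in text mode. The sketch below assumes that approach; `open_text` and `read_records` are hypothetical helpers, not gaftools functions, and the records are made up. It ends with the kind of round-trip check that could serve as a test case.

```python
import gzip
import os
import tempfile


def open_text(path):
    """Open a plain or gzip-compressed file for reading text (hypothetical helper)."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic number
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")  # "rt" yields str, not bytes
    return open(path, "r", encoding="utf-8")


def read_records(path):
    """Return tab-split fields for each non-empty line (hypothetical helper)."""
    with open_text(path) as handle:
        return [line.rstrip("\n").split("\t") for line in handle if line.strip()]


# Round-trip check: the same made-up content must parse identically
# whether or not it is compressed.
content = "read1\t100\t0\t100\t+\t>s1>s2\nread2\t50\t0\t50\t+\t<s3\n"
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "example.gaf")
    gz_path = os.path.join(tmp, "example.gaf.gz")
    with open(plain_path, "w", encoding="utf-8") as out:
        out.write(content)
    with gzip.open(gz_path, "wt", encoding="utf-8") as out:
        out.write(content)
    plain_records = read_records(plain_path)
    gz_records = read_records(gz_path)

assert plain_records == gz_records
```

Checking the magic bytes instead of the .gz suffix also covers compressed files with an unexpected name.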