The GTDB lineage files are stored as gzip-compressed files. The rule make_contigs_search_taxonomy_wc fails with a gzipped lineage file:
Traceback (most recent call last):
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 151, in <module>
returncode = cmdline(sys.argv[1:])
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 146, in cmdline
return main(args)
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 27, in main
tax_assign, _ = load_taxonomy_assignments(args.lineages_csv,
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/sourmash/lca/command_index.py", line 39, in load_taxonomy_assignments
first_row = next(iter(r))
File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
It would be super convenient to allow for gzipped lineage csv files.
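For context, byte 0x8b in position 1 is the second byte of the gzip magic number (0x1f 0x8b), so the error is consistent with the compressed file being read as plain UTF-8 text. Here is a rough sketch of one way this could be handled, by checking the first two bytes before opening the file. The helper names `open_maybe_gzipped` and `read_first_row` are hypothetical and only for illustration; they are not part of sourmash or charcoal, and the actual fix may look different.

```python
import csv
import gzip


def open_maybe_gzipped(path, encoding="utf-8"):
    """Open `path` as text, decompressing on the fly if it is gzipped.

    Hypothetical helper, not an existing sourmash/charcoal function.
    """
    # A gzip stream always starts with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as fp:
        magic = fp.read(2)

    if magic == b"\x1f\x8b":
        # "rt" gives a text-mode handle, which is what csv.reader expects.
        return gzip.open(path, "rt", encoding=encoding, newline="")
    return open(path, "r", encoding=encoding, newline="")


def read_first_row(path):
    """Return the first CSV row of a plain or gzipped lineage file."""
    with open_maybe_gzipped(path) as fp:
        return next(csv.reader(fp))
```

Sniffing the magic bytes rather than relying on a `.gz` extension means a renamed or extension-less file would still be handled.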