wwood / CoverM

Read coverage calculator for metagenomics
GNU General Public License v3.0

Allow gzipped fasta files to be provided as input #145

Open rhysnewell opened 1 year ago

rhysnewell commented 1 year ago

Hey Ben,

Currently, when a gzipped FASTA file is provided as a reference, coverm spits out the following error:

coverm genome -1 SRR16554771.downsampled.1.fastq.gz -2 SRR16554771.downsampled.2.fastq.gz -f MIC9243/PC_RaymondF_2016__P20E90__bin.24.fna.gz
[2022-12-19T03:34:47Z INFO  coverm] CoverM version 0.6.1
[2022-12-19T03:34:47Z INFO  coverm] Using min-covered-fraction 10%
[2022-12-19T03:34:47Z INFO  bird_tool_utils::external_command_checker] Found minimap2 version 2.24-r1122
[2022-12-19T03:34:47Z INFO  bird_tool_utils::external_command_checker] Found samtools version 1.10
[2022-12-19T03:34:47Z INFO  coverm] Profiling 1 genomes
[2022-12-19T03:34:47Z INFO  coverm] Generating concatenated reference FASTA file of 1 genomes ..
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: InvalidData, message: "stream did not contain valid UTF-8" }', /home/areej_alsheikh_microba_com/.cargo/registry/src/github.com-1ecc6299db9ec823/coverm-0.6.1/src/mapping_index_maintenance.rs:226:32
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This can be fixed by migrating over to needletail for FASTX parsing rather than using rust-htslib.
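For context on the panic itself, here is a minimal std-only sketch (not CoverM's actual code, and not the needletail API) of why reading a `.gz` file as text fails. A gzip stream starts with the bytes `0x1f 0x8b`, and `0x8b` on its own is not valid UTF-8, so reading into a `String` returns the same `InvalidData` error shown in the log. The helper names `looks_gzipped` and `read_plain_fasta` are made up for illustration.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// The two magic bytes that open every gzip stream (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Hypothetical helper: peek at the start of a buffered reader, without
/// consuming anything, and report whether the stream looks gzip-compressed.
/// Assumes the first `fill_buf` call yields at least two bytes when the
/// input has them, which holds for files and in-memory slices.
fn looks_gzipped<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    let head = reader.fill_buf()?;
    Ok(head.starts_with(&GZIP_MAGIC))
}

/// Hypothetical helper: read a reference FASTA as text, but return a clear
/// error instead of a UTF-8 failure when the input is gzipped.
fn read_plain_fasta<R: Read>(input: R) -> io::Result<String> {
    let mut reader = BufReader::new(input);
    if looks_gzipped(&mut reader)? {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "input is gzip-compressed; use a gzip-aware parser",
        ));
    }
    let mut text = String::new();
    // On compressed bytes this call would fail with
    // "stream did not contain valid UTF-8", as in the panic above.
    reader.read_to_string(&mut text)?;
    Ok(text)
}
```

A check like this only turns the panic into a readable error; actually accepting `.fna.gz` input still needs a parser that decompresses on the fly, which is what the needletail suggestion is about.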

Cheers, Rhys

wwood commented 1 year ago

Hey, good point. It already works in contig mode, so genome mode should be changed to match.