Open mariadelmarq opened 4 years ago
Hello
It does not look like a metagenomic dataset: it is very small (both in number of reads and in genome size), yet the average coverage is very large. So I suspect there is something wrong with this dataset.
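For readers unfamiliar with the reasoning here, average coverage is roughly the total number of sequenced bases divided by the assembled genome size. The snippet below is only a minimal sketch of that arithmetic; the function name and the numbers in the usage note are hypothetical and are not taken from this dataset or from the SPAdes logs.

```python
def average_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Rough average coverage: total sequenced bases / genome size.

    num_reads counts individual reads (both mates of a pair count separately);
    read_length is the mean read length in bases; genome_size is in bases.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size
```

For instance, `average_coverage(1_000_000, 150, 5_000_000)` gives 30.0 with these made-up inputs. A small read set that still yields very high coverage implies a very small target genome, which is unusual for a metagenome.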
Thanks, @asl! Do you happen to know why metaspades picks up something weird in terms of the paired-end reads, whereas spades doesn't agree?
Weirdly enough, this dataset certainly claims to be metagenome data (https://www.ebi.ac.uk/ena/browser/view/PRJNA379494) and it forms the basis for a publication in Scientific Reports: https://www.nature.com/articles/s41598-017-06404-8.
Hi @mariadelmarq! Did you manage to solve your issue? I am facing the same situation. Any ideas?
@kmkappa Please do not hijack unrelated issues; open a new one.
@asl As you prefer. I have reported the same problem, as it occurred on my machine, in issue #1110.
I'm trying to assemble some publicly available metagenomic data using metaspades. The data is here: https://www.omicsdi.org/dataset/omics_ena_project/PRJNA379494. I'm testing the first set of paired-end reads: SRR5351712_1 and SRR5351712_2.
I was able to assemble them using other assemblers with no issues, but when I run metaspades.py on them:
metaspades.py -1 SRR5351712_1.fastq.gz -2 SRR5351712_2.fastq.gz
I get a series of warnings that suggest the paired-end reads are corrupted.
I then tried with regular spades:
spades.py -1 SRR5351712_1.fastq.gz -2 SRR5351712_2.fastq.gz
and get a different warning.
Is it that the files are corrupted in a way that megahit, for example, is unable to pick up on, or is there a compatibility issue between these files and spades? I've tried both the raw files and trimmed files (using trimmomatic), same warnings in both cases.
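One way to check the "corrupted pairs" hypothesis independently of any assembler is to confirm that the two files contain the same number of reads and that read names match in order. The following is a minimal sketch, not part of SPAdes or megahit; the helper names `read_names` and `check_pairs` are hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequence lines) with an optional `/1` or `/2` suffix on read names.

```python
import gzip
from itertools import zip_longest


def read_names(path):
    """Yield read names from a FASTQ file (plain or .gz), assuming 4-line records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Skip blank lines so a trailing newline does not shift the record index.
        lines = (line for line in handle if line.strip())
        for i, line in enumerate(lines):
            if i % 4 != 0:
                continue
            if not line.startswith("@"):
                raise ValueError(f"{path}: expected '@' header at record {i // 4 + 1}")
            name = line[1:].split()[0]
            # Drop a trailing /1 or /2 mate suffix if present.
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def check_pairs(path1, path2):
    """Return (ok, message) describing whether the two files pair up read by read."""
    count = 0
    for a, b in zip_longest(read_names(path1), read_names(path2)):
        if a is None or b is None:
            return False, f"files differ in length after {count} pairs"
        if a != b:
            return False, f"name mismatch at pair {count + 1}: {a} vs {b}"
        count += 1
    return True, f"{count} pairs OK"
```

Calling `check_pairs("SRR5351712_1.fastq.gz", "SRR5351712_2.fastq.gz")` would report the first point at which the files disagree, if any. A clean result here would not rule out other problems (for example unusual insert sizes or orientation), which a name-level check cannot detect.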
Here are the log and param files for the metaspades assembly (params.txt, spades.log). Let me know if you'd like me to send through the spades ones as well.
Thanks!