StevenWingett / FastQ-Screen

Detecting contamination in NGS data and multi-species analysis
https://stevenwingett.github.io/FastQ-Screen/
GNU General Public License v3.0

Filtering was killed #65

Closed: Eduardo-Auer closed this issue 1 year ago

Eduardo-Auer commented 1 year ago

Hello, using the following command, I was filtering a compressed FASTQ file (*.fq.gz) to keep only reads with unique or multiple hits on the human genome:

/mnt/d/dados_geneticos/nebula_genomics/FastQ-Screen-0.15.3/./fastq_screen --conf /mnt/d/dados_geneticos/nebula_genomics/FastQ-Screen-0.15.3/FastQ_Screen_Genomes/fastq_screen.conf --aligner bowtie2 --tag /mnt/d/dados_geneticos/nebula_genomics/fastq_screen/NG1U1B2ET6_1.fq.gz --filter 30000000000000 --outdir /mnt/d/dados_geneticos/nebula_genomics/fastq_screen/fastq_screen_filtered

However, after a long time (at least 10h), it was killed and only a temporary file was left: "NG1U1B2ET6_1.fq.gz_temp_subset.fastq". I will post the log below:

Using fastq_screen v0.15.3
Reading configuration from '/mnt/d/dados_geneticos/nebula_genomics/FastQ-Screen-0.15.3/FastQ_Screen_Genomes/fastq_screen.conf'
Adding database Human
Adding database Mouse
Adding database Rat
Adding database Drosophila
Adding database Worm
Adding database Yeast
Adding database Arabidopsis
Adding database Ecoli
Adding database rRNA
Adding database MT
Adding database PhiX
Adding database Lambda
Adding database Vectors
Adding database Adapters
Using 7 threads for searches
Option --subset set to 0: processing all reads in FASTQ files
Processing NG1U1B2ET6_1.fq.gz
Not making data subset
Searching NG1U1B2ET6_1.fq.gz_temp_subset.fastq against Human
Killed

My computer has 32 GB of memory, and I ran the command in a Mamba environment with bowtie2 installed. Could this issue be caused by a lack of memory?

StevenWingett commented 1 year ago

Hi,

This does indeed sound as though your system ran out of memory. Could you try chunking your FASTQ file into sub-files, processing, and then combining the filtered FASTQ files? This will save memory.

Best, Steven
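
For anyone landing here with the same problem, the chunk-and-recombine approach suggested above could be sketched roughly as follows. This is only an illustration, not part of FastQ Screen: the function names `split_fastq` and `combine_gzip`, the chunk file naming, and the default chunk size are all assumptions made for this example. It also assumes the usual 4-lines-per-read FASTQ layout.

```python
import gzip
import itertools
import shutil
from pathlib import Path


def split_fastq(in_path, out_dir, reads_per_chunk=1_000_000):
    """Split a gzipped FASTQ file into chunks of at most reads_per_chunk reads.

    Hypothetical helper: assumes every read occupies exactly 4 lines.
    Returns the chunk paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(in_path).name.split(".")[0]  # e.g. "NG1U1B2ET6_1"
    lines_per_chunk = 4 * reads_per_chunk
    chunk_paths = []
    with gzip.open(in_path, "rb") as src:
        for index in itertools.count():
            # Lazily take the next block of lines; nothing is held in memory.
            chunk = itertools.islice(src, lines_per_chunk)
            first = next(chunk, None)
            if first is None:  # input exhausted
                break
            path = out_dir / f"{stem}.chunk{index:04d}.fq.gz"
            with gzip.open(path, "wb", compresslevel=1) as dst:
                dst.write(first)
                dst.writelines(chunk)
            chunk_paths.append(path)
    return chunk_paths


def combine_gzip(parts, out_path):
    """Concatenate gzipped files byte for byte.

    Concatenated gzip members form a valid gzip stream, so the parts
    do not need to be decompressed first.
    """
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)
```

Each chunk would then be passed to fastq_screen separately with the same options as in the original command, and the filtered outputs from each run merged with `combine_gzip` (assuming those outputs are gzipped). A suitable value for `reads_per_chunk` depends on the machine and is best found by trial.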