ncsa / NEAT

NEAT (NExt-generation Analysis Toolkit) simulates next-gen sequencing reads and can learn simulation parameters from real data.

Empty fastq file when running neat read-simulator + default config #108

Closed dani-ture closed 1 month ago

dani-ture commented 1 month ago

Describe the bug After running neat read-simulator and decompressing the fastq.gz file, I get an empty fastq file.

To Reproduce Steps to reproduce the behavior:

  1. Download the E. coli NCBI RefSeq assembly from the following link: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000005845.2/
  2. Make a copy of the provided template config file (I called it test_config.yml) and set the following parameters, leaving the rest at the default ".":
     reference: GCF_000005845.2_ASM584v2_genomic.fna
     ploidy: 1
     overwrite_output: true
  3. Run neat on the command line: neat --log-name test --log-detail HIGH --log-level DEBUG read-simulator -c test_config.yml -o test
  4. Decompress the fastq.gz file: gunzip test.fastq.gz
  5. Check that test.fastq file is empty
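Steps 4 and 5 can also be checked without decompressing the file by hand. The following is a minimal sketch, not part of NEAT: the helper name count_fastq_records is hypothetical, and it assumes the output is a standard gzipped FASTQ with exactly four lines per record.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file.

    Assumes the conventional 4-line FASTQ layout (header, sequence,
    separator, qualities); multi-line records would be miscounted.
    """
    with gzip.open(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    return line_count // 4
```

Calling count_fastq_records("test.fastq.gz") on the output of step 3 would return 0 for the empty file described in this report.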

Expected behavior I expected to have the simulated reads in test.fastq

Screenshots There are no warnings or errors. The info and debug lines appeared in stdout, and the file seems to be detected correctly: neat displays the count of bases in the genome. Maybe the issue lies in the process of writing to the file.


joshfactorial commented 1 month ago

I tested this on the new version (4.1.2), and I ran into a slightly different problem. It produced data in the fastq, but only like 80 records. I will investigate this. There's something not working right about fastq generation.

joshfactorial commented 1 month ago

I'll be adding this to things fixed in version 4.2. On my latest test, I ran against ecoli with coverage 10 ploidy 1 read length 151 and it came out to about 300,000 reads, which should be right. I just need to iron it out and that should be ready soon.
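As a rough sanity check on the 300,000 figure, the usual coverage relation (reads ≈ coverage × genome length / read length) can be sketched as below. The function name is hypothetical, the genome length of 4,641,652 bp for the E. coli K-12 MG1655 assembly is an assumption brought in from outside this thread, and NEAT's own read-count logic (for example with paired-end reads or a fragment model) may differ.

```python
def expected_read_count(genome_length, coverage, read_length):
    """Estimate the number of single-end reads needed for a target coverage."""
    return round(genome_length * coverage / read_length)


# Assumption: approximate length in bp of the GCF_000005845.2 chromosome.
ECOLI_K12_LENGTH = 4_641_652

estimate = expected_read_count(ECOLI_K12_LENGTH, coverage=10, read_length=151)
print(estimate)  # 307394
```

Under these assumptions the estimate is about 307,000 reads, which is in line with the roughly 300,000 reads reported above and far from the 80 to 160 records seen with version 4.1.2.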

dani-ture commented 1 month ago

I also tested version 4.1.2 and got 160 reads (which is definitely too low). 300,000 looks similar to the output you would get from a sequencing experiment with that coverage. Thank you so much for the support, and I'm looking forward to trying the new version!

joshfactorial commented 1 month ago

It should be ready this week!
