ncsa / NEAT

NEAT (NExt-generation Analysis Toolkit) simulates next-gen sequencing reads and can learn simulation parameters from real data.

'NoneType' object has no attribute 'quality_scores' #97

Closed ananya-0729 closed 3 months ago

ananya-0729 commented 4 months ago

Hi, I'm encountering a problem when trying to simulate read data. The content of neat_config.yml is as follows:

    reference: ecoli_genome.fna
    read_len: 250
    threads: 8
    coverage: 100
    avg_seq_error: 0.0

The log and traceback from the run:

2024-03-05 15:44:37,426:INFO:neat.read_simulator.runner:Using configuration file template_neat_config.yml
2024-03-05 15:44:37,426:INFO:neat.read_simulator.runner:Saving output files to .
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Run Configuration...
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Input fasta: Candida_genome.fasta
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Producing the following files:
        - /home/igib/read_simu/NEAT/candida_simulated.fastq.gz

2024-03-05 15:44:37,429:WARNING:neat.read_simulator.utils.options:Multithreading coming soon!!
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Single threading - 1 thread.
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Running in single-ended mode.
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Using a read length of 250
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Average coverage: 100
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Using default error model.
2024-03-05 15:44:37,429:INFO:neat.read_simulator.utils.options:Ploidy value: 2
2024-03-05 15:44:37,430:INFO:neat.read_simulator.utils.options:RNG seed value for run: 6028174716339031
2024-03-05 15:44:37,430:INFO:neat.read_simulator.runner:Reading Models...
2024-03-05 15:44:37,430:INFO:neat.read_simulator.runner:Reading Candida_genome.fasta.
2024-03-05 15:44:37,932:INFO:neat.read_simulator.runner:Beginning simulation.
2024-03-05 15:44:38,063:INFO:neat.read_simulator.runner:Generating variants for NC_072812.1
2024-03-05 15:44:38,100:ERROR:neat:read-simulator failed, see the traceback below
Traceback (most recent call last):
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/cli/cli.py", line 133, in main
    cmd(args)
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 332, in read_simulator_runner
    seq_error_model_2.quality_scores),
AttributeError: 'NoneType' object has no attribute 'quality_scores'
ERROR: read-simulator failed, showing the last error
Traceback (most recent call last):
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/cli/cli.py", line 133, in main
    cmd(args)
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/igib/miniforge3/envs/neat/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 332, in read_simulator_runner
    seq_error_model_2.quality_scores),
AttributeError: 'NoneType' object has no attribute 'quality_scores'

Thanks in advance!
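
For readers hitting the same traceback: the log reports single-ended mode, and the failing line reads `.quality_scores` from `seq_error_model_2`, which is `None`. A plausible reading (an assumption, not confirmed in this thread) is that the second-read error model is not created for single-ended runs but is still accessed. The sketch below is not NEAT's code; `ErrorModel` and `all_quality_scores` are hypothetical names that only illustrate the failure pattern and a guard against it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorModel:
    # Hypothetical stand-in for a sequencing error model; not NEAT's class.
    quality_scores: list[int]


def all_quality_scores(model_1: ErrorModel,
                       model_2: Optional[ErrorModel]) -> list[int]:
    """Collect the distinct quality scores from one or two error models.

    model_2 is assumed to be None for single-ended runs, so it is only
    read after an explicit None check.
    """
    scores = set(model_1.quality_scores)
    if model_2 is not None:
        scores.update(model_2.quality_scores)
    return sorted(scores)
```

Without the `is not None` check, passing `None` as the second model raises the same `AttributeError` shown in the traceback.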

joshfactorial commented 4 months ago

Not sure what branch this was on or what the config was, but try pulling the latest from Main. We pulled in some bug fixes that may have this covered. If not, please post some details like the options you were using so we can try to replicate.

ananya-0729 commented 3 months ago

I've added the parameters of neat_config.yml. The content of neat_config.yml is as listed:

    reference: ecoli_genome.fna
    read_len: 250
    threads: 8
    coverage: 100
    avg_seq_error: 0.0

joshfactorial commented 3 months ago

All right, I will try to replicate this week!

joshfactorial commented 3 months ago

We're working on a fix for this. I tested against E. coli with a coverage depth of 200 and a read length of 250 on one of the compute nodes at NCSA, and it ran for 10 hours without finishing. One more thing to note once the fix is finalized: our threading feature isn't working yet, but that is next on the to-do list. In the meantime, I would recommend running NEAT several times on the same file and combining the results to reach a coverage depth of 200, rather than trying to run it all in one go. I think it will end up running faster that way.
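
One way to combine the outputs of several runs is sketched below. It assumes each run produced a gzipped FASTQ with standard four-line records (no wrapped sequence lines) and that read names may repeat across runs, so each name gets a run-index suffix. The function name `merge_fastq_runs` and the file names in the usage note are hypothetical and are not part of NEAT.

```python
import gzip
from typing import Iterable


def merge_fastq_runs(inputs: Iterable[str], output: str) -> int:
    """Merge gzipped FASTQ files into one, tagging read names per run.

    Assumes four-line FASTQ records. Returns the number of reads written.
    """
    total = 0
    with gzip.open(output, "wt") as out:
        for run_idx, path in enumerate(inputs, start=1):
            with gzip.open(path, "rt") as handle:
                for line_no, line in enumerate(handle):
                    if line_no % 4 == 0:
                        # Header line "@name [description]": append the run
                        # index to the name so merged reads stay unique.
                        name, _, rest = line.rstrip("\n").partition(" ")
                        line = f"{name}_run{run_idx}"
                        if rest:
                            line += f" {rest}"
                        line += "\n"
                        total += 1
                    out.write(line)
    return total
```

For example, `merge_fastq_runs(["run1.fastq.gz", "run2.fastq.gz"], "combined.fastq.gz")` would merge two lower-coverage runs into a single file. The runs should use different RNG seeds; otherwise they may produce identical reads.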

joshfactorial commented 3 months ago

I believe this is fixed!