Closed NuriaQueralt closed 3 months ago
Apologies, this is a bug we're currently working through.
Thank you for the fast reply! Looking forward to rerunning it once this is fixed.
The error can be bypassed temporarily by adding
overwrite_output: True
to the config.yml.
There is also a typo in the README for the Targeted region simulation:
[contents of neat_config.yml]
reference: hg19.fa
read_len: 126
produce_bam: True
produce_vcf: True
paired_ended: True
fragment_mean: 300
fragment_st_dev: 30
targed_bed: hg19_exome.bed
should be
target_bed: hg19_exome.bed
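A misspelled option name like the one above is easy to miss by eye. As a minimal sketch (not part of NEAT), the following checks the keys of a flat `key: value` config against a list of known names and suggests the closest match. `KNOWN_KEYS` and `find_unknown_keys` are hypothetical; the list contains only the option names mentioned in this thread, and the options NEAT really accepts may differ.

```python
import difflib

# Assumption: only the option names that appear in this thread. The real
# set of options accepted by NEAT is larger and may differ by version.
KNOWN_KEYS = {
    "reference", "read_len", "produce_bam", "produce_vcf", "paired_ended",
    "fragment_mean", "fragment_st_dev", "target_bed", "off_target_scalar",
    "overwrite_output", "avg_seq_error",
}


def find_unknown_keys(config_text):
    """Return {unknown_key: closest_known_key_or_None} for 'key: value' lines."""
    suggestions = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not 'key: value'.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key = line.split(":", 1)[0].strip()
        if key not in KNOWN_KEYS:
            close = difflib.get_close_matches(key, sorted(KNOWN_KEYS), n=1)
            suggestions[key] = close[0] if close else None
    return suggestions


readme_config = """\
reference: hg19.fa
read_len: 126
targed_bed: hg19_exome.bed
"""
print(find_unknown_keys(readme_config))  # {'targed_bed': 'target_bed'}
```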
however it looks like NEAT will generate data for all the reference, not only for the bed file, am I correct?
if I set the reference to a FASTA file for the region in the bed file, it will run but I will get another error
self.errors = err_model.get_sequencing_errors(self.length, self.reference_segment,
File ".venv/lib/python3.10/site-packages/neat/models/models.py", line 413, in get_sequencing_errors
if self.rng.random() < self.quality_score_error_rate[quality_scores[i]]:
KeyError: <generator object bin_scores at 0x7f7941f53530>
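The `KeyError` above shows a generator object where a quality score was expected. The thread does not show the code that produced it, so the following is only a sketch of one way such an error can arise: a generator function is called, and the resulting lazy object is used as a dictionary key instead of the values it would yield. The `bin_scores` body and the `error_rate_by_score` table are hypothetical stand-ins, not NEAT's code.

```python
def bin_scores(scores):
    """Hypothetical stand-in: yield each score rounded down to a multiple of 10."""
    for s in scores:
        yield (s // 10) * 10


# Hypothetical lookup table keyed by binned quality score.
error_rate_by_score = {0: 0.5, 10: 0.1, 20: 0.01, 30: 0.001}
raw = [12, 37, 25]

lazy = bin_scores(raw)  # a generator object; nothing has been computed yet
try:
    error_rate_by_score[lazy]  # generators are hashable, so this is a KeyError
except KeyError as exc:
    message = repr(exc)  # mentions "<generator object bin_scores at 0x...>"

# Materializing the generator first gives integer keys that can be looked up.
binned = list(bin_scores(raw))
rates = [error_rate_by_score[q] for q in binned]
print(binned, rates)  # [10, 30, 20] [0.1, 0.001, 0.01]
```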
however it looks like NEAT will generate data for all the reference, not only for the bed file, am I correct?
By default the variants will be concentrated in the bed file areas, but there will still be some in the background (as well as sequencing errors). You can use the parameter off_target_scalar to adjust this. If you want no variants outside the bed, then you can set this to 0.0.
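As a sketch of what that looks like in practice, the snippet below renders the targeted-region config from earlier in the thread with `off_target_scalar` set to 0.0. The `render_config` helper is hypothetical and only prints flat `key: value` lines; the option names and values are the ones quoted in this thread.

```python
def render_config(options):
    """Render a flat dict as 'key: value' lines, one option per line."""
    return "".join(f"{key}: {value}\n" for key, value in options.items())


targeted_config = {
    "reference": "hg19.fa",
    "read_len": 126,
    "produce_bam": True,
    "produce_vcf": True,
    "paired_ended": True,
    "fragment_mean": 300,
    "fragment_st_dev": 30,
    "target_bed": "hg19_exome.bed",
    # Per the reply above: 0.0 means no variants outside the BED regions.
    "off_target_scalar": 0.0,
}

config_text = render_config(targeted_config)
print(config_text)
```

The printed text can be saved as the config file passed to `neat read-simulator -c`.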
if I set the reference to a FASTA file for the region in the bed file, it will run but I will get another error
Yeah, same, that's why this bug fix is taking me a minute.
In the meantime, you can try version 3.2. Apologies for the broken release...
OK, thanks. Just for your info, I managed to skip the previous error by adding
avg_seq_error: 0
to the config.yml but then I get a different error,
2023-10-31 18:53:43,982:ERROR:neat:read-simulator failed, see the traceback below
Traceback (most recent call last):
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/cli/cli.py", line 133, in main
cmd(args)
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
read_simulator_runner(arguments.config, arguments.output)
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 353, in read_simulator_runner
generate_reads(local_reference,
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 589, in generate_reads
read1.finalize_read_and_write(error_model_1, fq1, options.produce_fastq)
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 278, in finalize_read_and_write
self.quality_array = err_model.get_quality_scores(len(self.reference_segment))
File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/models/models.py", line 498, in get_quality_scores
self.rng.normal(self.quality_score_probabilities[i][0],
IndexError: index 151 is out of bounds for axis 0 with size 151
input_read_length length is 162 but quality_score_probabilities length is 151
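The mismatch in the traceback above (a 162-base segment indexed against 151 per-position entries) can be reproduced in isolation. This is a minimal sketch with hypothetical names and a plain Python list in place of NEAT's arrays; the clamping guard is one illustrative way to avoid the overrun, not the fix that was later pushed.

```python
import random


def make_position_model(read_len):
    """Hypothetical per-position quality model: one (mean, st_dev) pair per position."""
    return [(35.0, 2.0)] * read_len


def quality_scores(model, segment_len, rng):
    """Draw one score per base; raises IndexError if segment_len > len(model)."""
    return [round(rng.gauss(model[i][0], model[i][1])) for i in range(segment_len)]


def quality_scores_clamped(model, segment_len, rng):
    """Same draw, but positions past the end of the model reuse the last entry."""
    last = len(model) - 1
    return [round(rng.gauss(*model[min(i, last)])) for i in range(segment_len)]


rng = random.Random(0)
model = make_position_model(151)  # model length from the traceback

try:
    quality_scores(model, 162, rng)  # segment length from the traceback
    overrun = False
except IndexError:
    overrun = True

scores = quality_scores_clamped(model, 162, rng)
print(overrun, len(scores))  # True 162
```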
Yeah, it's related to the previous error. I have a bug in how NEAT is calculating quality scores for deletions. I will post a fix as soon as I can.
-Josh
From: Alberto Labarga @.> Sent: Wednesday, November 1, 2023 2:08 AM To: ncsa/NEAT @.> Cc: Allen, Josh @.>; Comment @.> Subject: Re: [ncsa/NEAT] Unable to run NEAT examples in the README (Issue #89)
I pushed a fix to the Develop branch. If you have time, can you test your code on that branch?
This should now be fixed on the main branch and in the current release. Please check out the newest version and open a new ticket if you have any further issues.
Describe the bug Running the 'whole genome simulation' example from the README raised this error:
raise ValueError("Bad mode %r" % mode)
ValueError: Bad mode 'xt'
To Reproduce Steps to reproduce the behavior:
Download the hg19.fa:
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa
Create a neat_config.yml file with the following content
reference: hg19.fa
read_len: 126
produce_bam: True
produce_vcf: True
paired_ended: True
fragment_mean: 300
fragment_st_dev: 30
Execute neat on the command line:
neat read-simulator \
    -c neat_config.yml \
    -o /home/your path/simulated_reads
See error: raise ValueError("Bad mode %r" % mode) ValueError: Bad mode 'xt'
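In Python's built-in `open`, mode `'x'` means exclusive creation: the call fails if the file already exists. The thread does not say which call raised `Bad mode 'xt'`, so the sketch below rests on an assumption: that the output file is opened through a wrapper that accepts only write and append modes. Under that assumption, the same "refuse to overwrite" behavior can be had by checking for the file first and then opening with `'wt'`. `open_text_writer` and `open_output` are hypothetical names, not NEAT functions.

```python
import os
import tempfile


def open_text_writer(path, mode):
    """Hypothetical wrapper that, like some file openers, rejects 'x' modes."""
    if mode not in ("w", "wt", "a", "at"):
        raise ValueError("Bad mode %r" % mode)
    return open(path, mode)


def open_output(path, overwrite):
    """Emulate exclusive creation without passing 'x' to the wrapper."""
    if not overwrite and os.path.exists(path):
        raise FileExistsError(path)
    return open_text_writer(path, "wt")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "simulated_reads.fastq")

    # Passing 'xt' straight through reproduces the message from the report.
    try:
        open_text_writer(target, "xt")
        error = None
    except ValueError as exc:
        error = str(exc)

    # First write succeeds; a second attempt without overwrite is refused.
    with open_output(target, overwrite=False) as handle:
        handle.write("@read1\n")
    try:
        open_output(target, overwrite=False)
        refused = False
    except FileExistsError:
        refused = True

print(error, refused)  # Bad mode 'xt' True
```

This would be consistent with the `overwrite_output: True` workaround mentioned in the comments, although that link is an inference and not something the thread confirms.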
Expected behavior To get the synthetic data as three different files: fastq, bam and vcf as it is stated in the README
Desktop (please complete the following information):