zstephens / neat-genreads

NEAT read simulation tools

SequenceContainer.py fails with missing attribute in Seq object #103

Open Npaffen opened 1 year ago

Npaffen commented 1 year ago

Describe the bug

When I run gen_reads.py with a custom VCF of variants to insert, in order to produce a golden BAM at reduced coverage, the pipeline breaks after reading the VCF.

My pipeline looks like this.

call = glue("python3 neat-genreads-master/gen_reads.py -r {reference} -R 150 -o data/1kg_hg37/bam/sample1/ --bam -v data/1kg_hg37/sample_1_chr.vcf -c 0.04")
system(call)
Using default sequencing error model.
Warning: Read length of error model (101) does not match -R value (150), rescaling model...
Using default gc-bias model.
found index hg19.fa.fai
--------------------------------
  reading input VCF...

found 4128997 valid variants in input vcf.
* 76719664 variants skipped: (qual filtered / ref genotypes / invalid syntax)
* 1524 variants skipped due to multiple variants found per position
--------------------------------
  reading chr1... 
175.558 (sec)
found 323780 valid variants for chr1 in input VCF...
161 variants skipped...
- [0] ref allele does not match reference
- [1] attempting to insert into N-region
- [160] alt allele contains non-ACGT characters
--------------------------------
  sampling reads...
Traceback (most recent call last):
  File "neat-genreads-master/gen_reads.py", line 901, in <module>
    main()
  File "neat-genreads-master/gen_reads.py", line 624, in main
    all_inserted_variants = sequences.random_mutations()
  File "neat-genreads-master/source/SequenceContainer.py", line 591, in random_mutations
    temp = self.sequences[i].tomutable()
AttributeError: 'Seq' object has no attribute 'tomutable'

joshfactorial commented 1 year ago

Please check that your version of biopython is the most recent. That error usually pops up because of an older version of biopython.
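
A quick way to confirm which Biopython version the interpreter actually sees is to query the package metadata from the same `python3` that runs gen_reads.py. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, not part of NEAT or Biopython.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run with the same interpreter that runs gen_reads.py.
    print("biopython:", installed_version("biopython"))
```

Running this with the interpreter used for the pipeline avoids confusion when several Python environments (for example conda `base` and `~/.local`) are installed side by side.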

Npaffen commented 1 year ago

I have the most recent Biopython version installed. See:

pip install biopython update
Requirement already satisfied: biopython in /home/nils/.local/lib/python3.10/site-packages (1.81)
Requirement already satisfied: update in /home/nils/.local/lib/python3.10/site-packages (0.0.1)
Requirement already satisfied: numpy in /home/nils/.local/lib/python3.10/site-packages (from biopython) (1.23.5)
Requirement already satisfied: style==1.1.0 in /home/nils/.local/lib/python3.10/site-packages (from update) (1.1.0)
(base) nils@Gigapepe2:$ python3 --version
Python 3.10.11

joshfactorial commented 1 year ago

Actually, what I would recommend is that, instead of this repo, which is no longer maintained, you check out our newest work at github.com/ncsa/NEAT. If you want to keep the same functionality as this version, check out the latest release of version 3. Otherwise, you can also try version 4, which is different. I believe we've resolved this bug in that more up-to-date repo.

joshfactorial commented 1 year ago

Unless Biopython overhauled their mutable sequence code again, in which case I'll have to do some more investigation. Try the latest release and open an issue on that repo if it still isn't working. Thanks!
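
For anyone who has to stay on this unmaintained repo, a possible local workaround is sketched below. It assumes, based on the traceback above, that the `Seq` object in the reporter's Biopython 1.81 no longer has `tomutable()`, and that constructing `MutableSeq` from a plain string is an acceptable substitute. `to_mutable` and its `factory` parameter are hypothetical names for this sketch, and it has not been tested against NEAT itself.

```python
def to_mutable(seq, factory=None):
    """Return a mutable copy of `seq`, with or without Seq.tomutable().

    `to_mutable` and `factory` are hypothetical names for this sketch;
    they are not part of NEAT or Biopython.
    """
    # If the object still provides tomutable(), keep the original behaviour.
    legacy = getattr(seq, "tomutable", None)
    if callable(legacy):
        return legacy()
    # Otherwise build the mutable sequence directly from the string form.
    if factory is None:
        from Bio.Seq import MutableSeq  # lazy import; requires biopython
        factory = MutableSeq
    return factory(str(seq))
```

Under these assumptions, the failing line in SequenceContainer.py (`temp = self.sequences[i].tomutable()`) could be changed to `temp = to_mutable(self.sequences[i])`. Other calls in the old code base may rely on similarly removed methods, so moving to github.com/ncsa/NEAT as suggested above is likely the more robust fix.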