HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

fragment length parameter not working #256

Closed Naturalist1986 closed 2 weeks ago

Naturalist1986 commented 4 months ago

Hi,

I've run this command: iss generate --draft *.fna --abundance uniform -n 10000000 --output mix_bacteria_fungi_150 --cpus 128 --model MiSeq --fragment-length 150 --fragment-length-sd 20

But I'm still getting 300 bp reads:

CCCCCGGGGGGGGGGGFGGGFGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCGGGGGGGGGGGGCFGGGGDGGGGGGGGGGGGGFGGGGGGGBEGEGGGGGFGGGGGGFGGGGFAGGGGGEFGFBGGGFGFGFGFGGGFEGGFFGGGFGGGG:EGGGGGGGGADF?CCECGGGCFGFGGGGFFGGGGGGGFCGFCCFGGGFG?GGFCG?CFE7FFF@9GGE,9FFA>29,9GG8CF>FCFDF:A9>F+/4F/FA7D;F<)9F1D2.F1FF0 @AE017341.1_3_0/1 CTACAACATGCGTTTGAAAagggcagcagcagcagcagtaatcccgccttctccagcaGCATACGCAGTGGTTTCGGCCGGCACAGCCTCGAACAGCTGCGGCAGAATGACTTTTAACTAGAGGCAGGGGGGGGGGGACGGAGTTCTCTGCTCGCTCATCATATATAATAGGAAAAATGGGCAATCGTGGATGCTTGTTATTTAGGATAACCAATAATGCGCAGGggcggaagagagagatgggaaaGGTACATAAGACCGGGTAGACAGGCTTTAGTACTCGCATGTCATCTATTTTATT + CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGFFGGGFGGFGFGGGGGGGGGCGGGGFGGGGGGGGGEGCGGGGGGGGGGGGGGGCGFBGEGGCG:FGFCGGFGFGGFGGGGGGFGGGGGGCGCG>BGG>G@FGABFG7EFCCGGGFF=GGEBGGEGGCG9EGGCG=GEE6EEG9G,GGG@GGGG>CFC?1,,+GCA<G:E73FFC5;C76D8BF/>3:F71FCFG)><F>F)1.77CF) @AE017341.1_4_0/1 CCACTGTCTAGCTGTCACCAAAGAAAACCACCCTGGAGCACGCATCAGCCCAAACACTTGCAGACTTGCAGTCAAGACACCAACTCACGTCACGGTGATATCGTCTCGCATCCACCTCGAAACCTTGCCGCCAAGGCTGACCAACTCGGCTCTCGTCTTGACATCGCCTCCCGCCAAACTGTTCCTGATCAAATGCGTCGCCGCATTCGCGTCGCCTTCGTACACCCACGTATCACTTGCCGCCTCGCCCGTCGGCTCAGGGAGGTCTTGTGCGGGGTATGGACGTTTTTCTCGCGGTGGC

killidude commented 3 months ago

Hi @HadrienG,

Fragment length is also not working for the NovaSeq model (or any other model, for that matter). Could you please look into this and fix it? Thank you.

Tomas

iss generate --genomes input.fasta --n_reads 10000 --cpus 8 --sequence_type amplicon --model novaseq --fragment-length 250 --fragment-length-sd 0 --output reads

@Potamotrygon_cf_orbignyi_0_0/2 GGTGGGTTTGGAGCACCGCCAAGTCCTTTAGGTTTTAAGCTAGCGCTTGTAGTGTTCTGGCGAATATTGAGGTAGGTTATTAATAGCTGTGTTTATGGTTAAGCATAGTAGGGTATCTAATCCTAGTTTGGGTCTTAACTGTCGTGAGGTC + FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF,FFF,FFF,FFFFFFFFF:FFF:FFFFFFFFFFFFFFF:F,FFFFFFFFFFFFFFFFFFFF,F:FFFFFFFFFFFFFFFFFFFFFFF @Potamotrygon_cf_orbignyi_1_0/2 GGTGGGTTTGGAGCACCGCCAAGTCCTTTAGGTTTTAAGCTAGCGCTTGTAGTGTTCTGGCGAATATTAAGGTAGGTTATTAATAGCTGTGTTTATGGTTAAGCATAGTAGGGTATCTAATCCTAGTTTGGGTCTTAACTGTCGTGAGGTC + FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFFFFFFFFFF:FFFF,FFFFFFFFFFFF

HadrienG commented 2 weeks ago

Hi,

the fragment length parameter does not modify the length of the reads, but rather the length of the simulated template sequences. The parameter therefore only influences the distribution of the insert size between the forward and reverse reads.
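To illustrate the distinction, here is a minimal toy sketch in Python. It is not InSilicoSeq's actual code: `simulate_pair` and `revcomp` are hypothetical helpers, and clamping the template so it is never shorter than one read is an assumption made only to keep the example simple. The point is that the read length is fixed separately (in iss it comes from the error model), while the fragment length only changes how far apart the two reads of a pair are on the template.

```python
import random


def revcomp(seq):
    """Reverse complement of a DNA string (hypothetical helper)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def simulate_pair(genome, read_length, fragment_length, fragment_length_sd, rng):
    """Toy paired-end simulation; NOT the InSilicoSeq implementation."""
    # Sample the template (fragment) length from a normal distribution.
    template_len = int(round(rng.gauss(fragment_length, fragment_length_sd)))
    # Assumption: clamp so the template holds at least one read and fits the genome.
    template_len = max(read_length, min(template_len, len(genome)))
    start = rng.randrange(len(genome) - template_len + 1)
    template = genome[start:start + template_len]
    # Read length is independent of the template length.
    forward = template[:read_length]
    reverse = revcomp(template[-read_length:])
    return template, forward, reverse


rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))

for frag in (400, 800):
    template, fwd, rev = simulate_pair(genome, 150, frag, 0, rng)
    # Columns: requested fragment length, template length, forward and reverse read lengths
    print(frag, len(template), len(fwd), len(rev))
# prints:
# 400 400 150 150
# 800 800 150 150
```

In this toy model, doubling the fragment length doubles the template while both reads stay at 150 bp, which matches the behaviour described above: reads keep the length defined by the model regardless of `--fragment-length`.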

Best, /Hadrien