karel-brinda / rnftools

RNF framework for NGS: simulation of reads, evaluation of mappers, conversion of RNF-compliant data.
http://karel-brinda.github.io/rnftools
MIT License
14 stars 5 forks

Mark random reads that are mixed in by simulator in LRN (genome ID) #62

Closed (sibyllewohlgemuth closed 7 years ago)

sibyllewohlgemuth commented 7 years ago

If random reads are generated by the read simulator (e.g. CuReSim -r) and these reads are marked in the read name produced by the original simulator, it would be nice if they had a different genome ID in the RNF name.

karel-brinda commented 7 years ago

Could you provide a specific Snakefile that you are using in this case? CuReSim random reads are switched off in RNFtools, see this line. If you want to simulate contamination, it should be added through a separate source.

I could add a random mode option to rnftools.mishmash.CuReSim. Then your Snakefile could look like this:

# non-random reads
rnftools.mishmash.CuReSim(
    fasta=fa,
    number_of_read_tuples=10000,
    read_length_1=100,
    read_length_2=0,
)
# random reads
rnftools.mishmash.CuReSim(
    fasta=fa,
    number_of_read_tuples=2000,
    read_length_1=100,
    read_length_2=0,
    random=True,
)

sibyllewohlgemuth commented 7 years ago

Yes, exactly. I wanted to mix in random contamination and I thought I could do it with

rnftools.mishmash.CuReSim(
    fasta=fa,
    number_of_read_tuples=10000,
    read_length_1=100,
    read_length_2=0,
    other_params='-r 1000',
)

A random mode option would be useful! Then I could add the contamination through a separate source, and it would have a different genome ID.

karel-brinda commented 7 years ago

@sibyllewohlgemuth I added this feature, see the curesim_random branch. Could you check if it does what you expect? Thanks.

Example:

rnftools.mishmash.CuReSim(
    fasta=fa,
    number_of_read_tuples=10000,
    read_length_1=100,
    read_length_2=0,
)

rnftools.mishmash.CuReSim(
    fasta=fa,
    number_of_read_tuples=10000,
    read_length_1=100,
    read_length_2=0,
    random_reads=True,
)

karel-brinda commented 7 years ago

You can use the following command to install RNFtools from that branch:

pip install --upgrade https://github.com/karel-brinda/rnftools/archive/curesim_random.zip

sibyllewohlgemuth commented 7 years ago

Hi Karel, I tried the random mode and it does what I expected! Thank you!
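
For anyone who wants to repeat this kind of check, one option is to tally the simulated reads by the genome ID embedded in their RNF names. The sketch below is not part of RNFtools. It assumes read names follow the RNF layout prefix__id__(genome_id,chromosome_id,direction,left,right),...__suffix, and the read names in it are hypothetical, made up for illustration only. The helper names genome_ids and count_genome_ids are likewise invented here.

```python
import re
from collections import Counter

# Assumed RNF segment layout: (genome_id,chromosome_id,direction,left,right)
SEGMENT_RE = re.compile(r"\((\d+),(\d+),([FRN]),(\d+),(\d+)\)")


def genome_ids(read_name):
    """Return the genome IDs of all segments found in one read name."""
    return [int(match.group(1)) for match in SEGMENT_RE.finditer(read_name)]


def count_genome_ids(read_names):
    """Count reads per genome ID, using the first segment of each name."""
    counts = Counter()
    for name in read_names:
        ids = genome_ids(name)
        if ids:  # skip names that contain no recognizable segment
            counts[ids[0]] += 1
    return counts


# Hypothetical read names, not taken from a real simulation
names = [
    "sim__00000001__(1,1,F,100,199)__[single-end]",
    "sim__00000002__(1,1,R,500,599)__[single-end]",
    "sim__00000003__(2,1,F,42,141)__[single-end]",
]

print(count_genome_ids(names))
```

If the two CuReSim sources are kept separate as in the example above, reads from the second source should show up under their own genome ID in such a tally.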