alyssafrazee / polyester

Bioconductor package "polyester", devel version. RNA-seq read simulator.
http://biorxiv.org/content/early/2014/12/12/006015

fix bug on readlen #59

Closed ghost closed 3 years ago

ghost commented 6 years ago

Hi @alyssafrazee,

This PR fixes the same error as #24, which occurs when readlen is specified in the function call:

Error in seq_gtf(gtf, seqpath, ...) : unused argument (readlen = 70)

The error also shows up in the functions simulate_experiment_empirical and simulate_experiment_countmat.
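For readers unfamiliar with this class of R error, here is a minimal sketch of how it can arise and one common way to avoid it. It is not polyester's actual code: seq_gtf_like, simulate_forward_all, simulate_filtered, and the fallback read length of 100 are hypothetical stand-ins chosen for illustration. The assumption is that a wrapper forwards all of `...` to a helper that does not declare readlen.

```r
# Hypothetical helper: like the seq_gtf call in the error message,
# it has no readlen argument.
seq_gtf_like <- function(gtf, seqpath) {
  paste("sequences from", gtf, "using", seqpath)
}

# Forwards everything in `...`, so readlen reaches a function that
# does not accept it and R raises "unused argument".
simulate_forward_all <- function(gtf, seqpath, ...) {
  seq_gtf_like(gtf, seqpath, ...)
}

# Forwards only the arguments the helper declares and keeps readlen
# for the wrapper's own use.
simulate_filtered <- function(gtf, seqpath, ...) {
  extras <- list(...)
  allowed <- extras[names(extras) %in% names(formals(seq_gtf_like))]
  # 100 is an illustrative fallback, not a documented default.
  readlen <- if (is.null(extras$readlen)) 100 else extras$readlen
  transcripts <- do.call(seq_gtf_like, c(list(gtf, seqpath), allowed))
  list(transcripts = transcripts, readlen = readlen)
}

# Capture the error message instead of stopping the script.
err <- tryCatch(
  simulate_forward_all("a.gtf", "chr22", readlen = 51),
  error = function(e) conditionMessage(e)
)
print(err)

res <- simulate_filtered("a.gtf", "chr22", readlen = 51)
print(res$readlen)
```

Filtering by the helper's formals keeps the wrapper's signature unchanged; whether the PR takes this approach or a different one is not stated in this thread.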

I hope this is helpful.

Thanks, Renee

alyssafrazee commented 6 years ago

Thanks so much for the PR! I'd be happy to merge; have you tested the change and confirmed that it produces the expected output? If you could paste small test cases here (or add tests), that would be amazing. Really appreciate it!

ghost commented 6 years ago

Sure, @alyssafrazee. Below is a test case using the example from the section 'Using real data to guide simulation' in your Bioconductor introduction.

> simulate_experiment_empirical(bg, grouplabels=pData(bg)$group, gtf=gtf, seqpath=chr22seq, mean_rps=5000, outdir='empirical_reads', seed=1247, readlen=51)

When I specified readlen in simulate_experiment_empirical(), I got this error:

Error in seq_gtf(gtf, seqpath, ...) : unused argument (readlen = 51)

This PR fixes the error, and the reads in the output files have the requested length of 51:

$ head empirical_reads/sample_01_1.fasta 
>read1/TCONS_00000017;mate1:138-188;mate2:308-357
GATTAAGAAAATTGTGCATTCAATTATATCATCCTTTGCATTTGGACTATT
>read2/TCONS_00000020;mate1:205-255;mate2:387-436
CCATGGGGGCCGCACGCAGCCCGCCGTCCGCTGTCCCGGGGCCCCTGCTGG
>read3/TCONS_00000024;mate1:249-299;mate2:438-487
CCTAATATTGTGAAGGCCATGTGCTAAATCCAGCAATCCGCTCCAGTAGGT
>read4/TCONS_00000024;mate1:279-329;mate2:494-543
CAGCAATCCGCTCCAGTAGGTGTCCTTCAGGATACTTTTCTGTCAAGTAAA
>read5/TCONS_00000024;mate1:190-240;mate2:388-437
ACATGTTCCGGTTATTATAGAGAAATTTTGTGGACAAGTATTACCAGGAGT
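
The read lengths in the output above can also be checked programmatically. The following is a small base-R sketch that assumes each FASTA record has exactly one sequence line, as in the excerpt; the variable names are illustrative, and the records are copied from the head output rather than read from disk.

```r
# Records copied from the head output above.
# For a real file, one could instead use:
#   fasta_lines <- readLines("empirical_reads/sample_01_1.fasta")
fasta_lines <- c(
  ">read1/TCONS_00000017;mate1:138-188;mate2:308-357",
  "GATTAAGAAAATTGTGCATTCAATTATATCATCCTTTGCATTTGGACTATT",
  ">read2/TCONS_00000020;mate1:205-255;mate2:387-436",
  "CCATGGGGGCCGCACGCAGCCCGCCGTCCGCTGTCCCGGGGCCCCTGCTGG",
  ">read3/TCONS_00000024;mate1:249-299;mate2:438-487",
  "CCTAATATTGTGAAGGCCATGTGCTAAATCCAGCAATCCGCTCCAGTAGGT",
  ">read4/TCONS_00000024;mate1:279-329;mate2:494-543",
  "CAGCAATCCGCTCCAGTAGGTGTCCTTCAGGATACTTTTCTGTCAAGTAAA",
  ">read5/TCONS_00000024;mate1:190-240;mate2:388-437",
  "ACATGTTCCGGTTATTATAGAGAAATTTTGTGGACAAGTATTACCAGGAGT"
)

expected_readlen <- 51

# Header lines start with ">"; every other line is a sequence.
is_header <- startsWith(fasta_lines, ">")
read_lengths <- nchar(fasta_lines[!is_header])

# TRUE only if every read has the requested length.
all_match <- all(read_lengths == expected_readlen)
print(table(read_lengths))
print(all_match)
```

This only checks the mate-1 file shown here; the same check could be run on the corresponding mate-2 file.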

Thanks, Renee