zstephens / neat-genreads

NEAT read simulation tools

random seed not working? #48

Closed astewart-twist closed 5 years ago

astewart-twist commented 6 years ago

I just generated 3 datasets with 3 different random seeds, and every pairwise diff between them comes back empty.

zstephens commented 6 years ago

Greetings, I'm unable to replicate this:

python genReads.py -r chr1_subset_small.fa -R 101 -o test1
python genReads.py -r chr1_subset_small.fa -R 101 -o test2
python genReads.py -r chr1_subset_small.fa -R 101 -o test3

MD5 (test1_read1.fq) = 73b188f3a48bb0e320df6e0c97543b92
MD5 (test2_read1.fq) = c351c251be6a2b44c00236fe8c5cc822
MD5 (test3_read1.fq) = 20fdd92c144c3355058eebec87aa7806

(and manually looking at the .fq files I see the reads are different)
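For anyone who wants to repeat this check on their own outputs, here is a minimal sketch in Python that hashes each file and reports any pair with identical contents. The helper names `md5_of_file` and `identical_pairs` are made up for illustration and are not part of NEAT; the file names simply follow the `-o` prefixes used above.

```python
import hashlib
from itertools import combinations
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical_pairs(paths):
    """Return every pair of paths whose contents hash to the same MD5."""
    digests = {str(p): md5_of_file(p) for p in paths}
    return [(a, b) for a, b in combinations(digests, 2) if digests[a] == digests[b]]


if __name__ == "__main__":
    # Assumed file names, based on the -o prefixes above; adjust as needed.
    outputs = [f"test{i}_read1.fq" for i in (1, 2, 3)]
    existing = [p for p in outputs if Path(p).exists()]
    for path in existing:
        print(f"MD5 ({path}) = {md5_of_file(path)}")
    print("identical pairs:", identical_pairs(existing))
```

An empty list of identical pairs means every output differs from the others, which is what the MD5 sums above show.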

Could you provide the command lines you used to generate the datasets you observed to be identical? Were you referring to FASTQ, BAM, or VCF output?

astewart-twist commented 6 years ago

Thanks for the response @zstephens - I do indeed get the same result as you with the above (equivalent) example. I believe I've tracked my observation down to a labeling error on my side: because of other parameters, the data in question were generated without any errors (though with different random seeds).

By the way - I see CLI arguments for setting error rate modifiers, but is there a way to see the error rates/distributions of the default models?