Open zhangpicb opened 1 year ago
Hi,
This is expected behaviour. Even though the RNG has been set with the right seed, you are using multi-threading causing the RNG to be accessed independently and unpredictably by multiple threads. If you analysis needs to be reproducible, you should run SHAPEIT5 with one thread.
The reason why IMPUTE5 has always the same output is because the SNP array imputation is actually deterministic and not stochastic (so no calls to the RNG).
Hope this helps, Simone
Dear @srubinacci
Thanks for your reply!
I set THREADS=1 in my scripts,and still get 2 different results after imputation.
Thanks in advanced!
Best, zhang
Hi,
Happy to help. Do you mean after phasing, with threads=1? This should not be the case, if the same seed is set. We always use the same RNG, so the stochastic calls should be reproducible now.
Can you please check carefully and if that is the case in your tests, and if so, report a reproducible example with some public data if that is the case? In my experiments, I have never experienced this problem with single thread.
Thanks
Simone
Dear @srubinacci @jaredo @diogomribeiro @odelaneau
This is reproducible problems with 1KGP data.
unphased chr22 data download from https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/cgi_variant_calls/
This is my code. And same Linux server get 2 different results.
nohup phase_common_static --input 1000genomes/20130502_supporting/cgi_variant_calls/filtered_calls/ALL.chr22.cgi_multi_sample.20130725.snps_indels.high_coverage_cgi.normalized.uniq.genotypes.gtonly.cr90.ic10.vcf.gz \
--map /tools/shapeit5/resources/maps/b37/chr22.b37.gmap.gz --output chr22.bcf --thread 1 --log chr22.log --filter-maf 0.001 \
--region 22 --output-format bcf && bcftools index -f chr22.bcf --threads 1 &
bcftools convert chr22.bcf -Oz -o chr22.vcf.gz
tabix -p vcf chr22.vcf.gz
zcat chr22.vcf.gz|grep -v ^# > chr22.txt
md5sum chr22.txt
The results,the last person are different
22 16555011 . C T . . AC=36;AN=866 GT 0|0 0|0 0|1 0|1 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|1 0|0 0|0 0|0 0|0 0|0 0|1 0|0 1|0
22 16555011 . C T . . AC=36;AN=866 GT 0|0 0|0 1|0 1|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 1|0 0|0 0|0 0|0 0|0 0|0 1|0 0|0 1|0
Thanks in advanced!
Dear @srubinacci @jaredo @diogomribeiro @odelaneau
Results are not reproducible. This is a bug or something else?
Thanks in advanced!
Dear @jaredo @diogomribeiro @odelaneau @srubinacci @rwk-unil
Thanks for this great tools.
I find that the shapeit5 outputs are different when I use same input data with same default seed 15052011.
How to make the shapeit5 results reproducible?
By the way, Impute5 output are same when I use same input .
Thanks in advanced!
Best, zhang