YuSugihara / QTL-seq

QTL-seq pipeline to identify causative mutations responsible for a phenotype

Suggestion #7

Closed by mdga337 3 years ago

mdga337 commented 4 years ago

Hello! This is my first time working with genomic data, and this tool has made my life extremely easy! I was just wondering whether it would be possible to also add the DP field in the bcftools mpileup step so that the VCF file can be processed with QTLseqR, or whether you have any plans to add the G statistic that it uses. The papers:

https://acsess.onlinelibrary.wiley.com/doi/10.3835/plantgenome2018.01.0006 (QTLseqR)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002255 (G statistics)

Also, not that it is needed, since I could probably figure out how to do it myself, but it would be nice to have an extra Python script that extracts the significant regions into a new VCF for use with snpEff (my genome data is not included, so I need to create a new dataset), or, if the snpEff option is turned on, extracts only the significant regions from the txt file that snpEff generates.
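Something along these lines is what I have in mind. It is only a minimal sketch: the function name `filter_vcf_lines` and the `(chrom, start, end)` region tuples are hypothetical, and the regions would have to be taken by hand from the significant windows, since nothing here parses the pipeline's own output files.

```python
def filter_vcf_lines(vcf_lines, regions):
    """Yield VCF header lines plus the records that fall inside any region.

    vcf_lines: iterable of text lines from an uncompressed VCF.
    regions:   iterable of (chrom, start, end), 1-based and inclusive
               (a hypothetical input format chosen for this sketch).
    """
    # Group the intervals by chromosome for quick lookup.
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    for line in vcf_lines:
        if line.startswith("#"):
            # Keep all meta-information and column header lines.
            yield line
            continue
        # Only CHROM and POS are needed, so split off the first two columns.
        chrom, pos_text = line.split("\t", 2)[:2]
        pos = int(pos_text)
        if any(start <= pos <= end for start, end in by_chrom.get(chrom, [])):
            yield line
```

With an open file handle `fin`, `fout.writelines(filter_vcf_lines(fin, regions))` would write the reduced VCF, which could then be passed to snpEff.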

I hope these comments find you well and help you improve this amazing tool even more!

Thanks!

MG

YuSugihara commented 3 years ago

Thank you for your suggestion.

I think the easiest way to add the DP field is to run the pipeline from FASTQ files, but a script that adds the DP field may be useful for some people.
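A minimal sketch of such a script is shown here, assuming the VCF already carries a per-sample AD field and that approximating DP as the sum of the AD counts is acceptable (a DP value reported by a variant caller may be defined differently). The function name `add_dp_from_ad` and the header text are hypothetical.

```python
# Header line to add next to the other ##FORMAT lines (description is illustrative).
DP_HEADER = '##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Sum of AD counts">\n'


def add_dp_from_ad(record_line):
    """Return a VCF record with FORMAT/DP appended as sum(AD) for each sample.

    Assumption: DP is approximated by the sum of the allelic depths.
    Records without sample columns, without AD, or already carrying DP
    are returned unchanged.
    """
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return record_line  # no FORMAT/sample columns
    keys = fields[8].split(":")
    if "AD" not in keys or "DP" in keys:
        return record_line
    ad_idx = keys.index("AD")
    fields[8] = ":".join(keys + ["DP"])

    for i in range(9, len(fields)):
        values = fields[i].split(":")
        if ad_idx < len(values):
            counts = [int(x) for x in values[ad_idx].split(",") if x != "."]
            dp = str(sum(counts)) if counts else "."
        else:
            dp = "."
        # Pad dropped trailing fields so DP lands in the right position.
        values += ["."] * (len(keys) - len(values))
        fields[i] = ":".join(values + [dp])
    return "\t".join(fields) + "\n"
```

Applied to every non-header line, with `DP_HEADER` written into the header, this would give each sample a DP value; whether that is enough for QTLseqR has not been checked here.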

In terms of the G statistic, I think the G statistic is Nei's Fst, right? Fst can calculate the variance between two populations without the parental sequence. I would like to adopt it in the future.
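For comparison, here is a rough sketch of a per-SNP G statistic, assuming the linked PLoS Computational Biology paper uses the standard G-test on a 2x2 table of reference and alternate allele counts in the two bulks. The function name and argument names are hypothetical, and the smoothing step that would follow in a full analysis is not included.

```python
import math


def g_statistic(ref_a, alt_a, ref_b, alt_b):
    """G-test statistic for a 2x2 table of allele counts in two bulks.

    G = 2 * sum(observed * ln(observed / expected)), where the expected
    counts come from the row and column totals under the null hypothesis
    of no allele-frequency difference between the bulks.
    """
    observed = [ref_a, alt_a, ref_b, alt_b]
    total = sum(observed)
    if total == 0:
        return 0.0  # no reads at this site

    bulk_a, bulk_b = ref_a + alt_a, ref_b + alt_b
    n_ref, n_alt = ref_a + ref_b, alt_a + alt_b
    expected = [
        bulk_a * n_ref / total,
        bulk_a * n_alt / total,
        bulk_b * n_ref / total,
        bulk_b * n_alt / total,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )
```

Identical allele counts in both bulks give a value of zero, and the value grows as the allele frequencies in the two bulks diverge.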

Thank you