Open jenkelly10 opened 4 years ago
Same question. Did you find a solution to this?
Hello! Thank you for this wonderful program! I think it will be invaluable for working with rare variant data! I have a quick question... when running the script for counting controls and working with the gnomad.exome.r.2.1.1.vcf data, how do we indicate a subpopulation of the data? For instance, if I want to work only with the control-nfe data or the non-cancer_nfe data rather than all NFE samples? Thanks so much for your time and consideration!
Do you know how to restrict the analysis to NFE samples only? I did not understand how that is done with the downloaded gnomAD VCF file.
One plausible approach I could think of: if a position has been called in a given subpopulation, its allele number should be greater than zero, i.e. controls_NFE_AN > 0 & (controls_NFE_AC >= 0 | controls_NFE_AF >= 0). This could be used to subset the VCF file to variants present in that population.
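For anyone who wants to try the filter above, here is a minimal Python sketch, not something taken from the program itself. The function names are made up for illustration, and the default INFO key `controls_AN_nfe` is an assumption: check the `##INFO` lines in your VCF header for the exact key names in your download. Note that AC >= 0 and AF >= 0 hold for any site that has counts, so the AN > 0 test does the real work. To require that the variant was actually observed in the subset, pass an AC key as well, which tests AC > 0.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag-only entries map to True."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def int_value(info, key):
    """Return the integer stored under key, or None if missing or not an integer."""
    value = info.get(key)
    if not isinstance(value, str):
        return None
    try:
        return int(value)
    except ValueError:
        return None


def filter_vcf_lines(lines, an_key="controls_AN_nfe", ac_key=None):
    """Yield header lines unchanged, plus data lines where AN > 0 for the subset.

    an_key and ac_key are assumed INFO key names; verify them in the header.
    If ac_key is given, the line must also have AC > 0 for that key.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF data line
        info = parse_info(fields[7])
        an = int_value(info, an_key)
        if an is None or an <= 0:
            continue
        if ac_key is not None:
            ac = int_value(info, ac_key)
            if ac is None or ac <= 0:
                continue
        yield line
```

The function accepts any iterable of text lines, so it can be fed from `gzip.open(path, "rt")` for a compressed file and the kept lines written to a new file. Multi-allelic sites with comma-separated AC values are skipped by the AC test in this sketch, so split them beforehand if that matters.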
I found an easy way to select a subpopulation. If you are using gnomAD data, add the parameter '-d gnomad' in the "Counting carriers in public control cohorts" step; if you are using ExAC data, add '-d exac' instead. If the parameter is not given, the code treats the input data as generic, which leaves the --pop parameter with no effect. I hope this helps.