WGLab / PennCNV

Copy number variation detection from SNP arrays
http://penncnv.openbioinformatics.org

Obtain Population frequency of B allele Axiom_UKB_WCSG #36

Closed npatel22526 closed 5 years ago

npatel22526 commented 5 years ago

Hello Dr. Wang,

I am trying to use PennCNV with the Axiom_UKB_WCSG array, but I am not sure which PFB file is the right one to use in step 1.3 of http://penncnv.openbioinformatics.org/en/latest/user-guide/affy/.

I checked the file "Axiom_GW_HumanOrigin.hg38" provided with PennCNV, but when I overlapped its SNP IDs with those in the annotation files from Axiom, only 70K IDs overlapped.

I have also downloaded a file (Axiom_UKB_WCSG.na35.af_supp.tab) from the Affymetrix website that contains allele frequencies from "1KGp1_mar2012 and Affymetrix_internal_screen". The file contains frequencies for various populations; below are a few column names from the header:

Probe Set ID | Affy SNP ID | Freq_A ASW | Freq_B ASW | Heterozygosity ASW | Number ASW | Minor_Allele ASW | MAF ASW | Freq_A CEU | Freq_B CEU

Is this the right file from which to obtain B allele frequencies for the PFB file?

Best, Nick

kaichop commented 5 years ago

You can use compile_pfb.pl to generate this file yourself after you have all signal files.
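
For readers who want an intuition for what "generating the PFB from signal files" amounts to, here is a minimal sketch in Python. It assumes that a marker's population frequency of the B allele can be approximated by the mean B allele frequency (BAF) of that marker across all samples; that is an assumption about the general idea, not a description of compile_pfb.pl's actual code, and the function name, marker names, and input layout are hypothetical.

```python
import math

def mean_baf_per_marker(samples):
    """Average BAF per marker across samples.

    samples: list of dicts mapping marker name -> BAF (float or None).
    Missing, NaN, or out-of-range values are skipped (an assumption).
    """
    totals, counts = {}, {}
    for sample in samples:
        for name, baf in sample.items():
            if baf is None or math.isnan(baf) or not 0.0 <= baf <= 1.0:
                continue
            totals[name] = totals.get(name, 0.0) + baf
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}

# Toy data with hypothetical marker names, for illustration only.
toy_samples = [
    {"snp1": 0.0, "snp2": 0.5},
    {"snp1": 1.0, "snp2": 0.5},
    {"snp1": 0.5, "snp2": float("nan")},
]
approx_pfb = mean_baf_per_marker(toy_samples)
```

For actual CNV calling, the PFB should be produced with compile_pfb.pl itself; the sketch is only meant to show why the signal files for all samples are needed first.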


npatel22526 commented 5 years ago

Dr. Wang, thanks for the quick reply. I read about the script "compile_pfb.pl", but as you stated, it uses individual intensity files, which are explained in step 1.2 of the documentation. However, I couldn't find a description in that step of how to generate the intensity files for Axiom data.
Should I extract the signal intensity for each sample from "AxiomGT1.summary.txt" generated in Step 1?

Best, Nick

kaichop commented 5 years ago

I do not have experience with Axiom, but the instructions on the website are based on other people's experience. You can just follow the instructions to get the signal data, normalize it, and then use it. You should not need to do any extraction yourself. For the PFB file, you can simply create one yourself, using 0.5 as the allele frequency for all SNPs. The purpose is only to provide coordinate information for SNPs. Later on, after you have all signal files, you can then generate a more accurate PFB file for CNV calling.
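
As an illustration of the placeholder PFB described above, the following Python sketch writes a tab-delimited file with the columns Name, Chr, Position, and PFB, setting PFB to 0.5 for every marker. The function name, the marker IDs, and the coordinates are hypothetical; in practice the names and positions would come from the array's annotation file, which is not parsed here.

```python
import csv
import io

def write_placeholder_pfb(markers, handle, freq=0.5):
    """Write a tab-delimited PFB table with one constant B allele frequency.

    markers: iterable of (name, chromosome, position) tuples.
    handle:  any writable text file object.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Name", "Chr", "Position", "PFB"])
    for name, chrom, pos in markers:
        writer.writerow([name, chrom, int(pos), freq])

# Hypothetical marker IDs and coordinates, for illustration only.
example_markers = [
    ("AX-0000001", "1", 10583),
    ("AX-0000002", "1", 20250),
    ("AX-0000003", "X", 150000),
]

buffer = io.StringIO()
write_placeholder_pfb(example_markers, buffer)
pfb_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with `open("placeholder.pfb", "w", newline="")` as `handle`. Once signal files exist for all samples, this placeholder should be replaced by a PFB generated from the data.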


npatel22526 commented 5 years ago

Thank you very much again for the prompt reply. This is helpful; I understand how to proceed now.

Best, Nick

oshintogla commented 2 years ago

Dr. Wang, I am working with the Illumina 770k HD chip for 96 cattle samples. I am not able to generate PFB files for my data because I do not understand the exact structure of the input files. Could you please explain it clearly? Please also specify the minimum number of samples from which a PFB file can be generated. How do I generate the HMM and GC model files for this data, or can the ones given in the module be used? Thank you!