xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary

GNU General Public License v3.0

VCF2GDS #35

Closed by tytrhr 7 months ago

tytrhr commented 8 months ago

Hello, may I ask whether the VCF file needs to be split by chromosome before VCF2GDS processing?

xihaoli commented 8 months ago

Hi @tytrhr,

Thanks for your question. You are correct. STAARpipeline takes genotype data by chromosome to analyze large sequencing data, and thus you may split VCF files by chromosome before VCF2GDS processing.
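
The per-chromosome split described above can be sketched in shell as follows. This is only an illustration, not part of the tutorial: it assumes bcftools is installed, that the input is a single bgzipped VCF covering all chromosomes, and that contigs are named chr1 to chr22. The names `run`, `split_by_chr` and `all.vcf.gz` are hypothetical, and only the autosomes are covered.

```shell
#!/usr/bin/env bash
# Sketch: split one bgzipped, multi-chromosome VCF into per-chromosome files.
# Assumes bcftools is installed; all file and function names are hypothetical.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# split_by_chr INPUT.vcf.gz [PREFIX]
# PREFIX must match the contig names in the VCF header ("chr" or "").
split_by_chr() {
  local input="$1" prefix="${2-chr}" chr
  # Region queries (-r) need an index on the input file.
  run bcftools index --tbi --force "$input"
  for chr in $(seq 1 22); do
    # Write chromosome $chr as a compressed VCF (-O z) named chrN.vcf.gz.
    run bcftools view -r "${prefix}${chr}" -O z -o "chr${chr}.vcf.gz" "$input"
  done
}
```

Calling `DRY_RUN=1 split_by_chr all.vcf.gz` prints the commands without running them, which is a cheap way to check the contig prefix first. If the VCF header names contigs 1 to 22 without a prefix, pass an empty second argument: `split_by_chr all.vcf.gz ""`.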

Best, Xihao

tytrhr commented 8 months ago

Thanks for your reply. I found a script that can carry out this step: "Rscript convertVCF2GDS.R import.format vcf base.filename 2 chr22.vcf.gz chr2.vcf.gz". Is this command set up correctly? Also, since the VCF file has been split, does each per-chromosome file need to be processed separately? When I run this script, the results are merged into a single file, base.filename.gds. Looking forward to your reply, thank you very much!
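
One way to get a separate GDS file per chromosome, rather than a single merged file, is to run the conversion once per input file. The sketch below does not use the script quoted above, whose arguments are not documented in this thread. It assumes instead that the SeqArray R package is installed and calls its `seqVCF2GDS(vcf.fn, out.fn)` function directly. To my understanding, that function writes one output file even when given several input VCFs, which would be consistent with the merged result described above. The names `run` and `convert_by_chr` and the `chrN.vcf.gz` / `chrN.gds` file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: convert each per-chromosome VCF to its own GDS file.
# Assumes R with the SeqArray package; file naming is hypothetical.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# convert_by_chr CHR [CHR ...]
# For each chromosome N, reads chrN.vcf.gz and writes chrN.gds.
convert_by_chr() {
  local chr
  for chr in "$@"; do
    run Rscript -e "SeqArray::seqVCF2GDS('chr${chr}.vcf.gz', 'chr${chr}.gds')"
  done
}
```

For the two files in the command above this would be `convert_by_chr 2 22`, producing chr2.gds and chr22.gds. Prefixing the call with `DRY_RUN=1` prints the Rscript commands without executing them.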

xihaoli commented 8 months ago

Hi @tytrhr, do you have all VCF files split by each chromosome already?

tytrhr commented 8 months ago

Hi Xihao, yes, I have. Thank you very much for your reply! I would also like to ask how to download the FAVORannotator database correctly on a Linux system. I found the database at https://favor.genohub.org, listed for example as "chr1 CSV (31.2 GB) Download". If I right-click "Download", copy the link address and fetch it with wget, the file I get is 6170506 and cannot be used. If I click "Download" directly and then upload the result to the Linux system, the file is chr1.tar.gz and works correctly. I don't know what causes this. Can you give me some suggestions?

xihaoli commented 8 months ago

Hi @tytrhr,

Thanks for letting me know. Both ways would result in the same correct file. If you are using wget for download, please feel free to follow this R script for the next step of FAVORannotator.
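
Since both routes yield the same file, the wget download should only need a proper name before it is extracted. The sketch below checks that the file is a gzip stream, renames it and unpacks it. It assumes the archive is a gzipped tar, as the chr1.tar.gz name from the browser download suggests. The function name `unpack_favor_download` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: rename a wget download (e.g. "6170506") and extract it.
# Assumes the file is a gzipped tar archive; the function name is hypothetical.

# unpack_favor_download FILE NAME.tar.gz
unpack_favor_download() {
  local src="$1" dest="$2" magic
  # A gzip stream starts with the two bytes 1f 8b.
  magic="$(head -c 2 "$src" | od -An -tx1 | tr -d ' \n')"
  if [ "$magic" != "1f8b" ]; then
    echo "not a gzip file: $src" >&2
    return 1
  fi
  mv "$src" "$dest"
  # Extract into the current directory.
  tar -xzf "$dest"
}
```

Usage would be `unpack_favor_download 6170506 chr1.tar.gz`. Alternatively, wget's `-O` option names the output at download time, e.g. `wget -O chr1.tar.gz "$URL"`, where `$URL` stands for the address copied from the "Download" link.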

Best, Xihao