knausb / vcfR

Tools to work with variant call format files

Convert vcfR object to tabular format #181

Closed. Al-Murphy closed this 3 years ago

Al-Murphy commented 3 years ago

Hi,

Thank you for a great package. I was wondering if you have incorporated a function to convert from the summary statistics vcfR object to a standard tabular format? I'm hoping to integrate summary statistics from older GWAS which wouldn't have a VCF.

Kind regards, Alan.

knausb commented 3 years ago

Hello,

I'm afraid I do not understand your question. The VCF format is a defined specification (http://samtools.github.io/hts-specs/). Tabular data would be whatever anyone can insert into a table. In R this would be a matrix or a data.frame. We have the function extract.gt() to pull elements out of the gt slot. Please let me know if this was what you were looking for.

Al-Murphy commented 3 years ago

Hey @knausb,

Thank you very much for your quick reply. However, I'm afraid this isn't what I'm looking for. Essentially, what I'm trying to do is take a vcfR object and convert it to a matrix/data.frame/data.table/txt file with something like the following information:

       CHROM   POS         ID REF ALT      ES     SE       LP     AF
    1:     1 10583 rs58108140   G   A   0.011 0.0488 0.969267 0.5693
    2:     1 10583 rs58108140   G   A   0.011 0.0488 0.969267 0.5693
    3:     1 10583 rs58108140   G   A   0.011 0.0488 0.969267 0.5693
    4:     1 10583 rs58108140   G   A   0.011 0.0488 0.669267 0.5693
    5:     1 30923   rs806731   G   T -0.0012 0.0435 0.426854 0.2384
   ---

This is something I can do by just reading a VCF file into R and removing all lines starting with '##', which leaves this information. Obviously, if there is already a method to do this inside vcfR, I would rather use that.
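
For anyone following along, the workaround described above (drop the '##' meta lines, then read what is left as a tab-delimited table) could be sketched roughly as below. This is an illustration in plain Python, not anything provided by vcfR; the same logic carries over to R. The helper names `vcf_to_rows` and `expand_format`, the sample column name `study1`, and the example text are all hypothetical. It also assumes the per-variant statistics (ES, SE, LP, AF) sit in a single sample column keyed by the FORMAT column, which may not match every file.

```python
import csv
import io


def vcf_to_rows(vcf_text):
    """Drop '##' meta lines and parse the rest as a tab-delimited table."""
    # Keep only the '#CHROM ...' header line and the data lines.
    body = [ln for ln in vcf_text.splitlines() if ln and not ln.startswith("##")]
    if not body or not body[0].startswith("#"):
        raise ValueError("expected a '#CHROM' header line after the '##' meta lines")
    body[0] = body[0].lstrip("#")  # '#CHROM' -> 'CHROM'
    reader = csv.DictReader(
        io.StringIO("\n".join(body)), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


def expand_format(row, sample):
    """Split one sample column into named fields using the FORMAT keys."""
    keys = row["FORMAT"].split(":")
    values = row[sample].split(":")
    flat = {k: row[k] for k in ("CHROM", "POS", "ID", "REF", "ALT")}
    flat.update(zip(keys, values))
    return flat


# Hypothetical one-variant example; QUAL, FILTER, INFO and the sample
# name 'study1' are placeholders made up for illustration.
example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tstudy1\n"
    "1\t10583\trs58108140\tG\tA\t.\tPASS\t.\tES:SE:LP:AF\t0.011:0.0488:0.969267:0.5693\n"
)
table = [expand_format(r, "study1") for r in vcf_to_rows(example)]
print(table[0])
```

All values come back as strings, so numeric columns would still need converting before any analysis.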

knausb commented 3 years ago

Hi Alan,

You appear to be asking how to take VCF data and make it non-VCF. Because of this I see this as outside the scope of vcfR. It sounds like you've found a solution so there is nothing here to be done. In R the "apply" functions are useful and efficient ways to "apply" a function over many rows. Perhaps that may help.

Good luck! Brian