MRCIEU / TwoSampleMR

R package for performing 2-sample MR using MR-Base database
https://mrcieu.github.io/TwoSampleMR

problems with outcome data #130

Closed meghanmorrison closed 5 years ago

meghanmorrison commented 5 years ago

Hello,

I am trying to carry out Mendelian randomisation using a single SNP. How would I extract a single SNP from my own summary statistics outcome data (i.e. data that is not available online)? The exposure SNP I am using is in the GWAS catalog, and I have successfully followed the steps in MR-Base to extract it from there. However, I cannot work out how to extract the same SNP from my own data to use as the outcome SNP. Any suggestions?

Thank you for your time, it's greatly appreciated!

meghanmorrison commented 5 years ago

Using the example data for illustration, I think I have managed to get this command to work:

outcome_dat <- read_outcome_data(
    snps = bmi_exp_dat$SNP,
    filename = "gwas_summary.csv",
    sep = ",",
    snp_col = "rsid",
    beta_col = "effect",
    se_col = "SE",
    effect_allele_col = "a1",
    other_allele_col = "a2",
    eaf_col = "a1_freq",
    pval_col = "p-value",
    units_col = "Units",
    gene_col = "Gene",
    samplesize_col = "n"
)

However, when I get to the Harmonise Data step, I can't get this to work:

dat <- harmonise_data( exposure_dat = bmi_exp_dat, outcome_dat = chd_out_dat )

For the previous command to work, I first need the following command to work, but I cannot figure out how:

chd_out_dat <- extract_outcome_data( snps = bmi_exp_dat$SNP, outcomes = 7 )

I hope this makes sense. I have used the example data in my explanation to make it clearer. Thank you for your help and your time!

explodecomputer commented 5 years ago

hey can you send the error that you get please?

meghanmorrison commented 5 years ago

Hello, thank you for your reply. Since my outcome data can't be found in the GWAS catalog, I'm not sure what "outcomes" (array of IDs; see the id column in the output from available_outcomes) should be. Does this make sense?

chd_out_dat <- extract_outcome_data( snps = bmi_exp_dat$SNP, outcomes = 7 )

I tried simply leaving the outcomes argument out, but got this error:

Error in unique(outcomes) : argument "outcomes" is missing, with no default

Thanks again for your time!

explodecomputer commented 5 years ago

If you have local outcome GWAS data, then this is the correct function to use to read in the SNP you want:

outcome_dat <- read_outcome_data(
    snps = bmi_exp_dat$SNP,
    filename = "gwas_summary.csv",
    sep = ",",
    snp_col = "rsid",
    beta_col = "effect",
    se_col = "SE",
    effect_allele_col = "a1",
    other_allele_col = "a2",
    eaf_col = "a1_freq",
    pval_col = "p-value",
    units_col = "Units",
    gene_col = "Gene",
    samplesize_col = "n"
)

You would specify your own parameters for the file that you have. Check that it reads in the SNP. If it doesn't, that implies the SNP is not present in your outcome dataset.

After that, run the harmonise_data function on your exposure and outcome datasets, passing the object returned by read_outcome_data as outcome_dat. There is no need to call extract_outcome_data for local data.
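
For a single instrument, the harmonise step is typically followed by a Wald ratio estimate (outcome effect divided by exposure effect). The following is a minimal sketch rather than a definitive recipe: the rsid, alleles, and effect sizes are made-up illustration values, `exposure_raw` and `outcome_raw` are hypothetical names, and the TwoSampleMR part only runs if the package is installed.

```r
# Made-up single-SNP summary statistics (illustration only, not real data).
exposure_raw <- data.frame(
  SNP = "rs0000001", beta = 0.10, se = 0.01,
  effect_allele = "A", other_allele = "G", eaf = 0.30, pval = 1e-20,
  stringsAsFactors = FALSE
)
outcome_raw <- data.frame(
  SNP = "rs0000001", beta = 0.02, se = 0.005,
  effect_allele = "A", other_allele = "G", eaf = 0.31, pval = 1e-4,
  stringsAsFactors = FALSE
)

# Wald ratio by hand, with the simple first-order standard error.
wald_b  <- outcome_raw$beta / exposure_raw$beta
wald_se <- outcome_raw$se / abs(exposure_raw$beta)

# The same analysis through TwoSampleMR, guarded so the sketch still runs
# when the package is not available.
if (requireNamespace("TwoSampleMR", quietly = TRUE)) {
  library(TwoSampleMR)
  exposure_dat <- format_data(exposure_raw, type = "exposure")
  outcome_dat  <- format_data(outcome_raw, type = "outcome")
  # Both inputs are local objects; no outcome ID from the database is needed.
  dat <- harmonise_data(exposure_dat = exposure_dat, outcome_dat = outcome_dat)
  res <- mr(dat, method_list = "mr_wald_ratio")
  print(res[, c("method", "nsnp", "b", "se", "pval")])
}
```

The point of the sketch is that harmonise_data only needs two formatted data frames, so locally read outcome data can be passed in directly.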

meghanmorrison commented 5 years ago

I'll give this a try - thank you so much!

explodecomputer commented 5 years ago

Can you show me what the first few lines of your Stage1_results.txt file look like?

explodecomputer commented 5 years ago

If you just read the table in using

library(data.table)  # provides fread
a <- fread("Stage1_results.txt", header=TRUE, sep=",")
head(a)

does it give something sensible? If it does, then you can extract the SNP you want and pass that object to format_data:

b <- subset(a, MarkerName == "your_rsid")
format_data(b, ...arguments...)
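
As a rough sketch of the two lines above with the arguments spelled out: the column names (`MarkerName`, `Allele1`, `Allele2`, `Effect`, `StdErr`, `P.value`), the rsids, and the values are assumptions for illustration, since the actual layout of Stage1_results.txt is not shown in this thread. Base `read.csv` is used here instead of `fread` only to keep the example free of extra dependencies. Note that `type = "outcome"` is needed because format_data formats exposure data by default.

```r
# Write a tiny made-up file so the example is self-contained
# (hypothetical column names and values).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "MarkerName,Allele1,Allele2,Effect,StdErr,P.value",
  "rs0000001,A,G,0.02,0.005,0.0001",
  "rs0000002,C,T,-0.01,0.004,0.0124"
), tmp)

a <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)

# Keep only the SNP of interest.
b <- subset(a, MarkerName == "rs0000001")

# Zero rows here would mean the SNP is absent from the outcome data.
stopifnot(nrow(b) == 1)

# Map the file's columns onto the names TwoSampleMR expects; guarded so the
# sketch still runs when the package is not installed.
if (requireNamespace("TwoSampleMR", quietly = TRUE)) {
  outcome_dat <- TwoSampleMR::format_data(
    b,
    type = "outcome",
    snp_col = "MarkerName",
    beta_col = "Effect",
    se_col = "StdErr",
    effect_allele_col = "Allele1",
    other_allele_col = "Allele2",
    pval_col = "P.value"
  )
  print(outcome_dat[, c("SNP", "beta.outcome", "se.outcome", "pval.outcome")])
}
```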

meghanmorrison commented 5 years ago

Thank you for your help!