MRCIEU / TwoSampleMR

R package for performing 2-sample MR using MR-Base database
https://mrcieu.github.io/TwoSampleMR

no data when extracting ubm datasets #158

Open SvenGenAm opened 4 years ago

SvenGenAm commented 4 years ago

I am trying to extract instruments from a ubm dataset with all <- extract_instruments(outcomes="ubm-a-449"). However, all does not contain any data, and other ubm datasets do not work either. Other traits work fine, and I am logged in.

What am I doing wrong?

Addition after downloading a VCF and checking manually: the ubm files contain -log10 p-values. Does that mean the p-value filtering does not work?
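To illustrate the point about -log10 p-values, here is a minimal base R sketch with made-up numbers (the vectors and the 5e-8 threshold are assumptions for illustration, not values taken from the ubm files). If a p-value column actually holds -log10(p), a filter of the form p < threshold keeps nothing, because significant hits turn into large numbers.

```r
# Hypothetical toy data: true p-values for five variants
p_true <- c(1e-12, 3e-9, 4e-6, 0.02, 0.7)

# What a p-value column would contain if -log10(p) had been stored instead
p_stored <- -log10(p_true)

threshold <- 5e-8

# Intended filter on real p-values: two variants pass
n_expected <- sum(p_true < threshold)

# Same filter on the mis-stored values: nothing passes, because
# -log10(p) of a significant hit is a large number (above about 7.3)
n_observed <- sum(p_stored < threshold)

# Back-transforming restores the intended behaviour
p_fixed <- 10^(-p_stored)
n_fixed <- sum(p_fixed < threshold)

print(c(expected = n_expected, observed = n_observed, fixed = n_fixed))
```

With these toy values the intended filter keeps 2 variants, the filter on the mis-stored column keeps 0, and the back-transformed column keeps 2 again.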

Additional question on the 'ebi' datasets: is there a reason the eaf is missing? It was present in previous GWAS datasets (and it is shared in other UKB datasets).

explodecomputer commented 4 years ago

The issue here is that this trait just didn't have any GWAS-significant hits. Unfortunately, this is the case for most of those brain MRI traits. For the EBI datasets, we copied across their harmonised data, so if they don't provide eaf then we don't include it. However, you can assume that everything in those datasets is on the forward strand, so eaf isn't strictly required for harmonising (harmonise_data(a, b, action=1)).
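A rough sketch of the harmonise_data(a, b, action=1) suggestion follows. It assumes the TwoSampleMR package is installed, network access is available and you are logged in. The helper name harmonise_forward and its arguments are hypothetical, and the dataset IDs are whatever exposure and outcome you are working with. The p1 argument of extract_instruments is the p-value threshold, so an empty result at the default 5e-8 is reported rather than treated as an error.

```r
# Sketch only: harmonise_forward is a hypothetical helper, not part of TwoSampleMR.
harmonise_forward <- function(exposure_id, outcome_id, p1 = 5e-8) {
  if (!requireNamespace("TwoSampleMR", quietly = TRUE)) {
    stop("Please install the TwoSampleMR package first")
  }

  # Instruments for the exposure; may be NULL or empty if no variant passes p1
  exposure_dat <- TwoSampleMR::extract_instruments(outcomes = exposure_id, p1 = p1)
  if (is.null(exposure_dat) || nrow(exposure_dat) == 0) {
    message("No variants below p1 = ", p1, " for ", exposure_id)
    return(NULL)
  }

  # Look up the same SNPs in the outcome dataset
  outcome_dat <- TwoSampleMR::extract_outcome_data(
    snps = exposure_dat$SNP,
    outcomes = outcome_id
  )
  if (is.null(outcome_dat) || nrow(outcome_dat) == 0) {
    message("No matching SNPs found in ", outcome_id)
    return(NULL)
  }

  # action = 1: assume all alleles are coded on the forward strand,
  # so allele frequencies are not needed to resolve palindromic SNPs
  TwoSampleMR::harmonise_data(exposure_dat, outcome_dat, action = 1)
}
```

Calling harmonise_forward("ubm-a-449", outcome_id) with an outcome ID of your choice would then return the harmonised data frame, or NULL with a message when nothing passes the threshold.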

explodecomputer commented 4 years ago

Sorry, I didn't understand the question about the VCF files.

SvenGenAm commented 4 years ago

So even if you put the cutoff at 1e5, you do not get a subset of the data? Do I understand that correctly? Thank you, Sven

explodecomputer commented 4 years ago

Hi Sven, thanks for this information. I have finally had a chance to look into these data. It looks like the p-values that we downloaded were logged, so the database has them stored incorrectly. We've disabled these datasets temporarily and are regenerating them. They'll be done soon, and I will let you know when they're up. We're also generating reports for every dataset so that this kind of thing can be caught more easily. Thanks again for the info.
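As an illustration of how logged p-values could be caught with a simple check, here is a minimal base R sketch. The helper pvals_look_logged is hypothetical and is not the report mentioned above. It relies only on the fact that a valid p-value lies between 0 and 1.

```r
# pvals_look_logged is a hypothetical helper, not part of TwoSampleMR or ieugwasr.
# A valid p-value lies in [0, 1]; any value above 1 suggests the column
# holds something else, such as -log10(p).
pvals_look_logged <- function(p) {
  p <- p[!is.na(p)]  # ignore missing values
  any(p > 1)
}

pvals_look_logged(c(0.5, 1e-9, 0.03))  # FALSE: all values are plausible p-values
pvals_look_logged(c(0.3, 8.5, 12.1))   # TRUE: values above 1 cannot be p-values
```

A check like this only flags values above 1, so it would miss a logged column in which every -log10(p) happens to be below 1.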

SvenGenAm commented 4 years ago

No problem, your conclusion was the same as mine. Is there a description of all the columns in the table returned by ao <- available_outcomes()?

Sorry if I missed it and it is already online.
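In the absence of a written description, the columns can at least be listed directly in R. The sketch below uses a hypothetical helper, summarise_columns, that works on any data frame. Applying it to the result of available_outcomes() assumes TwoSampleMR is installed and the API is reachable.

```r
# summarise_columns is a hypothetical helper: one row per column,
# giving the column name and the class of its contents.
summarise_columns <- function(df) {
  data.frame(
    column = names(df),
    class = vapply(df, function(x) class(x)[1], character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Usage with TwoSampleMR (needs network access, so it is left commented out):
# ao <- TwoSampleMR::available_outcomes()
# summarise_columns(ao)
```

This lists the column names and types only; it does not say what each column means.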

explodecomputer commented 4 years ago

@SvenGenAm We've restored the ubm-a data batch in the database now. Please try again using the ieugwasr or TwoSampleMR R packages. The VCF files are still being copied across; we will let you know when they are updated.
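A minimal sketch of retrying with ieugwasr is shown here. It assumes the ieugwasr package is installed and the API is reachable. The helper recheck_tophits is hypothetical, and the p-value column is assumed to be named p in the returned table. An empty result is still possible if the trait has no hits below the threshold.

```r
# recheck_tophits is a hypothetical helper, not part of ieugwasr.
recheck_tophits <- function(id = "ubm-a-449", pval = 5e-8) {
  if (!requireNamespace("ieugwasr", quietly = TRUE)) {
    stop("Please install the ieugwasr package first")
  }

  # Clumped top hits for the dataset at the given p-value threshold
  hits <- ieugwasr::tophits(id = id, pval = pval)
  if (is.null(hits) || nrow(hits) == 0) {
    message("No hits below pval = ", pval, " for ", id)
    return(invisible(NULL))
  }

  # Assumption: the p-value column is called "p".
  # Real p-values must lie in [0, 1]; anything else suggests logged values.
  stopifnot(all(hits$p >= 0 & hits$p <= 1))
  hits
}
```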