eQTL-Catalogue / eQTL-Catalogue-resources

Request for Information on sQTL "*.all.tsv.gz" Download Locations #35

Closed yangchuhua closed 11 months ago

yangchuhua commented 1 year ago

To Whom It May Concern,

First and foremost, I would like to express my gratitude for your effort in creating and maintaining the eQTL Catalogue, which has been an invaluable resource for the research community.

I am currently working on a project that requires xQTL data, specifically the ".all.tsv.gz" files for the following quantification methods (quant_method): exon, transcript, txrevise and Leafcutter. While I have been able to locate the ".all.tsv.gz" files for eQTLs on the EBI FTP server (http://ftp.ebi.ac.uk/pub/databases/spot/eQTL/sumstats/), I have not been able to find the corresponding files for the xQTL types mentioned above.

Could you please provide guidance on where I can download the "*.all.tsv.gz" files for these xQTLs? Any assistance you can provide would be greatly appreciated, as it would significantly aid my research project.

Thank you in advance for your help. I look forward to hearing from you soon.

Best regards, Chuhua

kauralasoo commented 1 year ago

Dear Chuhua,

Unfortunately, we are unable to share the .all.tsv.gz files for the transcript-level quantification methods (exon, transcript, txrevise, Leafcutter) due to their very large size (10+ TB).

What is the specific use case that you have in mind? If you are interested in colocalisation, then you should be able to use the .cc.tsv.gz files, which contain the full summary statistics for those molecular traits that had at least one significant QTL.
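
For readers who want to try this, here is a minimal sketch of loading a .cc.tsv.gz file grouped by molecular trait, which is the shape most colocalisation tools expect. It assumes the file is a gzipped, tab-separated table with a header row and a column named `molecular_trait_id`. The column name, the function name `group_by_trait` and any local file path are assumptions, not details given in this thread.

```python
import csv
import gzip
from collections import defaultdict


def group_by_trait(path, trait_column="molecular_trait_id"):
    """Stream a gzipped TSV and collect its rows per molecular trait.

    `trait_column` is an assumed column name; adjust it to match the
    header of the file you actually downloaded.
    """
    groups = defaultdict(list)
    # "rt" opens the gzip stream in text mode so csv can parse it line by line.
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            groups[row[trait_column]].append(row)
    return dict(groups)
```

Each value in the returned dictionary is the list of summary-statistic rows for one molecular trait, so traits can be passed to a colocalisation method one at a time. The sketch holds the whole file in memory, so very large files may need to be processed trait by trait instead.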

Best wishes, Kaur

yangchuhua commented 1 year ago

Dear Kaur,

Thank you very much for your reply.

PredictDB has developed MASHR-based prediction models for sQTLs, based on the GTEx v8 release data (https://predictdb.org/post/2021/07/21/gtex-v8-models-on-eqtl-and-sqtl/).

I am interested in building MASHR-based models using your comprehensive summary statistics of sQTLs (Leafcutter).

Would you be able to share the ".all.tsv.gz" files corresponding to the following datasets?

study_id dataset_id study_label sample_group tissue_id tissue_label condition_label sample_size quant_method
QTS000011 QTD000089 FUSION muscle_naive UBERON_0001134 muscle naive 288 leafcutter
QTS000011 QTD000094 FUSION adipose_naive UBERON_0001013 adipose naive 271 leafcutter
QTS000019 QTD000377 Lepik_2017 blood UBERON_0000178 blood naive 471 leafcutter
QTS000029 QTD000538 TwinsUK fat UBERON_0001013 adipose naive 381 leafcutter
QTS000029 QTD000548 TwinsUK skin UBERON_0002097 skin naive 370 leafcutter
QTS000029 QTD000553 TwinsUK blood UBERON_0000178 blood naive 195 leafcutter
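
A list like the one above can also be selected programmatically from a dataset metadata table. The following is a minimal sketch that parses the rows exactly as posted in this comment and keeps those with a given `quant_method`. The function name `select_datasets` is hypothetical, and the table is embedded as a string rather than read from any official metadata file.

```python
# Dataset rows copied from this comment, whitespace-separated as posted.
DATASETS = """\
study_id dataset_id study_label sample_group tissue_id tissue_label condition_label sample_size quant_method
QTS000011 QTD000089 FUSION muscle_naive UBERON_0001134 muscle naive 288 leafcutter
QTS000011 QTD000094 FUSION adipose_naive UBERON_0001013 adipose naive 271 leafcutter
QTS000019 QTD000377 Lepik_2017 blood UBERON_0000178 blood naive 471 leafcutter
QTS000029 QTD000538 TwinsUK fat UBERON_0001013 adipose naive 381 leafcutter
QTS000029 QTD000548 TwinsUK skin UBERON_0002097 skin naive 370 leafcutter
QTS000029 QTD000553 TwinsUK blood UBERON_0000178 blood naive 195 leafcutter
"""


def select_datasets(table_text, quant_method):
    """Return the table rows (as dicts) whose quant_method matches."""
    lines = table_text.strip().splitlines()
    header = lines[0].split()
    # No field in this table contains spaces, so a plain split() is enough.
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [row for row in rows if row["quant_method"] == quant_method]


leafcutter = select_datasets(DATASETS, "leafcutter")
dataset_ids = [row["dataset_id"] for row in leafcutter]
print(dataset_ids)
```

The resulting `dataset_ids` list identifies the six Leafcutter datasets requested here.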

If sharing these files is not possible, I kindly request that you run the script (https://github.com/stephenslab/gtexresults/blob/master/workflows/fastqtl_to_mash.ipynb) on your ".all.tsv.gz" files and share the resulting R objects of the MASH input generated by the script.

Best wishes, Chuhua

kauralasoo commented 1 year ago

Dear Chuhua,

Yes, that should be possible. Please send me an email at kaur.alasoo@ut.ee and we can arrange data transfer.

Best, Kaur

yangchuhua commented 1 year ago

Dear Kaur,

Thank you very much.

I have emailed you at kaur.alasoo@ut.ee.

Best, Chuhua