AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Dataset Request GSE48624 #3510

Open sjwright04 opened 2 months ago

sjwright04 commented 2 months ago

Hi, my team and I are trying to do a project using the dataset linked below. We have tried downloading the dataset multiple times, but none of the downloads included the expression matrix .tsv file; all we are getting is the .json and .tsv metadata files. We hope the expression matrix is available, as we cannot do our project without it. https://www.refine.bio/experiments/GSE48624/the-effect-of-listening-to-music-on-human-transcriptome

davidsmejia commented 2 months ago

Hello, and thanks for opening this issue. I'm an engineer at the lab, and I was able to reproduce the problem: I received a success email, but the zip I downloaded was also missing the expression matrix. I looked at the logs for the job that compiled the download, and it looks like there is a problem with the data in this experiment. I will open a separate issue to track the fix, but I can't say exactly when the changes will be deployed and this dataset made available.

sjwright04 commented 2 months ago

Ok, thanks for the quick response. Do you think there is a chance it could be fixed by Wednesday of next week? No worries if not, we can choose a different data set for our project, but we found this one particularly interesting.

davidsmejia commented 2 months ago

I don't expect this to be available on refine.bio by Wednesday. However, we aren't processing this experiment from raw data, so if the quantile normalization and gene identifier conversion aren't important to you, you can download the expression data from GEO directly.

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48624
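
For anyone who wants to script the GEO download suggested above, here is a minimal Python sketch. It assumes the experiment publishes a single series matrix file under GEO's usual FTP layout (`series/GSE48nnn/GSE48624/matrix/`); series that span several platforms use differently named files, so check the directory listing if the request fails. The function names are illustrative and are not part of refine.bio or any GEO client library. The values you get are whatever the submitter uploaded, without refine.bio's quantile normalization or gene identifier conversion.

```python
import csv
import gzip
import urllib.request


def series_matrix_url(accession: str) -> str:
    """Build the conventional GEO FTP URL for a series matrix file.

    Assumption: the series has one matrix file named
    <accession>_series_matrix.txt.gz.
    """
    digits = accession[3:]  # strip the "GSE" prefix
    # GEO groups series into directories by replacing the last
    # three digits of the accession number with "nnn".
    stub = "GSE" + digits[:-3] + "nnn"
    return (
        "https://ftp.ncbi.nlm.nih.gov/geo/series/"
        f"{stub}/{accession}/matrix/{accession}_series_matrix.txt.gz"
    )


def parse_series_matrix(text: str) -> list[list[str]]:
    """Return the tab-separated rows between the table markers.

    The first returned row is the header (ID_REF plus sample IDs).
    """
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
        elif line.startswith("!series_matrix_table_end"):
            break
        elif in_table:
            # csv.reader also strips the double quotes GEO puts around IDs.
            rows.append(next(csv.reader([line], delimiter="\t")))
    return rows


def download_expression(accession: str) -> list[list[str]]:
    """Download and parse the series matrix (requires network access)."""
    with urllib.request.urlopen(series_matrix_url(accession)) as response:
        raw = gzip.decompress(response.read())
    return parse_series_matrix(raw.decode("utf-8", errors="replace"))
```

Calling `download_expression("GSE48624")` should return the header row followed by one row per probe, which you can then write out as a .tsv or load into a data frame.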