uc-bd2k / GREIN

GREIN : GEO RNA-seq Experiments Interactive Navigator
https://shiny.ilincs.org/grein
GNU General Public License v2.0

The following datasets are not complete in terms of sample size. #28

Open Dr3753 opened 3 months ago

Dr3753 commented 3 months ago

GREIN is a fantastic tool for exploring RNA-seq data, and I greatly appreciate it. However, certain datasets include only 20 samples each in GREIN, which does not match the sample size listed in GEO. This may be a bug. Could the following datasets be re-processed?

GSE184941 GSE190504 GSE180280 GSE183947 GSE189757

GSE146009 GSE162960 GSE165255 GSE183984 GSE107422

GSE179746 GSE158420 GSE171415 GSE142441 GSE172356

GSE181273 GSE133626 GSE147493 GSE179252 GSE184336

GSE113255 GSE126304 GSE127165 GSE142083 GSE173855

GSE112026 GSE179351

Thank you very much!

Mario-Medvedovic commented 3 months ago

Thank you for pointing this out. We were aware that there were occasional cases when this happened, but did not know there were so many of them. We will re-run these.

I am curious, how did you compile the list? Is this an exhaustive list of datasets with this problem, or are they just datasets that you were interested in and that happened to be problematic?

Dr3753 commented 3 months ago

@Mario-Medvedovic Thank you for your reply! I am an oncologist and I am currently systematically searching for RNA data of tumors. My search criteria include: 1) human solid tumor tissue, 2) bulk RNA-seq, and 3) data from 2009 onwards. I have checked the number of datasets available from the 171 included studies and have compiled the list above. Although I only need four of them, I thought it would be appropriate to report all of them to you. By the way, this platform is very helpful, and I have recommended it to my colleagues. They all agree.

Mario-Medvedovic commented 3 months ago

Thank you for the info. As I said, I will re-run these. I will also run a comprehensive check over all datasets. It is very gratifying to hear that somebody like yourself finds GREIN useful.

Mario-Medvedovic commented 2 months ago

All datasets in the list above have now been re-processed, and all except one (GSE184336) are available in GREIN. For GSE184336, our pipeline failed to extract the fastq files. We intend to troubleshoot, but this may take a while.

Dr3753 commented 2 months ago

@Mario-Medvedovic Thank you for the update on the reprocessing of the datasets. It was completed much faster than expected, and it has already helped a lot. There isn't much I can help with, but I will continue to report any bugs I encounter in the future as a form of support. Thank you again!