Closed by mfazel 8 months ago
Hello, Thanks for the note -- we are checking on the datasets you listed. Michal
Hello, The number of datasets in progress refers to all available datasets in GEO that we haven't had a chance to process yet, and we are processing them as fast as we can. They are in a lower-priority queue so that user-submitted datasets get priority. As for the pipeline, it should be working. Your particular examples might not have data in SRA. Let us know if you think otherwise and we'll dig deeper into what might be going on. Thanks, Michal
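For anyone who wants to check the SRA point on their own, here is a minimal sketch that asks NCBI E-utilities (ESearch) how many SRA records match a GEO series accession. It is not part of GREIN and says nothing about how the GREIN pipeline itself decides; the helper names (`build_sra_query_url`, `sra_record_count`, `has_sra_data`) are made up for illustration, and it assumes that searching the `sra` database with the GSE accession as the term finds the linked runs, which may not hold for every series.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_query_url(gse_accession):
    """Build an ESearch URL that searches the SRA database for a GSE accession."""
    params = {"db": "sra", "term": gse_accession, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def sra_record_count(esearch_json_text):
    """Extract the hit count from an ESearch JSON response body."""
    payload = json.loads(esearch_json_text)
    return int(payload["esearchresult"]["count"])


def has_sra_data(gse_accession, timeout=30):
    """Return True if ESearch reports at least one SRA record for the accession.

    Assumption: the accession string is a sufficient search term for SRA.
    This performs a live network request to NCBI.
    """
    with urlopen(build_sra_query_url(gse_accession), timeout=timeout) as resp:
        return sra_record_count(resp.read().decode("utf-8")) > 0
```

Calling `has_sra_data("GSE125422")` or `has_sra_data("GSE159067")` would then give a rough first answer for the examples mentioned in this thread. A result of `False` only means the search found nothing, not that the series definitely has no raw data.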
Hi Michal, I checked a few other datasets, and none of the reasons given when a dataset fails after being submitted to the processing queue are true, same as for the examples above. I came up with a possible reason: I know GREIN uses a list of datasets from GEO, and any dataset that is not in that list won't be processed. That list may not have been updated during at least the past year, or the update is not automated. One way to find out is to check which datasets are being processed right now, if any, and then look at their release dates in GEO; probably none are from 2022 or even 2021. This is just my guess, but it is likely to be true. Thanks
Hi guys, I was wondering why the number of in-progress datasets is larger than the number of processed ones (plot on the first page). I noticed that something has probably changed and the pipeline can no longer download data from GEO. I checked several recently published datasets on GEO, and when I tried to analyze them, either they had already been tried by someone else and failed, or, if I submitted them for analysis, they failed at the download step shortly after metadata extraction (e.g. GSE125422, GSE159067...). It seems that many dataset names and metadata have been added to the database list but failed to download and process. Any idea?