pepkit / geofetch

Builds a PEP from SRA or GEO accessions
https://pep.databio.org/geofetch/
BSD 2-Clause "Simplified" License

No files found and No data to save when trying to download GSE files #118

Closed Yijia-Jiang closed 1 year ago

Yijia-Jiang commented 1 year ago

Hello, I am trying to use geofetch to download GSE files. I used the following command:

  geofetch -i GSE202280 --processed --geo-folder data/geotmp

I got the error shown below:

  Metadata folder: /GSE202280
  Trying GSE202280 (not a file) as accession...
  Skipped 0 accessions. Starting now.
  Processing accession 1 of 1: 'GSE202280'
  Found previous GSE file: GSE202280/GSE202280_GSE.soft
  Found previous GSM file: GSE202280/GSE202280_GSM.soft
  Total number of processed SERIES files found is: 4
  Expanding metadata list...
  Expanding metadata list...
  Finished processing 1 accession(s)
  No files found. No data to save. File /Users/GSE202280/GSE202280_samples/GSE202280_samples.csv won't be created

I also found that the problem only occurs when I try to fetch processed data. Can you help?

khoroshevskyi commented 1 year ago

Hi! GSE202280 has no files attached to its samples (the Samples (GSM) section), which is why you see this warning: No files found. No data to save. File /Users/GSE202280/GSE202280_samples/GSE202280_samples.csv won't be created

To download processed files from the Series, add the --data-source series argument. To download data from both the Series and the Samples, use --data-source all. From geofetch --help:

  --data-source {all,samples,series}
                        Optional: Specifies the source of data on the GEO
                        record to retrieve processed data, which may be
                        attached to the collective series entity, or to
                        individual samples. Allowable values are: samples,
                        series or both (all). Ignored unless 'processed' flag
                        is set. [Default: samples]

In your case, the full command is:

  geofetch -i GSE202280 --processed --data-source series --geo-folder data/geotmp

Additionally, if you want to download metadata from the samples (even when no files are attached to the Samples), run geofetch without the --processed flag.
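If you script these calls, the flag combinations above can be sketched as a small Python helper. This is only an illustration: build_geofetch_cmd is a hypothetical name, not part of geofetch, and it uses only the flags mentioned in this thread (-i, --processed, --data-source, --geo-folder). It assembles the argument list without running anything.

```python
import shlex
from typing import List, Optional

# Allowed values, taken from the `geofetch --help` excerpt quoted in this thread.
DATA_SOURCES = ("all", "samples", "series")


def build_geofetch_cmd(
    accession: str,
    processed: bool = False,
    data_source: Optional[str] = None,
    geo_folder: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build a geofetch argument list (does not execute it)."""
    cmd = ["geofetch", "-i", accession]
    if processed:
        cmd.append("--processed")
        # Per the help text, --data-source is ignored unless --processed is set,
        # so it is only emitted together with --processed.
        if data_source is not None:
            if data_source not in DATA_SOURCES:
                raise ValueError(f"data_source must be one of {DATA_SOURCES}")
            cmd += ["--data-source", data_source]
    if geo_folder is not None:
        cmd += ["--geo-folder", geo_folder]
    return cmd


# Processed files attached to the Series (the case in this issue).
series_cmd = build_geofetch_cmd(
    "GSE202280", processed=True, data_source="series", geo_folder="data/geotmp"
)
print(shlex.join(series_cmd))
# geofetch -i GSE202280 --processed --data-source series --geo-folder data/geotmp

# Metadata only: no --processed flag.
print(shlex.join(build_geofetch_cmd("GSE202280")))
# geofetch -i GSE202280
```

To actually run a command, the list could be passed to subprocess.run(series_cmd, check=True), assuming geofetch is installed and on your PATH.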

Let us know if this solves your problem.

khoroshevskyi commented 1 year ago

I assume this problem has been solved. Please reopen the issue if it hasn't.