Open fbastian opened 6 years ago
Also, I see that we use SRA IDs to build the GEO links in the download files, but this doesn't work. For example, ftp://ftp.bgee.org/bgee_v14_0/download/processed_expr_values/rna_seq/Mus_musculus/Mus_musculus_RNA-Seq_experiments_libraries.tar.gz contains the link http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=ERX012344, which, from what I understand, should actually be https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30617.
Do we have the information necessary during download file generation to fix this problem @smoretti?
We use SRX, ERX and DRX identifiers to ease downloads with the SRA Toolkit, so the direct link for those identifiers should point to SRA, e.g. https://www.ncbi.nlm.nih.gov/sra/?term=ERX012344
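A minimal sketch of how the SRA link could be built from an experiment accession, assuming the URL pattern given above. The names `SRA_BASE_URL` and `sra_link` are hypothetical and are not taken from the Bgee pipeline.

```python
import re

# URL pattern taken from the comment above; the constant name is an assumption.
SRA_BASE_URL = "https://www.ncbi.nlm.nih.gov/sra/?term="

# Experiment accessions: SRX (NCBI), ERX (ENA) or DRX (DDBJ), followed by digits.
SRA_EXPERIMENT_RE = re.compile(r"[SED]RX\d+")


def sra_link(accession: str) -> str:
    """Return the SRA link for an SRX/ERX/DRX accession (hypothetical helper)."""
    acc = accession.strip()
    if not SRA_EXPERIMENT_RE.fullmatch(acc):
        # Refuse anything else (e.g. a GEO series ID) rather than emit a bad link.
        raise ValueError(f"not an SRA experiment accession: {accession!r}")
    return SRA_BASE_URL + acc


print(sra_link("ERX012344"))
```

Rejecting non-SRA accessions up front means a GEO identifier such as GSE30617 can never end up in an SRA URL by accident.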
The column with the GEO link should be removed to make the file simpler.
Only the SRA link (after URL correction) should remain.
Implement automatic verification of all links provided in the download files (we had problems with outdated URLs and missing files that we only discovered after the files were released).
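One possible shape for such a check, as a rough sketch only: extract every http, https and ftp URL from the text of a download file and report those that cannot be opened. The function names, the URL regular expression and the 10-second timeout are assumptions for illustration, not existing Bgee code.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, List

# Simple URL pattern (an assumption; adjust to the actual file format).
URL_RE = re.compile(r"(?:https?|ftp)://[^\s\"'<>]+")


def extract_urls(text: str) -> List[str]:
    """Return the unique URLs found in text, in order of first appearance."""
    urls = (match.rstrip(".,;)") for match in URL_RE.findall(text))
    return list(dict.fromkeys(urls))


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Try to open the URL; HEAD is used for HTTP(S) to avoid fetching the body."""
    if url.startswith("http"):
        request = urllib.request.Request(url, method="HEAD")
    else:
        request = urllib.request.Request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError, ValueError):
        return False


def find_broken_links(text: str,
                      check: Callable[[str], bool] = is_reachable) -> List[str]:
    """Return the URLs in text for which the check function fails."""
    return [url for url in extract_urls(text) if not check(url)]
```

The `check` parameter allows the network call to be replaced in tests. Note that reachability alone may not catch a link that resolves but carries the wrong identifier, so the ID type would probably need a separate format check.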