theislab / single-cell-tutorial

Single cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Damaged tar archive #91

Closed sudoyang closed 2 years ago

sudoyang commented 2 years ago

Hey guys,

I tried several times to download the data with the command below:

wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_RAW.tar

But I always got a corrupted tar file. I tried FileZilla, wget, and aria2c, but nothing works. Do you have a working copy of the file? It would be great if you could share it with me. Much appreciated!

$ tar -xvf GSE92332_RAW.tar
GSM2836573_Regional_Duo_M1_barcodes.tsv.gz
GSM2836573_Regional_Duo_M1_genes.tsv.gz
GSM2836573_Regional_Duo_M1_matrix.mtx.gz
tar: Skipping to next header
GSM2836577_Regional_Il_M1_barcodes.tsv.gz
GSM2836577_Regional_Il_M1_genes.tsv.gz
GSM2836577_Regional_Il_M1_matrix.mtx.gz
tar: Skipping to next header
GSM2839449_Atlas9_barcodes.tsv.gz
GSM2839449_Atlas9_genes.tsv.gz
GSM2839449_Atlas9_matrix.mtx.gz
tar: Skipping to next header
GSM2839461_RO_Control_Rep1_matrix.mtx.gz
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

$ gunzip Regional
gzip: GSM2836573_Regional_Duo_M1_barcodes.tsv: unknown suffix -- ignored
gzip: GSM2836573_Regional_Duo_M1_genes.tsv: unknown suffix -- ignored
gzip: GSM2836573_Regional_Duo_M1_matrix.mtx.gz: invalid compressed data--format violated
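For anyone who wants to check a downloaded copy before extracting it, here is a minimal Python sketch using only the standard library. The function name `check_archive` is made up for illustration and is not part of the tutorial. It walks the tar, fully decompresses every `.gz` member in memory, and reports which members read cleanly and which fail. One caveat: `tarfile` may stop at a damaged header without raising an error, so it is worth comparing the number of members reported against the file list on the GEO page as well.

```python
import gzip
import tarfile
import zlib


def check_archive(path):
    """Return (ok, failed) for the members of a tar archive.

    ok     -- names of members that could be read to the end
    failed -- (name, error message) pairs for members that could not
    """
    ok, failed = [], []
    try:
        with tarfile.open(path, "r") as tar:
            for member in tar:
                if not member.isfile():
                    continue
                fh = tar.extractfile(member)
                try:
                    if member.name.endswith(".gz"):
                        # Decompress the whole member to trigger CRC/format checks.
                        with gzip.open(fh) as gz:
                            while gz.read(1 << 20):
                                pass
                    else:
                        while fh.read(1 << 20):
                            pass
                    ok.append(member.name)
                except (OSError, EOFError, zlib.error) as exc:
                    failed.append((member.name, str(exc)))
    except tarfile.ReadError as exc:
        # The tar container itself could not be parsed.
        failed.append((str(path), f"tar error: {exc}"))
    return ok, failed


# Example use (assumes the archive is in the current directory):
# ok, failed = check_archive("GSE92332_RAW.tar")
# print(len(ok), "members read cleanly;", len(failed), "failed")
```

An empty `failed` list together with the expected number of members would suggest the download is intact. A non-empty list points at the specific files that are damaged.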

sudoyang commented 2 years ago

I eventually found this link and used Firefox to download the file. That works, but the FTP route still does not. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92332
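A scriptable counterpart to the browser download could look like the sketch below. It assumes that NCBI also serves the same path over HTTPS (the `https://ftp.ncbi.nlm.nih.gov/...` URL in the comment is an assumption, not something confirmed in this thread), and the helper name `download` is hypothetical. The function streams the file to disk and compares the number of bytes written with the `Content-Length` header when the server provides one, so a truncated transfer is reported instead of silently producing a short file.

```python
import urllib.request


def download(url, dest, chunk_size=1 << 20):
    """Stream url to dest and return the number of bytes written.

    Raises IOError if the server announced a Content-Length that does
    not match the number of bytes actually received.
    """
    written = 0
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        expected = resp.headers.get("Content-Length")
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    if expected is not None and written != int(expected):
        raise IOError(f"expected {expected} bytes, got {written}")
    return written


# Example use (URL is an assumed HTTPS equivalent of the FTP link above):
# download(
#     "https://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_RAW.tar",
#     "GSE92332_RAW.tar",
# )
```

A size check only catches truncated transfers. It does not detect bytes altered in transit, so decompressing the members afterwards is still a useful second check.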

LuckyMD commented 2 years ago

Hi @sudoyang,

Thanks for reporting this, but it may be better reported to GEO. The FTP link you used is the official link for downloading the data from the database. I think I might have deleted my local copy of the data to save space (I thought it would be easy to get back ^^). Glad you found a workaround.