czbiohub-sf / tabula-muris

Code and annotations for the Tabula Muris single-cell transcriptomic dataset.
https://www.nature.com/articles/s41586-018-0590-4
BSD 3-Clause "New" or "Revised" License

Sample 10X_P7_7 causing cellranger error #231

Closed by gouinK 3 years ago

gouinK commented 3 years ago

Hi all, I have processed several of your samples with cellranger using your FASTQ files from AWS without any issues. However, sample 10X_P7_7 makes cellranger fail with this error:

Log message: IO error in FASTQ file '"/common/gouink/Tabula_Muris_data/rawdata/10X_P7_7_S8_L001_R2_001.fastq.gz"', line: 373246356: unexpected end of file

I have tried deleting the FASTQ files and re-downloading them, but I get the same error. I have also tried both cellranger v4 and cellranger v5, and the error occurs with each. Perhaps this FASTQ file was not uploaded completely to AWS and is therefore shorter than its mate?
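One way to test the truncation hypothesis on the local copy is to ask gzip to verify the archive, since an incomplete download usually fails that check. This is only a sketch: `check_gzip` is a hypothetical helper name, not part of cellranger or this repository.

```shell
# Hypothetical helper: report whether a gzip file decompresses cleanly.
# `gzip -t` reads the whole archive and exits non-zero if the stream
# ends early or is otherwise corrupt.
check_gzip() {
    file="$1"
    if gzip -t "$file" 2>/dev/null; then
        echo "OK: $file"
    else
        echo "corrupt or truncated: $file"
        return 1
    fi
}
```

For example, `check_gzip 10X_P7_7_S8_L001_R2_001.fastq.gz` should print an `OK:` line if the file is intact. If it fails, that points to the local copy rather than to cellranger.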

aopisco commented 3 years ago

@jamestwebber any insights here?

jamestwebber commented 3 years ago

I don't think there's anything wrong with the file on S3. Can you run these commands on the downloaded file to make sure we have the same thing?

$ ls -l 10X_P7_7_S8_L001_R2_001.fastq.gz
-rw-r--r-- 1 james czb 5735445924 Aug 14  2019 10X_P7_7_S8_L001_R2_001.fastq.gz

$ sha256sum 10X_P7_7_S8_L001_R2_001.fastq.gz 
2578d430bcfa909bc6495b122e32af152badb30e962fbdb176108478f33841fd 10X_P7_7_S8_L001_R2_001.fastq.gz

$ zcat 10X_P7_7_S8_L001_R2_001.fastq.gz | wc -l
434910224

The number of lines matches the R1 file, so I think the copies on S3 are okay. Is it possible that your connection fails while downloading the R2 file because it is so large?
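The reported error stops at line 373246356, well short of the 434910224 lines counted above, which would be consistent with a local copy that was cut off during download. The size and checksum comparison could be scripted roughly as follows. `verify_download` is a hypothetical helper name, and the sketch assumes GNU `sha256sum` is available, as in the transcript above.

```shell
# Hypothetical helper: compare a file's byte size and SHA-256 digest
# against expected values. Usage: verify_download FILE SIZE SHA256
verify_download() {
    file="$1"
    want_size="$2"
    want_sha="$3"

    # Check the size first; it is cheap and catches truncated downloads.
    got_size=$(wc -c < "$file" | tr -d ' ')
    if [ "$got_size" != "$want_size" ]; then
        echo "size mismatch: got $got_size bytes, expected $want_size"
        return 1
    fi

    # Same size, so compare the full digest.
    got_sha=$(sha256sum "$file" | awk '{print $1}')
    if [ "$got_sha" != "$want_sha" ]; then
        echo "checksum mismatch: got $got_sha"
        return 1
    fi

    echo "size and checksum match: $file"
}
```

With the values from the listing above, the call would be `verify_download 10X_P7_7_S8_L001_R2_001.fastq.gz 5735445924 2578d430bcfa909bc6495b122e32af152badb30e962fbdb176108478f33841fd`. A size mismatch would suggest the download was interrupted and should be retried.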

Portulaca666 commented 3 years ago

So, how can this be solved?

denvercal1234GitHub commented 2 years ago

Hi @aopisco, have you solved or identified the issue with this file? Thank you in advance!