FredHutch / Galeano-Nino-Bullman-Intratumoral-Microbiota_2022

Analysis code used in Galeano Nino et al., Impact of Intratumoral Microbiota on Spatial and Cellular Heterogeneity in human cancer. 2022
MIT License

FASTQ header mismatch detected #23

Open StackChan opened 2 weeks ago

StackChan commented 2 weeks ago
6 | SRR21422687 | 90 | 4.50 G | 1.17 Gb | SRX17427003 | CRC_16_S1_L001_R2_001
The reads look like this:
>gnl|SRA|SRR21422687.1.1VH00699:3:AAAHK77HV:1:2412:56765:48347 Biological (Biological)
AAGCAGTGGTATCAACGCAGAGTACATGGGGACAGACCCGGAGAGCACCGCGAGGGCGGAGCTGCGTTCTCCTCTGCACAGATTTCGGTG

8 | SRR21422689 | 28 | 1.40 G | 598.12 Mb | SRX17427001 | CRC_16_S1_L001_R1_001
The reads look like this:
>gnl|SRA|SRR21422689.1.1VH00699:3:AAAHK77HV:1:2412:56765:48347 Biological (Biological)
ATGCCATTTGCGACCAGAGAGGTGAATT

So when you run cellranger or spaceranger, you will hit a "FASTQ header mismatch detected" error, because the R1 and R2 headers carry different run accessions (SRR21422689 vs. SRR21422687). Both files should use the same header, like:

>gnl|SRA|SRR21422687.1.1VH00699:3:AAAHK77HV:1:2412:56765:48347 Biological (Biological)

It really sucks, and I guess it happens because the authors split one big SRA submission, such as CRC_16's, into 8 SRA runs and the FASTQ headers were renamed in the process. Not many people actually try to reproduce the paper and report their issues here. Issue #15 describes the same problem.
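One possible workaround for the mismatch described above is to rewrite the headers of the R1 and R2 files so that they no longer carry different run accessions. The following is only a minimal sketch under stated assumptions, not a confirmed fix: the helper names `normalize_header` and `rewrite_fastq` and the shared label `CRC_16` are hypothetical, the files are assumed to be standard 4-line FASTQ records, and whether cellranger or spaceranger accepts the result has not been verified here.

```python
import re

# Matches an SRA run accession such as SRR21422687 (ERR/DRR prefixes too).
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def normalize_header(header: str, shared_id: str) -> str:
    """Replace the first run accession in a header with a shared label.

    `shared_id` is a hypothetical label chosen by the user; it only needs
    to be identical for the R1 and R2 files of the same sample.
    """
    return RUN_ACCESSION.sub(shared_id, header, count=1)


def rewrite_fastq(lines, shared_id):
    """Yield FASTQ lines, rewriting only the header of each record.

    Assumes strict 4-line records, so the header is every 4th line
    starting at index 0. Sequence, '+', and quality lines pass through.
    """
    for i, line in enumerate(lines):
        yield normalize_header(line, shared_id) if i % 4 == 0 else line


# The two headers quoted in this issue become identical after rewriting.
r2 = ">gnl|SRA|SRR21422687.1.1VH00699:3:AAAHK77HV:1:2412:56765:48347 Biological (Biological)"
r1 = ">gnl|SRA|SRR21422689.1.1VH00699:3:AAAHK77HV:1:2412:56765:48347 Biological (Biological)"
same = normalize_header(r1, "CRC_16") == normalize_header(r2, "CRC_16")
```

To apply this to real files, one could open each FASTQ with `gzip.open(path, "rt")`, pass the file object to `rewrite_fastq`, and write the yielded lines to a new file, using the same `shared_id` for the R1 and R2 files of a sample.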