FredHutch / Galeano-Nino-Bullman-Intratumoral-Microbiota_2022

Analysis code used in Galeano Nino et al., Impact of Intratumoral Microbiota on Spatial and Cellular Heterogeneity in human cancer. 2022
MIT License

Downloaded INVADEseq data does not work with cellranger #15

Closed oyxf closed 1 year ago

oyxf commented 1 year ago

Dear Hanrui, @hanruiw

Thank you very much for your reply and help. When I set --chemistry=SC5P-PE as you suggested, the following error appeared:

Martian Runtime - v4.0.6
Serving UI at http://k8s-scrna-6fcd99f6c8-bpmkh:41082?auth=9y-bj9FJXHA8L85mRe9rgLJ61tYsoY-4Wvs_5s042uQ

Running preflight checks (please wait)...
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path (/data/database/cellranger-refdata/refdata-gex-GRCh38-2020-A) on k8s-scrna-6fcd99f6c8-bpmkh...
Checking optional arguments...
mrc: v4.0.6

mrp: v4.0.6

Anaconda: Python 3.8.2

numpy: 1.19.2

scipy: 1.6.2

pysam: 0.16.0.1

h5py: 3.2.1

pandas: 1.2.4

STAR: 2.7.2a

samtools: samtools 1.10
Using htslib 1.10.2
Copyright (C) 2019 Genome Research Ltd.

2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS
2023-09-14 05:52:46 [runtime] (run:local)       ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS.fork0.chnk0.main
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG
2023-09-14 05:52:46 [runtime] (run:local)       ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG.fork0.chnk0.main
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.SANITIZE_MAP_CALLS
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.WRITE_GENE_INDEX
2023-09-14 05:52:46 [runtime] (run:local)       ID.OSCC_12.SC_RNA_COUNTER_CS.WRITE_GENE_INDEX.fork0.chnk0.main
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.FULL_COUNT_INPUTS.WRITE_GENE_INDEX
2023-09-14 05:52:46 [runtime] (run:local)       ID.OSCC_12.SC_RNA_COUNTER_CS.FULL_COUNT_INPUTS.WRITE_GENE_INDEX.fork0.chnk0.main
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY
2023-09-14 05:52:46 [runtime] (run:local)       ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY.fork0.chnk0.main
2023-09-14 05:52:46 [runtime] (chunks_complete) ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG
2023-09-14 05:52:46 [runtime] (chunks_complete) ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MAKE_SHARD
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MAKE_SHARD
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.BARCODE_CORRECTION
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.RUST_BRIDGE
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.BARCODE_CORRECTION
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.RUST_BRIDGE
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.ASSEMBLE_VDJ
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.ASSEMBLE_VDJ
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MERGE_METRICS
2023-09-14 05:52:46 [runtime] (ready)           ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MERGE_METRICS
2023-09-14 05:52:47 [runtime] (failed)          ID.OSCC_12.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY

[error] Pipestance failed. Error log at:
OSCC_12/SC_RNA_COUNTER_CS/SC_MULTI_CORE/MULTI_CHEMISTRY_DETECTOR/_GEM_WELL_CHEMISTRY_DETECTOR/DETECT_COUNT_CHEMISTRY/fork0/chnk0-u19f5029fae/_errors

Log message:
You selected chemistry SC5P-PE, which expects the cell barcode sequence in read1.
In the input data, an extremely low rate of correct barcodes was observed for this chemistry (0.0%).
Please check your input data and chemistry selection. Note: manual chemistry detection is not required in most cases.
Input: Sample OSCC_12 in "/oeK8S/test-INVADEseq-demo/raw_data/OSCC_12/OSCC_12_fastqs"

Waiting 6 seconds for UI to do final refresh.
Pipestance failed. Use --noexit option to keep UI running after failure.

2023-09-14 05:52:53 Shutting down

I used this command to convert the OSCC_12 SRA file to FASTQ:

 /data/software/sratoolkit/sratoolkit-v2.10.2/bin/fastq-dump --split-files   SRR21429799

OSCC_12_fastqs:

ls -alt  ../../raw_data/OSCC_12/OSCC_12_fastqs 
lrwxrwxrwx 1 1003 1094 25 Sep  7 04:25 OSCC_12_S2_L001_R2_001.fastq.gz -> ../SRR21429799_2.fastq.gz
lrwxrwxrwx 1 1003 1094 25 Sep  7 04:25 OSCC_12_S2_L001_R1_001.fastq.gz -> ../SRR21429799_1.fastq.gz

And I ran the cellranger command as follows:

 /data/software/cellranger/cellranger-6.1.1/bin/cellranger count   \
                 --id=OSCC_12  \
                 --transcriptome=/data/database/cellranger-refdata/refdata-gex-GRCh38-2020-A  \
                 --fastqs=../../raw_data/OSCC_12/OSCC_12_fastqs  \
                 --localcores=10  \
                 --localmem=64    \
                 --description=OSCC_12_human  \
                 --include-introns=true   \
                 --chemistry=SC5P-PE

Thanks in advance, oyxf
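
The error above says Cell Ranger looked for the cell barcode in read 1 and found almost none, so one thing worth checking is which downloaded file actually holds which read. A minimal sketch, assuming standard 4-line FASTQ records: the helper below (`fastq_read_lengths` is a hypothetical name, not part of Cell Ranger or the SRA Toolkit) tabulates read lengths over the first N records of a stream. Read length alone does not prove which file is R1, R2, or an index read, but an unexpected length distribution is a quick hint that the files are mislabelled or incomplete.

```shell
# Summarise read lengths in the first N records of a FASTQ stream (stdin).
# Output: one "length<TAB>count" line per distinct length, sorted by length.
# Assumes 4-line FASTQ records (header, sequence, "+", quality).
fastq_read_lengths() {
    n="${1:-10000}"
    awk -v n="$n" '
        # Line 2 of every record is the sequence; count its length.
        NR % 4 == 2 { count[length($0)]++; if (++seen >= n) exit }
        END { for (len in count) printf "%d\t%d\n", len, count[len] }
    ' | sort -n
}

# Hypothetical usage on the files produced by fastq-dump --split-files:
#   gzip -dc SRR21429799_1.fastq.gz | fastq_read_lengths 10000
#   gzip -dc SRR21429799_2.fastq.gz | fastq_read_lengths 10000
```

If the file linked as R1 shows lengths that cannot contain the barcode, the symlinks in the FASTQ folder are probably pointing at the wrong reads.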

hanruiw commented 1 year ago

Hi oyxf,

For the OSCC_12 INVADEseq data (single-cell data with enriched bacterial sequences), could you please try:

 curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR181/079/SRR18185479/SRR18185479.fastq.gz -o OSCC_12_S5_L001_R2_001.fastq.gz
 curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR181/080/SRR18185480/SRR18185480.fastq.gz -o OSCC_12_S5_L001_R1_001.fastq.gz
 curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR181/081/SRR18185481/SRR18185481.fastq.gz -o OSCC_12_S5_L001_I1_001.fastq.gz

Hanrui

hanruiw commented 1 year ago

Hi oyxf,

I'm going to close this issue, please feel free to reopen it if you have any other questions!

Hanrui

oyxf commented 11 months ago

Sorry for the late reply. The new data works well. Thank you very much!

elinorsa commented 7 months ago

Dear Hanrui, @hanruiw I encountered the same issue as oyxf. I tried running cellranger on the files downloaded in the way you suggested, and I get the following error:

[error] Pipestance failed. Error log at:
OSCC_12/SC_RNA_COUNTER_CS/SC_MULTI_CORE/MULTI_CHEMISTRY_DETECTOR/DETECT_COUNT_CHEMISTRY/fork0/chnk0-uebcbc38319/_errors

Log message:
FASTQ header mismatch detected at line 4 of input files "/data/PRJNA811533_INVADEseq/OSCC12/OSCC_12_S5_L001_R1_001.fastq.gz" and "/data/PRJNA811533_INVADEseq/OSCC12/OSCC_12_S5_L001_R2_001.fastq.gz": file: "/data/PRJNA811533_INVADEseq/OSCC12/OSCC_12_S5_L001_R1_001.fastq.gz", line: 4

I ran the following cellranger command:

cellranger count --id=OSCC_12 \
                 --transcriptome=/cellranger/refdata-gex-GRCh38-2020-A \
                 --fastqs=/data/PRJNA811533_INVADEseq/OSCC12 \
                 --sample=OSCC_12\
                 --include-introns true \
                 --localcores=8 \
                 --localmem=62\
                 --chemistry=SC5P-PE

Thanks in advance, Elinor

hanruiw commented 7 months ago

Dear Elinor,

Thank you for your interest in our paper. I think this might be because the OSCC_12 INVADEseq data is associated with three SRR IDs (SRR18185479, SRR18185480, SRR18185481), so the read names in the read files do not match each other. Could you please try to match the read names across the files and rerun CellRanger?

Best, Hanrui
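
One possible way to match the read names, offered as a sketch rather than the authors' own procedure: rewrite every header as a shared prefix plus the record number, so that record k carries the same name in each file. This is only valid if the files contain the same reads in the same order, which is an assumption here; comparing record counts across the files first is a sensible check. The function name `rename_fastq_headers` and the `renamed/` output directory are hypothetical.

```shell
# Rewrite FASTQ headers (stdin -> stdout) as "@<prefix>.<record number>".
# Assumes standard 4-line FASTQ records. Sequence and quality lines are
# passed through unchanged; the separator line is reset to a bare "+".
rename_fastq_headers() {
    prefix="${1:-read}"
    awk -v prefix="$prefix" '
        NR % 4 == 1 { printf "@%s.%d\n", prefix, (NR + 3) / 4; next }
        NR % 4 == 3 { print "+"; next }
        { print }
    '
}

# Hypothetical usage for the three OSCC_12 files (output directory assumed):
#   mkdir -p renamed
#   for r in R1 R2 I1; do
#       gzip -dc "OSCC_12_S5_L001_${r}_001.fastq.gz" \
#           | rename_fastq_headers OSCC_12 \
#           | gzip > "renamed/OSCC_12_S5_L001_${r}_001.fastq.gz"
#   done
```

Numbering by position is used because the three runs carry different SRR accessions in their original names, so there is no shared identifier to keep; if the record counts differ between files, positional renaming would pair the wrong reads and should not be used.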