minoda-lab / universc

UniverSC: a flexible cross-platform single-cell data processing pipeline
https://genomec.gsc.riken.jp/gerg/UniverSC/UniverSC_app_release/
GNU General Public License v3.0

Error in BD Rhapsody analysis #20

Open g0656116 opened 6 months ago

g0656116 commented 6 months ago

First of all, thank you for your quick response to the last issue. As you suggested, I installed cellranger version 3.0.2 and confirmed that universc runs well with 10x data. We then ran the BD Rhapsody data that we originally intended to analyze. However, an error message was displayed, and I would like to ask for help. We have single-cell RNA-seq data from several different sources, so we would like to set up this program to analyze all of it.

Please give us feedback and we will make changes quickly! Thank you for your help.

universc

song@Song:~/Downloads/universc$ bash launch_universc.sh --id "MJ" --technology "bd-rhapsody" --reference "test/cellranger_reference/cellranger-tiny-ref/3.0.0" --read1 "/home/song/Raw_data/NGS230926_S1_L001_R1_001" --read2 "/home/song/Raw_data/NGS230926_S1_L001_R2_001" --localcores 16 --localmem 64

log

script running in /home/song/Downloads/universc/launch_universc.sh...
... script called from /home/song/Downloads/universc
'.' resolves to '/home/song/Downloads/universc'
Running launch_universc.sh in '/home/song/Downloads/universc'
UniverSC Copyright (C) 2019 Tom Kelly; Kai Battenberg
This program comes with ABSOLUTELY NO WARRANTY; for details type 'cat LICENSE'.
This is free software, and you are welcome to redistribute it under certain conditions; type 'cat LICENSE' for details.
Cell Ranger is called as third-party dependency and is not maintained by this project.
Please ensure you comply with the End User License Agreement for all software installed where applicable; for details type 'cat README.md'.
/home/song/Raw_data/NGS230926_S1_L001_R1_001.fastq file found
/home/song/Raw_data/NGS230926_S1_L001_R2_001.fastq file found
Using 10x version 2 chemistry to support UMIs
WARNING: conversion was turned on because directory input4cellranger_MJ was not found
checking if UniverSC is running already
creating .lock file

Input information

SETUP and exit: false
FORMAT: bd-rhapsody
BARCODES: /home/song/Downloads/universc/whitelists/bd_rhapsody_barcode.txt
INPUT(R1): /home/song/Raw_data/NGS230926_S1_L001_R1_001.fastq
INPUT(R2): /home/song/Raw_data/NGS230926_S1_L001_R2_001.fastq
SAMPLE: NGS230926
LANE: 1
ID: MJ
DESCRIPTION: MJ
WARNING: no description given, setting to ID value
REFERENCE: test/cellranger_reference/cellranger-tiny-ref/3.0.0
NCELLS: (no cell number given)
CHEMISTRY: SC3Pv2
JOBMODE: local
WARNING: --jobmode "sge" is recommended if running script with qsub
CONVERSION: true
##########

whitelist setup begin
updating barcodes in /home/song/Downloads/cellranger-3.0.2/cellranger-cs/3.0.2/lib/python/cellranger/barcodes for Cell Ranger version 3.0.2 installed in /home/song/Downloads/cellranger-3.0.2/cellranger ...
restoring Cell Ranger
sed: can't read /home/song/Downloads/cellranger-3.0.2/cellranger-cs/3.0.2/lib/python/cellranger/check.py: No such file or directory
/home/song/Downloads/cellranger-3.0.2/cellranger set for bd-rhapsody
converting whitelist
barcode adjust: 0
whitelist converted
verbose setup complete
running in local mode (no cluster configuration needed)
creating a folder for all Cell Ranger input files ...
directory input4cellranger_MJ created for converted files
moving file to new location
handling /home/song/Raw_data/NGS230926_S1_L001_R1_001.fastq ...
handling /home/song/Raw_data/NGS230926_S1_L001_R2_001.fastq ...
converting input files to confer cellranger format ...
adjustment parameters:
barcodes: 0 bp at its head
UMIs: -2 bp at its tail
making technology-specific modifications ...
... remove adapter and phase blocks for bd-rhapsody
... remove adapter and phase blocks for bd-rhapsody
adjusting barcodes of R1 files
adjusting UMIs of R1 files
handling input4cellranger_MJ/NGS230926_S1_L001_R1_001.fastq ...
input4cellranger_MJ/NGS230926_S1_L001_R1_001.fastq adjusted
running Cell Ranger ...

Cell Ranger command

cellranger count --id=MJ \
    --fastqs=input4cellranger_MJ \
    --lanes=1 \
    --r1-length=37 \
    --chemistry=SC3Pv2 \
    --transcriptome=test/cellranger_reference/cellranger-tiny-ref/3.0.0 \
    --sample=NGS230926 \
    --description=MJ \
    --jobmode=local \
    --localcores=16 \
    --localmem=64

##########
/home/song/Downloads/cellranger-3.0.2/cellranger-cs/3.0.2/bin
cellranger count (3.0.2)
Copyright (c) 2019 10x Genomics, Inc. All rights reserved.

Martian Runtime - '3.0.2-v3.2.0'
Serving UI at http://Song:45413?auth=9eYoZpceP9eIRzDEKnN6paxvau07PuEqfzYXPETJqp4

Running preflight checks (please wait)...
2023-12-27 04:45:20 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF
2023-12-27 04:45:20 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF.fork0.chnk0.main
2023-12-27 04:45:21 [runtime] (chunks_complete) ID.MJ.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path (/home/song/Downloads/universc/test/cellranger_reference/cellranger-tiny-ref/3.0.0) on Song...
Checking chemistry...
Checking read 1 length...
Checking optional arguments...
mrc: '3.0.2-v3.2.0'

mrp: '3.0.2-v3.2.0'

Anaconda: Python 2.7.14 :: Anaconda, Inc.

numpy: 1.14.2

scipy: 1.0.1

pysam: 0.14.1

h5py: 2.8.0

pandas: 0.22.0

STAR: STAR_2.5.1b

samtools: samtools 1.7 Using htslib 1.7 Copyright (C) 2018 Genome Research Ltd.

2023-12-27 04:45:22 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY
2023-12-27 04:45:22 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY.fork0.chnk0.main
2023-12-27 04:45:22 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION
2023-12-27 04:45:22 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION.fork0.chnk0.main
2023-12-27 04:45:22 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES
2023-12-27 04:45:22 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES.fork0.chnk0.main
2023-12-27 04:45:22 [runtime] (chunks_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION
2023-12-27 04:45:22 [runtime] (chunks_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES
2023-12-27 04:45:23 [runtime] (chunks_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY
2023-12-27 04:45:23 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SETUP_CHUNKS
2023-12-27 04:45:23 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SETUP_CHUNKS.fork0.chnk0.main
2023-12-27 04:45:24 [runtime] (chunks_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SETUP_CHUNKS
2023-12-27 04:45:24 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHECK_BARCODES_COMPATIBILITY
2023-12-27 04:45:24 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHECK_BARCODES_COMPATIBILITY.fork0.split
2023-12-27 04:45:24 [runtime] (ready) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS
2023-12-27 04:45:24 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS.fork0.split
2023-12-27 04:45:24 [runtime] (split_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS
2023-12-27 04:45:24 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS.fork0.chnk0.main
2023-12-27 04:45:24 [runtime] (split_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHECK_BARCODES_COMPATIBILITY
2023-12-27 04:45:24 [runtime] (run:local) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHECK_BARCODES_COMPATIBILITY.fork0.join
2023-12-27 04:45:25 [runtime] (join_complete) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHECK_BARCODES_COMPATIBILITY
2023-12-27 04:51:26 [runtime] (update) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS.fork0 chunks_running
2023-12-27 04:51:43 [runtime] (failed) ID.MJ.SC_RNA_COUNTER_CS.SC_RNA_COUNTER._BASIC_SC_RNA_COUNTER.CHUNK_READS

[error] Pipestance failed. Error log at: MJ/SC_RNA_COUNTER_CS/SC_RNA_COUNTER/_BASIC_SC_RNA_COUNTER/CHUNK_READS/fork0/chnk0-ub5b28b2d54/_errors

Log message:
Traceback (most recent call last):
  File "/home/song/Downloads/cellranger-3.0.2/martian-cs/v3.2.0/adapters/python/martian_shell.py", line 590, in _main
    stage.main()
  File "/home/song/Downloads/cellranger-3.0.2/martian-cs/v3.2.0/adapters/python/martian_shell.py", line 555, in main
    self._run(lambda: self._module.main(args, outs))
  File "/home/song/Downloads/cellranger-3.0.2/martian-cs/v3.2.0/adapters/python/martian_shell.py", line 524, in _run
    cmd()
  File "/home/song/Downloads/cellranger-3.0.2/martian-cs/v3.2.0/adapters/python/martian_shell.py", line 555, in <lambda>
    self._run(lambda: self._module.main(args, outs))
  File "/home/song/Downloads/cellranger-3.0.2/cellranger-cs/3.0.2/mro/stages/common/chunk_reads/__init__.py", line 53, in main
    tk_subproc.check_call(chunk_reads_args)
  File "/home/song/Downloads/cellranger-3.0.2/cellranger-cs/3.0.2/tenkit/lib/python/tenkit/log_subprocess.py", line 37, in check_call
    return subprocess.check_call(*args, **kwargs)
  File "/home/song/Downloads/cellranger-3.0.2/miniconda-cr-cs/4.3.21-miniconda-cr-cs-c10/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['chunk_reads', '--reads-per-fastq', '5000000', '/home/song/Downloads/universc/MJ/SC_RNA_COUNTER_CS/SC_RNA_COUNTER/_BASIC_SC_RNA_COUNTER/CHUNK_READS/fork0/chnk0-ub5b28b2d54/files/', 'fastq_chunk', '--martian-args', 'chunk_args.json', '--compress', 'lz4']' returned non-zero exit status 1

Waiting 6 seconds for UI to do final refresh. Pipestance failed. Use --noexit option to keep UI running after failure.

2023-12-27 04:51:49 Shutting down.
Saving pipestance info to MJ/MJ.mri.tgz
For assistance, upload this file to 10x Genomics by running:

cellranger upload MJ/MJ.mri.tgz

cellranger run complete
***Notice: Cloupe file cannot be computed for bd-rhapsody
Cloupe files generated by this pipeline are corrupt and cannot be read by the 10x Genomics Loupe Browser.
We do not provide support for Cloupe files as this requires software from 10x Genomics subject to their End User License Agreement.
Cloupe files are disabled in compliance with this.
updating .lock file
no other jobs currently run by cellranger 3.0.2 in /home/song/Downloads/cellranger-3.0.2/cellranger
no conflicts: whitelist can now be changed for other technologies
replacing modified barcodes with the original in the output gene barcode matrix
gzip: MJ/outs/raw_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
gzip: MJ/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
sh: 1: cannot create MJ/outs/raw_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
sh: 1: cannot create MJ/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
barcodes recovered

Conversion tool log

cellranger 3.0.2

Original barcode format: bd-rhapsody (then converted to 10x)

cellranger runtime: 389s
##########

Cellranger PATH

/home/song/Downloads/cellranger-3.0.2

UniverSC PATH

/home/song/Downloads/universc

fastq DATA PATH

/home/song/Raw_data

After running universc, the same fastq data was created in /home/song/Downloads/universc/input4cellranger_MJ.

kbattenb commented 6 months ago

Hi,

Offhand, this seems like a cellranger problem rather than a UniverSC problem. Personally, I have had problems installing cellranger 3.0.2. This may be a silly question, but have you tried running cellranger 3.0.2 on a small 10x dataset that you know works, to make sure that cellranger 3.0.2 is running correctly?

g0656116 commented 6 months ago

Thank you for your reply! Yes, cellranger installed correctly. I ran it on 10x data and confirmed that the output files were produced as expected.

kbattenb commented 6 months ago

So based on what I see, it really looks like UniverSC is running properly (at least superficially) and it's an issue with running cellranger. I can think of two things to try.

  1. (Probably not the cause, but just in case) Can you look in input4cellranger_MJ and confirm that the barcodes in the R1 file are in fact valid barcodes? I would like to make sure that the 10x-like R1 file was converted correctly.
  2. (I am worried this might be the case) Memory issues. It seems you are running UniverSC locally. I have had UniverSC crash when I gave it 64G of RAM on a computer with 64G of RAM. In my case it ran when I gave it 50G of RAM (counterintuitive, I know). If that fails, and this might be a tall order, can you give it even more RAM if your computer has the capacity?
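
Both checks can be scripted. The following is only a rough sketch: the function names, the default of 10000 sampled reads, and the 80% memory fraction are illustrative assumptions and are not part of UniverSC or Cell Ranger. The first helper assumes a plain-text (uncompressed) FASTQ and a plain-text whitelist with one barcode per line, and that the barcode sits at the start of each R1 read. The second helper reads `MemTotal` from `/proc/meminfo` (Linux only) and suggests a `--localmem` value below the installed RAM.

```bash
#!/usr/bin/env bash
# Sketch only: names and defaults below are hypothetical helpers,
# not functions provided by UniverSC or Cell Ranger.

# Print "hits/total": how many of the first N reads in a plain-text FASTQ
# start with a barcode that is listed in the whitelist (one barcode per line).
barcode_match_rate() {
    local fastq="$1" whitelist="$2" bc_len="$3" n_reads="${4:-10000}"
    head -n "$((n_reads * 4))" "$fastq" |
        awk -v len="$bc_len" '
            NR == FNR { ok[$1] = 1; next }    # first input: the whitelist
            FNR % 4 == 2 {                    # second input: sequence lines
                total++
                if (substr($0, 1, len) in ok) hits++
            }
            END { printf "%d/%d\n", hits, total }
        ' "$whitelist" -
}

# Print a --localmem value (in GB) equal to a percentage of installed RAM.
# The 80% default is an assumption, chosen to roughly match "50G out of 64G".
suggest_localmem() {
    local meminfo="${1:-/proc/meminfo}" percent="${2:-80}"
    awk -v pct="$percent" '
        /^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 * pct / 100 }
    ' "$meminfo"
}
```

For example, `barcode_match_rate input4cellranger_MJ/NGS230926_S1_L001_R1_001.fastq WHITELIST BC_LEN` would report the match rate, where `WHITELIST` and `BC_LEN` are placeholders for the whitelist and barcode length that apply to the converted reads. A rate close to zero would point to a conversion problem rather than a memory problem.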
TomKellyGenetics commented 4 months ago

Sorry for the late response: BD Rhapsody parameters were updated in the last release. This issue can occasionally occur when importing FASTQ files that were not created correctly.

Please check the input files in the "input4cellranger_MJ" directory to ensure they are in the correct format (fastq or fastq.gz) and that the R1 and R2 files have the same number of lines. It would also be helpful to see the first 8-12 lines of each file to confirm that the sequence and quality scores still match after the adapters were removed.
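
A minimal sketch of these checks in shell, assuming standard four-line FASTQ records; the function names are hypothetical and not part of UniverSC. `count_reads` handles both plain and gzipped files, `check_pair` compares the read counts of R1 and R2, and `count_len_mismatches` counts records in a plain-text FASTQ whose quality string is not the same length as the sequence.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for inspecting converted FASTQ files.

# Print the number of 4-line records in a FASTQ file (.fastq or .fastq.gz).
count_reads() {
    local f="$1" lines
    case "$f" in
        *.gz) lines=$(gzip -cd "$f" | wc -l) ;;
        *)    lines=$(wc -l < "$f") ;;
    esac
    echo $((lines / 4))
}

# Compare read counts of R1 and R2; return non-zero if they differ.
check_pair() {
    local n1 n2
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 reads in both files"
    else
        echo "MISMATCH: R1 has $n1 reads, R2 has $n2 reads"
        return 1
    fi
}

# Count records (plain-text FASTQ) where sequence and quality lengths differ.
count_len_mismatches() {
    awk 'NR % 4 == 2 { n = length($0) }
         NR % 4 == 0 && length($0) != n { bad++ }
         END { print bad + 0 }' "$1"
}
```

With the files from this issue, that would be `check_pair input4cellranger_MJ/NGS230926_S1_L001_R1_001.fastq input4cellranger_MJ/NGS230926_S1_L001_R2_001.fastq`, and `head -n 12` on each file shows the first three records for a visual check.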

I have an updated version ready to release to support other technologies. It is possible this will resolve this problem but I cannot be certain. If you still have trouble running it please let us know!

TomKellyGenetics commented 4 months ago

Closes issues #18 and #19 (which appear to be the same problem)

TomKellyGenetics commented 4 months ago

It appears to be a mismatch between reads 1 and 2. I've tested on published data from #13. Read 1 is converted but read 2 is not.

MWE

$ ./software/sratoolkit.2.10.8-centos_linux64/bin/fastq-dump --origfmt --split-files SRR5185896
$ ./launch_universc.sh -R1 SRR5185896_sra_S1_L001_R1_001.fastq -R2 SRR5185896_sra_S1_L001_R2_001.fastq -t bd-rhapsody -r ./software/cellranger-3.0.2/cellranger-tiny-ref -i test-bd --verbose
$ ls --color=tty -tlhr input4cellranger_test-bd/SRR5185896_sra_S1_L001_*
-rw-rw-r-- 1 user group  40M Feb 25 12:23 input4cellranger_test-bd/SRR5185896_sra_S1_L001_R2_001.fastq
-rw-rw-r-- 1 user group 496K Feb 25 12:23 input4cellranger_test-bd/SRR5185896_sra_S1_L001_R1_001.fastq
$ wc input4cellranger_test-bd/SRR5185896_sra_S1_L001_*
   12383    12384   507904 input4cellranger_test-bd/SRR5185896_sra_S1_L001_R1_001.fastq
 1000000  1000000 41595258 input4cellranger_test-bd/SRR5185896_sra_S1_L001_R2_001.fastq
 1012383  1012384 42103162 total
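
Reading the `wc` output: R2 has 1000000 lines, i.e. 250000 four-line records, while R1 has 12383 lines, which is 4 × 3095 + 3 and therefore not a whole number of records. R1 also has one more word than it has lines, which suggests its last line is not newline-terminated, and its size of 507904 bytes is exactly 124 × 4096. Together these would be consistent with the R1 output being cut off mid-write, although that is a guess rather than a confirmed cause. A small sketch for flagging such files follows; `fastq_tail_status` is a hypothetical helper name and assumes a plain-text FASTQ.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helper, not part of UniverSC.

# Print "complete <records>" if the line count is a multiple of 4,
# otherwise "truncated <complete records> <leftover lines>" and return 1.
fastq_tail_status() {
    local lines rem
    lines=$(wc -l < "$1")
    rem=$((lines % 4))
    if [ "$rem" -eq 0 ]; then
        echo "complete $((lines / 4))"
    else
        echo "truncated $((lines / 4)) $rem"
        return 1
    fi
}
```

Given the line counts above, this would report the converted R1 file as truncated (3095 complete records plus 3 leftover lines) and the R2 file as complete.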

I'm still investigating the root cause of this error.