MGI-tech-bioinformatics / DNBelab_C_Series_HT_scRNA-analysis-software

An open-source and flexible pipeline to analyze high-throughput DNBelab C Series single-cell RNA datasets
MIT License

Unable to complete cDNA mapping! #49

Status: Closed (closed by shaidulberg 2 months ago)

shaidulberg commented 6 months ago

Hello, I'm having a problem with my run. Here is what I ran in my conda environment:

$conda activate /home/shaidu/miniconda3/envs/dnbc4tools
$cd /home/shaidu/Genomes/refdata-gex-mm10-2020-A
$dnbc4tools tools mkgtf --ingtf genes/genes.gtf --output genes.filter.gtf --type gene_type
$dnbc4tools rna mkref --ingtf genes.filter.gtf --fasta fasta/genome.fa --threads 30 --species Mus_musculus
$cd 
$dnbc4tools rna run \
    --cDNAfastq1 /home/shaidu/IBD_Project/V350157563_L01_1_1.fq.gz \
    --cDNAfastq2 /home/shaidu/IBD_Project/V350157563_L01_1_2.fq.gz \
    --oligofastq1 /home/shaidu/IBD_Project/V350157563_L01_5_1.fq.gz \
    --oligofastq2 /home/shaidu/IBD_Project/V350157563_L01_5_2.fq.gz \
    --genomeDir /home/shaidu/Genomes/refdata-gex-mm10-2020-A  \
    --name IBD --threads 10

This is the output:

The chemistry(darkreaction) automatically determined in cDNA : darkreaction
The chemistry(darkreaction) automatically determined in oligoR1 : darkreaction
The chemistry(darkreaction) automatically determined in oligoR2 : nodarkreaction

2024-01-01 10:27:59
Processing cDNA library barcodes and aligning.

2024-01-01 10:27:59
Processing oligo library barcodes.
2024-01-01 10:30:27,125 - data - ERROR
Command failed with exit code 1
2024-01-01 10:30:27,127 - data - ERROR
2024-1-1  10:28:0 ..... started STAR run
Jan 01 10:28:00 ..... loading genome
Jan 01 10:28:19 ..... started mapping
[E::bgzf_flush] File write failed (wrong size)
Error code: SAW-A10157
Error, bgzf_write error

Unable to complete cDNA mapping!
Traceback (most recent call last):
  File "/home/shaidu/miniconda3/envs/dnbc4tools/bin/dnbc4rna", line 8, in <module>
    sys.exit(main())
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/dnbc4rna.py", line 38, in main
    args.func(args)
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/data.py", line 73, in data
    Data(args).run()
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/data.py", line 43, in run
    process_libraries(self.outdir,
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/src/star_anno.py", line 392, in process_libraries
    raise Exception('Unable to complete cDNA mapping!')
Exception: Unable to complete cDNA mapping!
Traceback (most recent call last):
  File "/home/shaidu/miniconda3/envs/dnbc4tools/bin/dnbc4tools", line 8, in <module>
    sys.exit(main())
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/dnbc4tools.py", line 58, in main
    args.func(args)
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/run.py", line 105, in run
    Runpipe(args).runpipe()
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/run.py", line 92, in runpipe
    start_print_cmd(pipecmd,os.path.join(self.outdir,self.name))
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/tools/utils.py", line 38, in start_print_cmd
    subprocess.check_call(arg, shell=True)
  File "/home/shaidu/miniconda3/envs/dnbc4tools/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'dnbc4rna data --cDNAfastq1 /home/shaidu/IBD_Project/V350157563_L01_1_1.fq.gz --cDNAfastq2 /home/shaidu/IBD_Project/V350157563_L01_1_2.fq.gz --oligofastq1 /home/shaidu/IBD_Project/V350157563_L01_5_1.fq.gz --oligofastq2 /home/shaidu/IBD_Project/V350157563_L01_5_2.fq.gz --threads 10 --name IBD --chemistry auto --darkreaction auto --outdir /home/shaidu --genomeDir /home/shaidu/Genomes/refdata-gex-mm10-2020-A' returned non-zero exit status 1.

What is wrong with my cDNA? Thanks!

lishuangshuang0616 commented 6 months ago

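Could this be caused by insufficient storage space? The "[E::bgzf_flush] File write failed" message in your log usually means the output could not be written to disk. @shaidulberg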
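
One quick way to check this is to look at the free space on the filesystem that holds the output directory before re-running. The snippet below is only a sketch: free_kb is a hypothetical helper name, and OUTDIR is an assumed variable that should point at wherever the pipeline writes (the log above shows --outdir /home/shaidu). It reports the free space and does not try to guess how much the run needs.

```shell
# Hypothetical helper: print the free space (in KiB) on the filesystem
# that holds the given directory. "df -P" forces the portable one-line-per-
# filesystem format, so the "Available" column is field 4 on line 2
# (assuming the device name contains no spaces).
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Assumption: OUTDIR is the pipeline output directory; defaults to the
# current directory if it is not set.
outdir="${OUTDIR:-.}"

echo "Free space under $outdir: $(free_kb "$outdir") KiB"
```

For example, OUTDIR=/home/shaidu would check the location used in the failing run.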

shaidulberg commented 6 months ago

Thank you @lishuangshuang0616, I will free up space and try to run it again. Could it be because I have only one oligo fastq pair per cDNA library? In the quick start, under section 2.2, there seem to be two oligo fastqs per cDNA fastq:

2.2 RUN Running the main workflow

$dnbc4tools rna run \
    --cDNAfastq1 /test/data/test_cDNA_R1.fastq.gz \
    --cDNAfastq2 /test/data/test_cDNA_R2.fastq.gz \
    --oligofastq1 /test/data/test_oligo1_1.fq.gz,/test/data/test_oligo2_1.fq.gz \
    --oligofastq2 /test/data/test_oligo1_2.fq.gz,/test/data/test_oligo2_2.fq.gz \
    --genomeDir /database/scRNA/Mus_musculus/mm10  \
    --name test --threads 10

lishuangshuang0616 commented 6 months ago

That's just a demo showing how to pass multiple fastqs. Having a single oligo fastq pair doesn't affect the run.

shaidulberg commented 6 months ago

If I have the same sample sequenced on 4 lanes, do I run it like this:

$dnbc4tools rna run \
 --cDNAfastq1  V350157563_L01_1_1.fq.gz,V350157563_L02_1_1.fq.gz,V350157563_L03_1_1.fq.gz,V350157563_L04_1_1.fq.gz \
 --cDNAfastq2  V350157563_L01_1_2.fq.gz,V350157563_L02_1_2.fq.gz,V350157563_L03_1_2.fq.gz,V350157563_L04_1_2.fq.gz \
 --oligofastq1 V350157563_L01_5_1.fq.gz,V350157563_L02_5_1.fq.gz,V350157563_L03_5_1.fq.gz,V350157563_L04_5_1.fq.gz \
 --oligofastq2 V350157563_L01_5_2.fq.gz,V350157563_L02_5_2.fq.gz,V350157563_L03_5_2.fq.gz,V350157563_L04_5_2.fq.gz \
 --genomeDir refdata-gex-mm10-2020-A  \
 --name IBD \
 --threads 10

Or should I use the rna multi option?

lishuangshuang0616 commented 6 months ago

dnbc4tools rna run can only process data from a single library, which may span multiple lanes or include additional sequencing, so your comma-separated command is the right approach. dnbc4tools rna multi is just a convenience wrapper around dnbc4tools rna run for quickly processing multiple samples.
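
If typing the per-lane lists by hand is error-prone, they can be built from a glob. This is a minimal sketch, assuming the lane files sit in the current directory and follow the V350157563_L0N_... naming from the question; join_lanes is a hypothetical helper, not part of dnbc4tools. The command is only printed so it can be checked first.

```shell
# Hypothetical helper: join all arguments into one comma-separated string.
# The body runs in a subshell so the IFS change does not leak out.
join_lanes() (
    IFS=,
    printf '%s\n' "$*"
)

flowcell=V350157563   # flow cell prefix taken from the file names above

# Globs expand in sorted order, so all four lists stay lane-aligned
# (L01, L02, L03, L04 in every list).
cdna_r1=$(join_lanes "$flowcell"_L0?_1_1.fq.gz)
cdna_r2=$(join_lanes "$flowcell"_L0?_1_2.fq.gz)
oligo_r1=$(join_lanes "$flowcell"_L0?_5_1.fq.gz)
oligo_r2=$(join_lanes "$flowcell"_L0?_5_2.fq.gz)

# Dry run: print the command instead of executing it.
echo dnbc4tools rna run \
    --cDNAfastq1 "$cdna_r1" \
    --cDNAfastq2 "$cdna_r2" \
    --oligofastq1 "$oligo_r1" \
    --oligofastq2 "$oligo_r2" \
    --genomeDir refdata-gex-mm10-2020-A \
    --name IBD \
    --threads 10
```

Once the printed lists look right, remove the leading echo to launch the run.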