Open yjzhang1020 opened 2 months ago
Please post a screenshot of the contents of 01.data/beads_stat.txt and of the 01.data directory.
These are the first 10 lines of 01.data/beads_stat.txt.
This is the 01.data directory.
I have another question. 'oligofastq' is a mandatory parameter, but the public dataset only provides R1 (30bp) and R2 (100bp) files. To get the program to run, I passed the R1 file to both the 'oligofastq1' and 'oligofastq2' parameters. With this setup the program runs to completion, and the output matrix can be used for downstream analysis (only one sample failed). My question is: does the R1 library contain the oligo library information? Is it correct to run the program this way?
There are two pairs of sequences in the public data. The one with more data is the cDNA library, and the one with less data is the oligo library. They need to be distinguished.
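The "more data versus less data" rule above can be sketched as a small helper. This is only an illustration: it uses on-disk file size as a stand-in for the amount of data, and the function and label names are hypothetical, not part of dnbc4tools.

```python
import os
from typing import Dict, List


def total_size(paths: List[str]) -> int:
    """Sum of on-disk sizes (bytes) for one FASTQ pair."""
    return sum(os.path.getsize(p) for p in paths)


def guess_library_roles(pairs: Dict[str, List[str]]) -> Dict[str, str]:
    """Label the larger of two FASTQ pairs 'cDNA' and the smaller 'oligo'.

    Assumption: file size is a reasonable proxy for "more data".
    """
    if len(pairs) != 2:
        raise ValueError("expected exactly two FASTQ pairs")
    larger, smaller = sorted(
        pairs, key=lambda name: total_size(pairs[name]), reverse=True
    )
    if total_size(pairs[larger]) == total_size(pairs[smaller]):
        raise ValueError("pairs have equal size; cannot tell them apart")
    return {larger: "cDNA", smaller: "oligo"}
```

The result should be treated as a hint to confirm against the dataset's own metadata, not as proof of which library is which.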
Yes, I understand the need to distinguish them. For example, the public dataset only provides the HRR1445549_f1.fq.gz (30bp) file and the HRR1445549_r2.fq.gz (100bp) file. The program requires --cDNAfastq1 and --cDNAfastq2, which is straightforward: the corresponding inputs should be HRR1445549_f1.fq.gz (30bp) and HRR1445549_r2.fq.gz (100bp). However, the program also requires inputs for --oligofastq1 and --oligofastq2, and the public dataset does not provide any oligo FASTQ files. So I passed HRR1445549_f1.fq.gz to --oligofastq1 and the same HRR1445549_f1.fq.gz to --oligofastq2. The program runs normally, but is this the correct way to do it?
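A quick pre-flight check can make this kind of input reuse visible before a long run. The sketch below is not part of dnbc4tools; the dictionary keys simply mirror the four flag names, and the helper name is hypothetical.

```python
import os
from typing import Dict, List


def find_reused_inputs(params: Dict[str, str]) -> List[List[str]]:
    """Return groups of parameter names that point to the same file path."""
    by_path: Dict[str, List[str]] = {}
    for name, path in params.items():
        # abspath normalizes the string only; the file need not exist.
        by_path.setdefault(os.path.abspath(path), []).append(name)
    return [sorted(names) for names in by_path.values() if len(names) > 1]


params = {
    "cDNAfastq1": "HRR1445549_f1.fq.gz",
    "cDNAfastq2": "HRR1445549_r2.fq.gz",
    "oligofastq1": "HRR1445549_f1.fq.gz",
    "oligofastq2": "HRR1445549_f1.fq.gz",
}
print(find_reused_inputs(params))
# [['cDNAfastq1', 'oligofastq1', 'oligofastq2']]
```

Any non-empty result means one file is being supplied in more than one role, which is worth questioning even if the pipeline happens to finish.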
Is it a dataset problem? Is there a URL?
Please refer to this CNGB dataset: https://db.cngb.org/search/sample/?q=CNP0005575
My test data comes from: NGDC - GSA for Human. The data used is scRNA_pool5-8.
I don't think this data can be analyzed by dnbc4tools. You may want to refer to https://github.com/MGI-tech-bioinformatics/DNBelab_C_Series_scRNA-analysis-software instead.
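What I actually want to understand is this: given that --oligofastq is a required parameter, what role does it play in the pipeline? And how can I tell whether a dataset can be processed with this pipeline?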
MGI's commercial single-cell RNA reagents currently produce two libraries, cDNA and oligo. The oligo library is used to merge multiple magnetic beads that ended up in the same droplet. Any RNA dataset that dnbc4tools can analyze therefore has both libraries.
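To make the idea of merging beads via the oligo library concrete, here is a toy sketch. It is not dnbc4tools' actual algorithm: it assumes each bead is described by the set of oligo sequences observed with it, uses Jaccard similarity, and merges beads above an arbitrary threshold with a union-find structure. All names and the threshold value are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared oligos divided by all oligos seen on either bead."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_beads(
    bead_oligos: Dict[str, Set[str]], threshold: float = 0.4
) -> List[List[str]]:
    """Group beads whose oligo sets are similar enough to share a droplet."""
    parent = {bead: bead for bead in bead_oligos}

    def find(x: str) -> str:
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(bead_oligos, 2):
        if jaccard(bead_oligos[a], bead_oligos[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for bead in bead_oligos:
        groups.setdefault(find(bead), []).append(bead)
    return sorted(sorted(group) for group in groups.values())


beads = {
    "beadA": {"o1", "o2", "o3"},
    "beadB": {"o1", "o2", "o4"},
    "beadC": {"o9"},
}
print(merge_beads(beads))
# [['beadA', 'beadB'], ['beadC']]
```

The sketch also shows why the oligo input matters: without real oligo reads there is nothing meaningful to compare between beads.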
ref to this cngb data. https://db.cngb.org/search/sample/?q=CNP0005575
I checked the two sequencing datasets at your link, and neither contains an oligo FASTQ file. Or did I miss something? What should my command look like if I need to process this data?
$ dnbc4tools rna run \
    --cDNAfastq1 /path/to/E100062880_L01_11_1.fq.gz \
    --cDNAfastq2 /path/to/E100062880_L01_11_1.fq.gz \
    --oligofastq1 ? \
    --oligofastq2 ? \
    --genomeDir /database/scRNA/Mus_musculus/mm10 \
    --name test --threads 10
Unfortunately, I also could not find the corresponding oligo FASTQ files for the reference data the author pointed to. So I am still confused about which data this pipeline can process and which it cannot.
Barcode 3 is the cDNA FASTQ; barcode 11 is the oligo FASTQ.
Hello, I ran into the following error when running dnbc4tools rna run. The command I ran was:
dnbc4tools rna run \
    --name scRNA_pool6 \
    --cDNAfastq1 /home/data/t020559/DNBelab_test/PRJCA021248/st1_data/HRR1445549_f1.fq.gz \
    --cDNAfastq2 /home/data/t020559/DNBelab_test/PRJCA021248/st1_data/HRR1445549_r2.fq.gz \
    --oligofastq1 /home/data/t020559/DNBelab_test/PRJCA021248/st1_data/HRR1445549_f1.fq.gz \
    --oligofastq2 /home/data/t020559/DNBelab_test/PRJCA021248/st1_data/HRR1445549_f1.fq.gz \
    --genomeDir /home/data/t020559/ref/homo/homo_gencode_dnbc4_index \
    --threads 8
The error was:
2024-08-12 19:36:30 Calculating bead similarity and merging beads within the same droplet.
2024-08-12 19:36:31,535 - count - ERROR - Command failed with exit code 134
2024-08-12 19:36:31,536 - count - ERROR - similarity: main.c:369: create_index_array: Assertion `((infos)->n) > 0' failed.
Aborted (core dumped)
Traceback (most recent call last):
File "/home/data/t020559/miniconda3/envs/dnbc4tools/bin/dnbc4tools", line 8, in <module>
sys.exit(main())
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/dnbc4tools.py", line 110, in main
args.func(args)
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/count.py", line 184, in count
Count(args).run()
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/count.py", line 58, in run
logging_call(similiarBeads_cmd_str,'count',self.outdir)
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/tools/utils.py", line 128, in logging_call
raise e
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/tools/utils.py", line 123, in logging_call
output = subprocess.check_output(popenargs, shell=True, stderr=subprocess.STDOUT, universal_newlines=True)
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/software/similarity -n 8 scRNA_pool6 /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs/scRNA_pool6/01.data/CB_UB_count.txt /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs/scRNA_pool6/02.count/beads.barcodes.umi100.txt /home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/config/cellbarcode/oligo_type.txt /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs/scRNA_pool6/02.count/similarity.all.csv /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs/scRNA_pool6/02.count/similarity.droplet.csv /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs/scRNA_pool6/02.count/similarity.dropletfiltered.csv' returned non-zero exit status 134.
Traceback (most recent call last):
File "/home/data/t020559/miniconda3/envs/dnbc4tools/bin/dnbc4tools", line 8, in <module>
sys.exit(main())
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/dnbc4tools.py", line 110, in main
args.func(args)
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/run.py", line 144, in run
Runpipe(args).runpipe()
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/rna/run.py", line 131, in runpipe
start_print_cmd(pipecmd,os.path.join(self.outdir,self.name))
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/site-packages/dnbc4tools/tools/utils.py", line 138, in start_print_cmd
subprocess.check_call(arg, shell=True)
File "/home/data/t020559/miniconda3/envs/dnbc4tools/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/home/data/t020559/miniconda3/envs/dnbc4tools/bin/dnbc4tools rna count --name scRNA_pool6 --calling_method emptydrops --expectcells 3000 --threads 8 --outdir /home/data/t020559/DNBelab_test/PRJCA021248/st3_outs' returned non-zero exit status 1.
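What puzzles me is that I ran four samples at the same time and only this one failed; the other three produced results normally. Can you tell where the problem is?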
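The assertion `((infos)->n) > 0` suggests that some list the similarity step built had zero entries. One possible cause, unconfirmed here, is that one of its input files contained no usable records. A minimal sketch for checking the two text inputs named in the failing command (01.data/CB_UB_count.txt and 02.count/beads.barcodes.umi100.txt) follows; the helper names are hypothetical and the check only counts non-blank lines.

```python
import os
from typing import List


def count_records(path: str) -> int:
    """Number of non-blank lines in a text file; 0 if the file is missing."""
    if not os.path.isfile(path):
        return 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def empty_inputs(paths: List[str]) -> List[str]:
    """Return the subset of paths that are missing or have no records."""
    return [p for p in paths if count_records(p) == 0]
```

Running `empty_inputs` on those two files for the failing sample and for one of the three working samples would show whether the failing run differs at this point.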