singleron-RD / CeleScope

Single Cell Analysis Pipelines
https://www.singleron.bio/
MIT License
92 stars, 31 forks

BCR Analysis with multi_vdj_full_len Failing at Step 03.assemble #292

Open GGN1999 opened 3 months ago

GGN1999 commented 3 months ago

I attempted to analyze BCR sequencing data generated using the GEXSCOPE® Single-Cell Immune Receptor Library Construction Kit with the multi_vdj_full_len tool. However, I encountered an error during the 03.assemble step.

Error Message:

V(D)J Chain detection failed for Sample xxx-full_len_vdj in "xxx/02.convert".

Total Reads = 1000000 Reads mapped to TR = 4 Reads mapped to IG = 142

In order to distinguish between the TR and the IG chain the following conditions need to be satisfied:

Are there any specific adjustments or parameters I can use to bypass or correct this error? Or is there another way to obtain filtered_contig.fasta and filtered_contig_annotations.csv files that are compatible with cellranger vdj outputs for subsequent BCR analysis using Change-O?
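The counts in the error message above can be turned into mapping fractions to see how sparse the V(D)J signal is. The following is a minimal Python sketch, assuming the message keeps the `Total Reads = ... Reads mapped to TR = ... Reads mapped to IG = ...` layout quoted here. The function name `parse_chain_counts` is hypothetical, and since the conditions Cell Ranger checks are not shown in this thread, no thresholds are encoded.

```python
import re


def parse_chain_counts(message: str) -> dict:
    """Extract the read counts from a chain-detection error message."""
    pattern = re.compile(r"(Total Reads|Reads mapped to (?:TR|IG))\s*=\s*(\d+)")
    return {label: int(value) for label, value in pattern.findall(message)}


# Message text copied from the error quoted in this thread.
msg = "Total Reads = 1000000 Reads mapped to TR = 4 Reads mapped to IG = 142"
counts = parse_chain_counts(msg)
total = counts["Total Reads"]

for chain in ("TR", "IG"):
    mapped = counts[f"Reads mapped to {chain}"]
    print(f"{chain}: {mapped} of {total} reads ({mapped / total:.4%})")
```

With these numbers, well under 0.1% of the sampled reads map to either chain type, which suggests the input reads deserve a closer look before any flag is changed.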

Version: celescope 1.8.1 cellranger 6.1.2

Thank you!

Chenjunjie1996 commented 3 months ago

As the message shows, very few reads map to V(D)J genes, so you need to check the sequencing data before analyzing it. To get past this error on the command line, set the --chain parameter to either IG for BCRs or TR for TCRs.

solution

GGN1999 commented 3 months ago

As the message shows, very few reads map to V(D)J genes, so you need to check the sequencing data before analyzing it. To get past this error on the command line, set the --chain parameter to either IG for BCRs or TR for TCRs.

solution

  • cellranger 6.1.2 is fine; update celescope to 1.16.1.
  • Add --other_param " --chain=IG " to the shell script.
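To illustrate how the extra parameter ends up on the `cellranger vdj` command line, here is a minimal Python sketch that rebuilds a command of the same shape as the one printed in the running log later in this thread. It is not CeleScope's own code: `build_vdj_command` is a hypothetical helper, and every path below is a placeholder.

```python
import shlex


def build_vdj_command(soft_path, sample, ref_path, fqs_dir, thread, mem, other_param=""):
    """Assemble a `cellranger vdj` command string; extra flags are appended as given."""
    parts = [
        soft_path,
        "vdj",
        f"--id={sample}",
        f"--reference={ref_path}",
        f"--fastqs={fqs_dir}",
        f"--sample={sample}",
        f"--localcores={thread}",
        f"--localmem={mem}",
    ]
    # shlex.split drops the padding spaces in " --chain=IG " and keeps the flag itself.
    parts.extend(shlex.split(other_param))
    return " ".join(parts)


cmd = build_vdj_command(
    soft_path="/path/to/cellranger",   # placeholder
    sample="my_sample",                # placeholder
    ref_path="/path/to/vdj_reference", # placeholder
    fqs_dir="/path/to/02.convert",     # placeholder
    thread=16,
    mem=10,
    other_param=" --chain=IG ",
)
print(cmd)
```

The spaces inside the quotes of `" --chain=IG "` are harmless here; they most likely exist so that the value is not mistaken for an option of the wrapper script itself.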

Thank you very much for your reply. I updated celescope to 1.16.1 and used flv_CR to run cellranger vdj on my GEXSCOPE® BCR sequencing data, with --other_param " --chain=IG " added to the shell script. However, I again encountered a very similar error:

2024-08-29 15:14:08 [runtime] (failed) ID.PTCL_50_SHM-B-full_len_vdj.SC_VDJ_ASSEMBLER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.VDJ_CHEMISTRY_DETECTOR.DETECT_CHEMISTRY

[error] Pipestance failed. Error log at: PTCL_50_SHM-B-full_len_vdj/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_CHEMISTRY_DETECTOR/_GEM_WELL_CHEMISTRY_DETECTOR/VDJ_CHEMISTRY_DETECTOR/DETECT_CHEMISTRY/fork0/chnk0-u0e23d01fbf/_errors

Log message: There were not enough reads to auto detect the chemistry: Sample PTCL_50_SHM-B-full_len_vdj in "/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/PTCL_50_shihuiming/multi_flv_CR/shell/PTCL_50_SHM-B-full_len_vdj/02.convert" Note that you can avoid auto detection by specifying the specific chemistry type and version.

Any possible solution to this error?
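Because the log message complains about too few reads, one quick check is to count how many reads actually reached the converted FASTQ files. The following is a minimal Python sketch, assuming standard four-line FASTQ records; `count_fastq_reads` is a hypothetical helper, and the demo writes a tiny synthetic file rather than reading real output from 02.convert.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path: str) -> int:
    """Count records in a FASTQ file (gzipped or plain), assuming 4 lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Self-contained demo on a synthetic three-read file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_R2.fastq.gz")
    with gzip.open(demo, "wt") as out:
        for i in range(3):
            out.write(f"@read{i}\nACGT\n+\nFFFF\n")
    n_reads = count_fastq_reads(demo)
    print(n_reads)
```

Pointing the function at the FASTQ files in the 02.convert directory would show whether only a few thousand reads survived the barcode step.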

Chenjunjie1996 commented 3 months ago

This error might be caused by the chemistry auto-detection step together with an abnormal Valid Reads metric. Could you provide the whole running message, like the one in the following screenshot? [image]

GGN1999 commented 3 months ago

The whole running message:

2024-08-29 16:32:06,184 - celescope.tools.sample.sample - INFO - start...
Args: Namespace(subparser_assay='flv_CR', outdir='.//PTCL_50_SHM-B-full_len_vdj/00.sample', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, fq1='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz', chemistry='auto', func=<function sample at 0x7f81673cd1f0>)
2024-08-29 16:32:06,187 - celescope.tools.sample.run - INFO - start...
2024-08-29 16:32:06,408 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-08-29 16:32:06,409 - celescope.tools.barcode.check_chemistry - INFO - /dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz
2024-08-29 16:32:06,409 - celescope.tools.barcode.get_chemistry - INFO - start...
2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - [('scopeV2.2.1', 8872), ('scopeV2.1.1', 1873), ('flv_rna', 462), ('scopeV2.0.1', 4)]
2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - chemistry: scopeV2.2.1
2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - done. time used: 0:00:00.159119
2024-08-29 16:32:06,568 - celescope.tools.barcode.check_chemistry - INFO - done. time used: 0:00:00.159411
Sample ID: PTCL_50_SHM-B-full_len_vdj
Assay: flv_CR
Chemistry: scopeV2.2.1 (kit V1)
Software Version: 1.16.1
2024-08-29 16:32:06,574 - celescope.tools.sample.run - INFO - done. time used: 0:00:00.386621
2024-08-29 16:32:06,574 - celescope.tools.step._clean_up - INFO - start...
2024-08-29 16:32:06,575 - celescope.tools.step._render_html - INFO - start...
2024-08-29 16:32:07,049 - celescope.tools.step._render_html - INFO - done. time used: 0:00:00.474010
2024-08-29 16:32:07,049 - celescope.tools.step._clean_up - INFO - done. time used: 0:00:00.475155
2024-08-29 16:32:07,049 - celescope.tools.sample.sample - INFO - done. time used: 0:00:00.864817
2024-08-29 16:32:09,115 - celescope.tools.barcode.barcode - INFO - start...
Args: Namespace(subparser_assay='flv_CR', chemistry='auto', pattern=None, whitelist=None, linker=None, lowQual=0, lowNum=2, nopolyT=False, noLinker=False, filterNoPolyT=False, allowNoLinker=False, gzip=False, output_R1=False, fq1='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz', fq2='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_2.fq.gz', match_dir=None, stdout=False, outdir='.//PTCL_50_SHM-B-full_len_vdj/01.barcode', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, func=<function barcode at 0x7f69143a3ee0>)
2024-08-29 16:32:09,334 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-08-29 16:32:09,334 - celescope.tools.barcode.check_chemistry - INFO - /dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz
2024-08-29 16:32:09,334 - celescope.tools.barcode.get_chemistry - INFO - start...
2024-08-29 16:32:09,397 - celescope.tools.barcode.get_chemistry - INFO - [('scopeV2.2.1', 8872), ('scopeV2.1.1', 1873), ('flv_rna', 462), ('scopeV2.0.1', 4)]
2024-08-29 16:32:09,397 - celescope.tools.barcode.get_chemistry - INFO - chemistry: scopeV2.2.1
2024-08-29 16:32:09,397 - celescope.tools.barcode.get_chemistry - INFO - done. time used: 0:00:00.062198
2024-08-29 16:32:09,397 - celescope.tools.barcode.check_chemistry - INFO - done. time used: 0:00:00.062389
2024-08-29 16:32:09,401 - celescope.tools.barcode.run - INFO - start...
2024-08-29 16:40:48,945 - celescope.tools.barcode.run - INFO - /dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz finished.
2024-08-29 16:40:48,951 - celescope.tools.barcode.add_step_metrics - INFO - start...
Raw Reads: 42,425,066
Valid Reads: 4,474(0.01%)
Q30 of Barcodes: 96.19%
Q30 of UMIs: 96.81%
No PolyT Reads: 0(0.0%)
Low Quality Reads: 0(0.0%)
No Linker Reads: 2,712,399(6.39%)
No Barcode Reads: 39,708,193(93.6%)
Corrected Linker Reads: 3,041,237(7.17%)
Corrected Barcode Reads: 4,473(0.01%)
2024-08-29 16:40:48,952 - celescope.tools.barcode.add_step_metrics - INFO - done. time used: 0:00:00.000255
2024-08-29 16:40:48,953 - celescope.tools.barcode.run - INFO - done. time used: 0:08:39.550611
2024-08-29 16:40:48,953 - celescope.tools.step._clean_up - INFO - start...
2024-08-29 16:40:48,954 - celescope.tools.step._render_html - INFO - start...
2024-08-29 16:40:49,281 - celescope.tools.step._render_html - INFO - done. time used: 0:00:00.326443
2024-08-29 16:40:49,281 - celescope.tools.step._clean_up - INFO - done. time used: 0:00:00.327964
2024-08-29 16:40:49,281 - celescope.tools.barcode.barcode - INFO - done. time used: 0:08:40.165687
Args: Namespace(subparser_assay='flv_CR', soft_path='/dssg/home/acct-medzwli/medzwli_qyr/software/cellranger/cellranger-7.0.1/cellranger', tenX_chemistry='V2', outdir='.//PTCL_50_SHM-B-full_len_vdj/02.convert', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, fq2='.//PTCL_50_SHM-B-full_len_vdj/01.barcode/PTCL_50_SHM-B-full_len_vdj_2.fq', func=<function convert at 0x7fa08f6f19d0>)
2024-08-29 16:40:51,644 - celescope.flv_CR.convert.gen_sgr_tenX_dict - INFO - start...
2024-08-29 16:40:51,651 - celescope.flv_CR.convert.gen_sgr_tenX_dict - INFO - done. time used: 0:00:00.006230
2024-08-29 16:40:51,651 - celescope.flv_CR.convert.write_fq1 - INFO - start...
2024-08-29 16:40:51,712 - celescope.flv_CR.convert.write_fq1 - INFO - done. time used: 0:00:00.060868
2024-08-29 16:40:51,712 - celescope.flv_CR.convert.gzip_fq2 - INFO - start...
2024-08-29 16:40:51,751 - celescope.flv_CR.convert.gzip_fq2 - INFO - done. time used: 0:00:00.039630
2024-08-29 16:40:51,752 - celescope.flv_CR.convert.dump_tenX_sgr_barcode_json - INFO - start...
2024-08-29 16:40:51,752 - celescope.tools.utils.dump_dict_to_json - INFO - start...
2024-08-29 16:40:51,752 - celescope.tools.utils.dump_dict_to_json - INFO - done. time used: 0:00:00.000277
2024-08-29 16:40:51,752 - celescope.flv_CR.convert.dump_tenX_sgr_barcode_json - INFO - done. time used: 0:00:00.000371
2024-08-29 16:40:51,752 - celescope.tools.step._clean_up - INFO - start...
2024-08-29 16:40:51,752 - celescope.tools.step._render_html - INFO - start...
2024-08-29 16:40:52,087 - celescope.tools.step._render_html - INFO - done. time used: 0:00:00.334438
2024-08-29 16:40:52,087 - celescope.tools.step._clean_up - INFO - done. time used: 0:00:00.335095
Args: Namespace(subparser_assay='flv_CR', ref_path='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.0.0', soft_path='/dssg/home/acct-medzwli/medzwli_qyr/software/cellranger/cellranger-7.0.1/cellranger', other_param=' --chain=IG ', mem='10', seqtype='BCR', not_refine=False, coeff=1.5, outdir='.//PTCL_50_SHM-B-full_len_vdj/03.assemble', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, fqs_dir='.//PTCL_50_SHM-B-full_len_vdj/02.convert', func=<function assemble at 0x7fa27598d280>)
2024-08-29 16:40:54,183 - celescope.flv_CR.assemble.assemble - INFO - start...
2024-08-29 16:40:54,183 - celescope.flv_CR.assemble.assemble - INFO - /dssg/home/acct-medzwli/medzwli_qyr/software/cellranger/cellranger-7.0.1/cellranger vdj --id=PTCL_50_SHM-B-full_len_vdj --reference=/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.0.0 --fastqs=/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/multi_flv_CR/shell/PTCL_50_SHM-B-full_len_vdj/02.convert --sample=PTCL_50_SHM-B-full_len_vdj --localcores=16 --localmem=10 --chain=IG
2024-08-29 16:41:23,335 - celescope.tools.step._clean_up - INFO - start...
Traceback (most recent call last):
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/assemble.py", line 79, in assemble
    runner.run()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/assemble.py", line 72, in run
    self.assemble()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
    result = func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/assemble.py", line 67, in assemble
    subprocess.check_call(cmd, shell=True)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/dssg/home/acct-medzwli/medzwli_qyr/software/cellranger/cellranger-7.0.1/cellranger vdj --id=PTCL_50_SHM-B-full_len_vdj --reference=/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.0.0 --fastqs=/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/multi_flv_CR/shell/PTCL_50_SHM-B-full_len_vdj/02.convert --sample=PTCL_50_SHM-B-full_len_vdj --localcores=16 --localmem=10 --chain=IG ' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/bin/celescope", line 8, in <module>
    sys.exit(main())
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
    args.func(args)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/assemble.py", line 79, in assemble
    runner.run()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/step.py", line 263, in __exit__
    self._clean_up()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
    result = func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/step.py", line 235, in _clean_up
    self._write_stat()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/step.py", line 139, in _write_stat
    with open(self.stat_file, 'w') as writer:
FileNotFoundError: [Errno 2] No such file or directory: './/PTCL_50_SHM-B-full_len_vdj/03.assemble/stat.txt'
Args: Namespace(subparser_assay='flv_CR', seqtype='BCR', soft_path='/dssg/home/acct-medzwli/medzwli_qyr/software/cellranger/cellranger-7.0.1/cellranger', not_refine=False, outdir='.//PTCL_50_SHM-B-full_len_vdj/04.summarize', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, barcode_convert_json='.//PTCL_50_SHM-B-full_len_vdj/02.convert/barcode_convert.json', assemble_out='.//PTCL_50_SHM-B-full_len_vdj/03.assemble/PTCL_50_SHM-B-full_len_vdj/outs', func=<function summarize at 0x7f1e167e9940>)
Traceback (most recent call last):
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/bin/celescope", line 8, in <module>
    sys.exit(main())
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
    args.func(args)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/summarize.py", line 109, in summarize
    with Summarize(args) as runner:
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/summarize.py", line 52, in __init__
    self.df_annotation = pd.read_csv(annotation_file, sep=',', index_col=None)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1217, in _make_engine
    self.handles = get_handle(  # type: ignore[call-overload]
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/common.py", line 789, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: './/PTCL_50_SHM-B-full_len_vdj/03.assemble/PTCL_50_SHM-B-full_len_vdj_refine/outs/filtered_contig_annotations.csv'
Args: Namespace(subparser_assay='flv_CR', seqtype='BCR', outdir='.//PTCL_50_SHM-B-full_len_vdj/05.match', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, match_dir='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/multi-rna/shell/PTCL_50_SHM-5', summarize_out='.//PTCL_50_SHM-B-full_len_vdj/04.summarize', func=<function match at 0x7f67eb366ca0>)
2024-08-29 16:41:27,683 - celescope.tools.utils.get_barcode_from_match_dir - INFO - start...
2024-08-29 16:41:27,683 - celescope.tools.utils.get_matrix_dir_from_match_dir - INFO - start...
2024-08-29 16:41:27,701 - celescope.tools.utils.get_matrix_dir_from_match_dir - INFO - Matrix_dir :/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/multi-rna/shell/PTCL_50_SHM-5/05.count/PTCL_50_SHM-5_filtered_feature_bc_matrix
2024-08-29 16:41:27,701 - celescope.tools.utils.get_matrix_dir_from_match_dir - INFO - done. time used: 0:00:00.018533
2024-08-29 16:41:27,701 - celescope.tools.utils.get_barcode_from_matrix_dir - INFO - start...
2024-08-29 16:41:27,739 - celescope.tools.utils.get_barcode_from_matrix_dir - INFO - done. time used: 0:00:00.037460
2024-08-29 16:41:27,739 - celescope.tools.utils.get_barcode_from_match_dir - INFO - done. time used: 0:00:00.056312
2024-08-29 16:41:27,739 - celescope.flv_CR.match.run - INFO - start...
2024-08-29 16:41:27,739 - celescope.flv_CR.match.gen_matched_result - INFO - start...
2024-08-29 16:41:27,739 - celescope.tools.step._clean_up - INFO - start...
2024-08-29 16:41:27,740 - celescope.tools.step._render_html - INFO - start...
2024-08-29 16:41:28,067 - celescope.tools.step._render_html - INFO - done. time used: 0:00:00.327057
2024-08-29 16:41:28,067 - celescope.tools.step._clean_up - INFO - done. time used: 0:00:00.327699
Traceback (most recent call last):
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/bin/celescope", line 8, in <module>
    sys.exit(main())
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
    args.func(args)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/match.py", line 211, in match
    runner.run()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
    result = func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/match.py", line 203, in run
    self.gen_matched_result()
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
    result = func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/match.py", line 130, in gen_matched_result
    SGR_annotation_file = pd.read_csv(self.filter_annotation)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1217, in _make_engine
    self.handles = get_handle(  # type: ignore[call-overload]
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/pandas/io/common.py", line 789, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: './/PTCL_50_SHM-B-full_len_vdj/04.summarize/filtered_contig_annotations.csv'
Args: Namespace(subparser_assay='flv_CR', outdir='.//PTCL_50_SHM-B-full_len_vdj/06.mapping', sample='PTCL_50_SHM-B-full_len_vdj', thread='16', debug=False, match_dir='/dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/multi-rna/shell/PTCL_50_SHM-5', match_out='.//PTCL_50_SHM-B-full_len_vdj/05.match', func=<function mapping at 0x7f10f66ac5e0>)
Traceback (most recent call last):
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/bin/celescope", line 8, in <module>
    sys.exit(main())
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
    args.func(args)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/mapping.py", line 57, in mapping
    mapping_obj = Mapping(args, step_name)
  File "/dssg/home/acct-medzwli/medzwli_qyr/.conda/envs/convert_10X/lib/python3.9/site-packages/celescope/flv_CR/mapping.py", line 32, in __init__
    self.contig_file = glob.glob(f'{args.match_out}/matched_contig_annotations.csv')[0]
IndexError: list index out of range

Chenjunjie1996 commented 3 months ago

The Valid Reads metric is abnormal. You need to check whether the fastqs are really single-cell full-length VDJ sequencing data. The running message shows that your input fastqs were auto-detected as "scopeV2.2.1", which is the chemistry of single-cell RNA-Seq sequencing data.

2024-08-29 16:32:06,408 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-08-29 16:32:06,409 - celescope.tools.barcode.check_chemistry - INFO - /dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz
2024-08-29 16:32:06,409 - celescope.tools.barcode.get_chemistry - INFO - start...
2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - [('scopeV2.2.1', 8872), ('scopeV2.1.1', 1873), ('flv_rna', 462), ('scopeV2.0.1', 4)]
2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - chemistry: scopeV2.2.1
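The candidate list in the `get_chemistry` log line can be read programmatically to see how strongly the winning chemistry dominates. This is a minimal Python sketch, assuming the list is printed as a Python literal exactly as in the log above; `parse_chemistry_counts` is a hypothetical helper, and reading the numbers as per-chemistry hit counts on a sample of reads is an interpretation of the log, not something stated in this thread.

```python
import ast
import re


def parse_chemistry_counts(log_line: str) -> list:
    """Pull the [(chemistry, count), ...] literal out of a get_chemistry log line."""
    match = re.search(r"\[\(.*\)\]", log_line)
    if match is None:
        raise ValueError("no candidate list found in log line")
    return ast.literal_eval(match.group(0))


line = (
    "2024-08-29 16:32:06,568 - celescope.tools.barcode.get_chemistry - INFO - "
    "[('scopeV2.2.1', 8872), ('scopeV2.1.1', 1873), ('flv_rna', 462), ('scopeV2.0.1', 4)]"
)
candidates = parse_chemistry_counts(line)
best_name, best_hits = max(candidates, key=lambda item: item[1])
total_hits = sum(hits for _, hits in candidates)
print(f"{best_name}: {best_hits}/{total_hits} ({best_hits / total_hits:.1%})")
```

In this run `flv_rna` received only 462 of the 11,211 counted hits, while `scopeV2.2.1` received most of them.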

2024-08-29 16:32:09,401 - celescope.tools.barcode.run - INFO - start...
2024-08-29 16:40:48,945 - celescope.tools.barcode.run - INFO - /dssg/home/acct-medzwli/medzwli_qyr/PTCL_single_cell/2023Ruiang/healthy_donor/PTCL_50_shihuiming/20231117_E150016801_U1119_DXB90-3/20231117_E150016801_U1119_DXB90-3/DXB90-3/E150016801_L01_DXB90-3_1.fq.gz finished.
2024-08-29 16:40:48,951 - celescope.tools.barcode.add_step_metrics - INFO - start...
Raw Reads: 42,425,066
Valid Reads: 4,474(0.01%)
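To make "abnormal" concrete, the two metrics above can be compared directly. The following is a minimal Python sketch, assuming the `Name: value(percent)` layout printed by the barcode step; `parse_read_metric` is a hypothetical helper, and the 50% cutoff is an arbitrary illustration, not a CeleScope default.

```python
import re


def parse_read_metric(text: str, name: str) -> int:
    """Return the integer count reported for a metric such as 'Raw Reads'."""
    match = re.search(rf"{re.escape(name)}:\s*([\d,]+)", text)
    if match is None:
        raise ValueError(f"metric not found: {name}")
    return int(match.group(1).replace(",", ""))


metrics_text = "Raw Reads: 42,425,066\nValid Reads: 4,474(0.01%)"
raw = parse_read_metric(metrics_text, "Raw Reads")
valid = parse_read_metric(metrics_text, "Valid Reads")
valid_fraction = valid / raw

MIN_VALID_FRACTION = 0.5  # hypothetical cutoff for illustration only
if valid_fraction < MIN_VALID_FRACTION:
    print(f"Valid reads: {valid_fraction:.4%} of raw reads; check chemistry and library type")
```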

GGN1999 commented 1 month ago

Thank you for your response. My single-cell data was generated using the GEXSCOPE VDJ KIT, and I would like to calculate the SHM rate and BCR isotype (Ig heavy-chain isotypes of TIBs based on their Ig constant regions). I attempted to use Change-O for this, but the VDJ assay did not produce the corresponding AIRR file, so the analysis failed. Is there any way I can still calculate the SHM rate and BCR isotype with my data?
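Since isotype is defined by the heavy-chain constant region, it can in principle be read from the constant-gene call of each heavy-chain contig. The sketch below is a minimal Python illustration, assuming an annotation table with `chain` and `c_gene` columns as in Cell Ranger's filtered_contig_annotations.csv; whether the table produced for this data set carries a usable `c_gene` call is not established in this thread. `isotype_from_c_gene` is a hypothetical helper and the rows are synthetic.

```python
import collections
import csv
import io


def isotype_from_c_gene(c_gene):
    """Map an IGH constant gene name (e.g. 'IGHG1') to an isotype label (e.g. 'IgG1')."""
    if not c_gene.startswith("IGH") or len(c_gene) <= 3:
        return None  # light chains, TCR genes, or missing calls
    return "Ig" + c_gene[3:]


# Synthetic example rows; real files have many more columns.
table = io.StringIO(
    "barcode,chain,c_gene\n"
    "AAAC-1,IGH,IGHM\n"
    "AAAC-1,IGK,IGKC\n"
    "AAAG-1,IGH,IGHG1\n"
)
isotype_counts = collections.Counter()
for row in csv.DictReader(table):
    if row["chain"] != "IGH":
        continue  # isotype is a heavy-chain property
    label = isotype_from_c_gene(row["c_gene"])
    if label is not None:
        isotype_counts[label] += 1

print(dict(isotype_counts))
```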

Chenjunjie1996 commented 1 month ago

The AIRR file generated by multi_vdj is in the 04.mapping_vdj directory.

https://changeo.readthedocs.io/en/stable/examples/10x.html However, as that page shows, there are some differences between the column names in this file and the format required there. I am not sure whether this will affect the analysis. If necessary, you can try renaming the columns.
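Renaming columns can be done without touching the data rows. Below is a minimal Python sketch, assuming the file is tab-separated; `rename_columns` is a hypothetical helper, and the source name in `COLUMN_MAP` is a made-up placeholder (only `cell_id` is taken from the AIRR field names). The real mapping has to be worked out by comparing the file header with the Change-O documentation.

```python
import csv
import io


def rename_columns(src, dst, mapping):
    """Copy a tab-separated table, renaming header fields according to `mapping`."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    header = next(reader)
    writer.writerow([mapping.get(col, col) for col in header])
    writer.writerows(reader)  # data rows are passed through unchanged


# Placeholder mapping: the left-hand name is invented for this demo.
COLUMN_MAP = {"old_cell_column": "cell_id"}

src = io.StringIO("old_cell_column\tv_call\nAAAC\tIGHV1-2\n")
dst = io.StringIO()
rename_columns(src, dst, COLUMN_MAP)
result = dst.getvalue()
print(result)
```

For real files, the two `io.StringIO` objects would be replaced by files opened with `newline=""`, as the `csv` module documentation recommends.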

GGN1999 commented 1 week ago

Thank you very much for your response. I would like to obtain filtered_contig.fasta and filtered_contig_annotations.csv to reprocess the VDJ data. I found filtered_contig_annotations.csv in the out directory, but I’m not sure how to obtain filtered_contig.fasta.

Chenjunjie1996 commented 1 week ago

The VDJ pipeline does not generate a fasta file. A *contig.fasta file typically records the full-length sequence, which includes fwr1, cdr1, fwr2, cdr2, fwr3, cdr3 and fwr4, for each assembled contig in the V(D)J library. Since the VDJ pipeline does not go through an assembly process, the sequence it reports usually refers to the CDR3 sequence.

GGN1999 commented 1 week ago

Thank you for the clarification. I understand now. Would it be possible to use the consensus.fasta output from the VDJ pipeline to perform the assembly process with IgBLAST?

Chenjunjie1996 commented 1 week ago

As you mentioned before, your data was generated with the GEXSCOPE VDJ KIT, which captures only the sequence of the CDR3 region rather than the full length. The assembly process is used in the single-cell full-length VDJ pipeline. A consensus.fasta file contains the consensus sequence of each assembled contig; it is identical to the sequence of the top (most frequent) exact subclonotype. The consensus sequence should be full-length (starting in the 5' UTR and ending at the C gene primer binding site). For a description of the VDJ pipeline outputs, you can refer to the single-cell VDJ assay page.