liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

failed: 256 at ./run-trust4 line 55. #251

Open liu9756 opened 8 months ago

liu9756 commented 8 months ago

Here is my code and the bug:

$ ./run-trust4 -b all_contig.bam -f all_contig.fasta -o TRUST_all_contig_toassemble --ref mm39.fa --barcode CB -t 4
[Tue Mar 12 11:32:23 2024] TRUST4 begins.
[Tue Mar 12 11:32:23 2024] SYSTEM CALL: /home/user/trust4/TRUST4/bam-extractor -b all_contig.bam -t 4 -f all_contig.fasta -o TRUST_all_contig_toassemble_toassemble --barcode CB
[Tue Mar 12 11:32:23 2024] Start to extract candidate reads from bam file.
Unknown genome name: GGGGTAATTGAAGTCAAGACTCAGCCTGGACATGATGTCCTCTGCTCAGTTCCTTGGTCTCCTGTTGCTCTGTTTTCAAGGTACCAGATGTGATATCCAGATGACACAGACTACATCCTCCCTGTCTGCCTCTCTGGGAGACAGAGTCACCATCAGTTGCAGGGCAAGTCAGGACATTAGCAATTATTTAAACTGGTATCAGCAGAAACCAGATGGAACTGTTAAACTCCTGATCTACTACACATCAAGATTACACTCAGGAGTCCCATCAAGGTTCAGTGGCAGTGGGTCTGGAACAGATTATTCTCTCACCATTAGCAACCTGGAGCAAGAAGATATTGCCACTTACTTTTGCCAACAGGGTAATACGCTTCCGTACACGTTCGGAGGGGGGACCAAGCTGGAAATAAAACGGGCTGATGCTGCACCAACTGTATCCATCTTCCCACCATCCAGTGAGCAGTTAACATCTGGAGGTGCCTCAGTCGTGTGCTTC
system /home/user/trust4/TRUST4/bam-extractor -b all_contig.bam -t 4 -f all_contig.fasta -o TRUST_all_contig_toassemble_toassemble --barcode CB failed: 256 at ./run-trust4 line 55.

I checked my data, and it should not have any problems.

mourisl commented 8 months ago

The -f file holds the VDJ sequences from a reference genome, along with their genomic coordinates in the headers. The reference files for mouse can be found at: https://github.com/liulab-dfci/TRUST4/tree/master/mouse , where GRCm38_bcrtcr.fa is for -f and the IMGT one is for --ref. Or do you need to use your own VDJ reference sequences?
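A quick way to see whether a file is suitable for -f is to look for headers that carry no coordinates. The sketch below assumes each header has the layout `>GENE CHROM START END STRAND`; that layout is an assumption drawn from the description above, not something confirmed in this thread, and the function name is made up for illustration.

```shell
# Hypothetical helper: print the headers of a -f FASTA that lack genomic coordinates.
# Assumed header layout (not confirmed by the thread): >GENE CHROM START END STRAND
headers_without_coords() {
    awk '
        # A header passes only if it has at least five fields, numeric start
        # and end, and a +/- strand. Everything else is reported.
        /^>/ && !(NF >= 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && ($5 == "+" || $5 == "-")) {
            print
        }
    ' "$1"
}
```

For example, `headers_without_coords all_contig.fasta | head` would list the first few headers that do not match the assumed layout. A file of assembled contigs, whose headers hold only a contig name, would be reported in full.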

liu9756 commented 8 months ago

Thanks for your reply. Actually, I tried GRCm38_bcrtcr.fa and the IMGT file, but it does not seem to work:

$ ./run-trust4 -b all_contig.bam -f GRCm38_bcrtcr.fa --ref mouse_IMGT+C.fa -o TRUST_all_contig_toassemble --barcode CB
[Wed Mar 13 10:00:32 2024] TRUST4 begins.
[Wed Mar 13 10:00:32 2024] SYSTEM CALL: /home/user/trust4/TRUST4/bam-extractor -b all_contig.bam -t 1 -f GRCm38_bcrtcr.fa -o TRUST_all_contig_toassemble_toassemble --barcode CB
[Wed Mar 13 10:00:32 2024] Start to extract candidate reads from bam file.
Unknown genome name: 6
system /home/user/trust4/TRUST4/bam-extractor -b all_contig.bam -t 1 -f GRCm38_bcrtcr.fa -o TRUST_all_contig_toassemble_toassemble --barcode CB failed: 256 at ./run-trust4 line 55.

mourisl commented 8 months ago

Could you please show me the chromosome names of your bam file by "samtools view -H all_contig.bam"?

liu9756 commented 8 months ago

@HD VN:1.6 SO:coordinate
@SQ SN:AAACCTGAGACGCACA-1_contig_1 LN:496
@SQ SN:AAACCTGAGATAGTCA-1_contig_1 LN:488
@SQ SN:AAACCTGAGCGTTCCG-1_contig_1 LN:510
@SQ SN:AAACCTGAGGACAGCT-1_contig_1 LN:464
@SQ SN:AAACCTGAGGACAGCT-1_contig_2 LN:521
@SQ SN:AAACCTGCAAGCGCTC-1_contig_1 LN:499
@SQ SN:AAACCTGCAAGCGCTC-1_contig_2 LN:656
@SQ SN:AAACCTGCAAGCGCTC-1_contig_3 LN:503
@SQ SN:AAACCTGCACAACGCC-1_contig_1 LN:551
@SQ SN:AAACCTGCACGCCAGT-1_contig_1 LN:508
@SQ SN:AAACCTGCACGGCGTT-1_contig_1 LN:492
@SQ SN:AAACCTGCAGCGTCCA-1_contig_1 LN:503
@SQ SN:AAACCTGCATTACGAC-1_contig_1 LN:517
@SQ SN:AAACCTGCATTGCGGC-1_contig_1 LN:551
@SQ SN:AAACCTGGTACCATCA-1_contig_1 LN:303
@SQ SN:AAACCTGGTACCATCA-1_contig_2 LN:460
@SQ SN:AAACCTGGTATAGTAG-1_contig_1 LN:496
@SQ SN:AAACCTGTCAGTGTTG-1_contig_1 LN:503
@SQ SN:AAACCTGTCCGTCAAA-1_contig_1 LN:496
@SQ SN:AAACCTGTCCTAAGTG-1_contig_1 LN:559
@SQ SN:AAACCTGTCCTAAGTG-1_contig_2 LN:498
@SQ SN:AAACCTGTCGGCGCTA-1_contig_1 LN:538
@SQ SN:AAACCTGTCGGCGCTA-1_contig_2 LN:498
@SQ SN:AAACCTGTCTCCCTGA-1_contig_1 LN:496
@SQ SN:AAACCTGTCTCCCTGA-1_contig_2 LN:373
@SQ SN:AAACCTGTCTCTAGGA-1_contig_1 LN:497
@SQ SN:AAACCTGTCTGCTTGC-1_contig_1 LN:495
@SQ SN:AAACGGGAGCTGCAAG-1_contig_1 LN:389
@SQ SN:AAACGGGAGTGTTGAA-1_contig_1 LN:590
@SQ SN:AAACGGGAGTGTTGAA-1_contig_2 LN:506
@SQ SN:AAACGGGCAAACCCAT-1_contig_1 LN:551
@SQ SN:AAACGGGCAGCTGCAC-1_contig_1 LN:538
@SQ SN:AAACGGGCAGGTGGAT-1_contig_1 LN:495
@SQ SN:AAACGGGCATTATCTC-1_contig_1 LN:512
@SQ SN:AAACGGGTCAACCAAC-1_contig_1 LN:497
@SQ SN:AAACGGGTCACAACGT-1_contig_1 LN:342
@SQ SN:AAACGGGTCAGAGCTT-1_contig_1 LN:493
@SQ SN:AAACGGGTCCACGTTC-1_contig_1 LN:527
@SQ SN:AAACGGGTCGTGGACC-1_contig_1 LN:309
@SQ SN:AAACGGGTCTAACTCT-1_contig_1 LN:428
@SQ SN:AAACGGGTCTTGTATC-1_contig_1 LN:565
@SQ SN:AAAGATGAGTTCGATC-1_contig_1 LN:538
@SQ SN:AAAGATGCAAGAGTCG-1_contig_1 LN:467
@SQ SN:AAAGATGCAAGGTTTC-1_contig_1 LN:551
@SQ SN:AAAGATGCAGATGAGC-1_contig_1 LN:684
@SQ SN:AAAGATGCAGATGAGC-1_contig_2 LN:512
@SQ SN:AAAGATGCATGAACCT-1_contig_1 LN:512
@SQ SN:AAAGATGGTAAATGAC-1_contig_1 LN:501
@SQ SN:AAAGATGGTATCTGCA-1_contig_1 LN:620
@SQ SN:AAAGATGGTCACACGC-1_contig_1 LN:512
@SQ SN:AAAGATGTCAAACGGG-1_contig_1 LN:503
@SQ SN:AAAGATGTCCCACTTG-1_contig_1 LN:559
@SQ SN:AAAGATGTCCCACTTG-1_contig_2 LN:410
@SQ SN:AAAGATGTCGGGAGTA-1_contig_1 LN:481
@SQ SN:AAAGATGTCTTGTCAT-1_contig_1 LN:493
@SQ SN:AAAGCAAAGAAGATTC-1_contig_1 LN:495
@SQ SN:AAAGCAAAGCCACGTC-1_contig_1 LN:495
@SQ SN:AAAGCAAAGGAGTACC-1_contig_1 LN:521
@SQ SN:AAAGCAAAGTGCCAGA-1_contig_1 LN:512
@SQ SN:AAAGCAACAGCCTATA-1_contig_1 LN:521
@SQ SN:AAAGCAAGTAAGTTCC-1_contig_1 LN:505
@SQ SN:AAAGCAAGTTGTGGCC-1_contig_1 LN:627
@SQ SN:AAAGCAATCACATGCA-1_contig_1 LN:501
@SQ SN:AAAGCAATCAGGCCCA-1_contig_1 LN:504
@SQ SN:AAAGCAATCCCTAATT-1_contig_1 LN:532
@SQ SN:AAAGCAATCCTGTACC-1_contig_1 LN:512
@SQ SN:AAAGCAATCGCCTGTT-1_contig_1 LN:494
@SQ SN:AAAGCAATCTGAGTGT-1_contig_1 LN:503
@SQ SN:AAAGTAGAGACTACAA-1_contig_1 LN:559
@SQ SN:AAAGTAGAGACTAGGC-1_contig_1 LN:510
@SQ SN:AAAGTAGAGAGTGAGA-1_contig_1 LN:494
@SQ SN:AAAGTAGAGATCACGG-1_contig_1 LN:500
@SQ SN:AAAGTAGAGCGTGAAC-1_contig_1 LN:407
@SQ SN:AAAGTAGAGGCATGGT-1_contig_1 LN:684
@SQ SN:AAAGTAGAGGCATGGT-1_contig_2 LN:512
@SQ SN:AAAGTAGCAATAACGA-1_contig_1 LN:512
@SQ SN:AAAGTAGCACCAGGCT-1_contig_1 LN:567
@SQ SN:AAAGTAGCACGTCAGC-1_contig_1 LN:561
......

@PG ID:samtools PN:samtools VN:1.16.1 CL:samtools sort -l 8G -m 600M -o /home/user/referenceData/run_vdj_S5/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_GEM_WELL_PROCESSOR/VDJ_B_GEM_WELL_PROCESSOR/SC_VDJ_CONTIG_ASSEMBLER/ASSEMBLE_VDJ/fork0/chnk00-uf0fee24a37/files/contig_bam_sorted.bam /home/user/referenceData/run_vdj_S5/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_GEM_WELL_PROCESSOR/VDJ_B_GEM_WELL_PROCESSOR/SC_VDJ_CONTIG_ASSEMBLER/ASSEMBLE_VDJ/fork0/chnk00-uf0fee24a37/files/contig_bam.bam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.16.1 CL:samtools merge -@ 3 -c -p -s 0 -b /home/user/referenceData/run_vdj_S5/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_GEM_WELL_PROCESSOR/VDJ_B_GEM_WELL_PROCESSOR/SC_VDJ_CONTIG_ASSEMBLER/ASSEMBLE_VDJ/fork0/join-uf0fee24a37/files/contig_bam.fofn /home/user/referenceData/run_vdj_S5/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_GEM_WELL_PROCESSOR/VDJ_B_GEM_WELL_PROCESSOR/SC_VDJ_CONTIG_ASSEMBLER/ASSEMBLE_VDJ/fork0/join-uf0fee24a37/files/contig_bam.0.bam
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.13 CL:samtools view -H all_contig.bam

mourisl commented 8 months ago

I think this BAM file comes from aligning the reads to each BCR contig. The BAM file for TRUST4 should hold alignments to the reference genome. Just curious: since your data already has cellranger vdj results, why do you need to run TRUST4 on it? Thank you.
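The mismatch described here can be checked without running TRUST4, by comparing the chromosome names in the -f file against the `@SQ` names in the BAM header. This is only a sketch: the function name is hypothetical, the header is assumed to have been saved one record per line (for instance with `samtools view -H all_contig.bam > header.txt`), and the FASTA header layout `>GENE CHROM START END STRAND` is an assumption rather than something stated in this thread.

```shell
# Hypothetical helper: list chromosome names used by the -f FASTA that are
# absent from a BAM header. Arguments: <header.txt> <vdj_reference.fa>
# header.txt is assumed to be saved "samtools view -H" output, one record per line.
# The FASTA header layout (>GENE CHROM START END STRAND) is an assumption.
missing_chroms() {
    awk '
        FILENAME == ARGV[1] {             # first file: SAM header text
            if ($1 == "@SQ")
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/) seen[substr($i, 4)] = 1
            next
        }
        # second file: FASTA; report each unknown chromosome name once
        /^>/ && NF >= 2 && !($2 in seen) && !($2 in reported) {
            reported[$2] = 1
            print $2
        }
    ' "$1" "$2"
}
```

Called as `missing_chroms header.txt GRCm38_bcrtcr.fa`, an empty result would mean every chromosome named in the reference also appears in the BAM. Any name it prints, such as `6`, would be consistent with the "Unknown genome name" message shown earlier.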

liu9756 commented 8 months ago

I am trying to do some SHM (somatic hypermutation) analysis with TRUST4.

mourisl commented 8 months ago

The cellranger vdj output probably already contains enough information for SHM analysis in its AIRR file. If you need to run TRUST4 from the beginning, I think using the VDJ FASTQ files is more convenient.
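As a rough illustration of what a first pass over an AIRR file could look like, the sketch below counts mismatches between the `sequence_alignment` and `germline_alignment` columns of an AIRR rearrangement TSV. Both are standard AIRR field names, but whether a given file fills them in is an assumption here, and the function name is hypothetical. The count is crude: it treats every difference in the aligned region as a mutation and skips gap and N positions.

```shell
# Hypothetical helper: per-sequence mismatch count between sequence_alignment
# and germline_alignment in an AIRR rearrangement TSV.
# Assumes both columns are present and filled; gaps (- .) and N are skipped.
# Output columns: sequence_id, mismatches, positions compared.
airr_mismatches() {
    awk -F '\t' '
        NR == 1 {
            # Locate columns by name so column order does not matter.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("sequence_id" in col) || !("sequence_alignment" in col) || !("germline_alignment" in col)) {
                print "required AIRR columns not found" > "/dev/stderr"
                exit 1
            }
            next
        }
        {
            s = toupper($(col["sequence_alignment"]))
            g = toupper($(col["germline_alignment"]))
            n = (length(s) < length(g)) ? length(s) : length(g)
            mism = 0; compared = 0
            for (i = 1; i <= n; i++) {
                a = substr(s, i, 1); b = substr(g, i, 1)
                if (a ~ /[-.N]/ || b ~ /[-.N]/) continue
                compared++
                if (a != b) mism++
            }
            printf "%s\t%d\t%d\n", $(col["sequence_id"]), mism, compared
        }
    ' "$1"
}
```

Dividing the second output column by the third gives a per-contig mismatch fraction. A proper SHM analysis would normally restrict the comparison to the V segment and use a dedicated tool; this only shows that the needed columns can be read directly from the TSV.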