liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

failed: 256 at ./run-trust4 line 57. #286

Open origami974 opened 1 month ago

origami974 commented 1 month ago

Hello, I am a TRUST4 beginner, so my questions may be quite basic. I hope you can bear with me. When reading BAM files downloaded directly from the 10X platform, a BAM file with a header like this could be read successfully:

@HD VN:1.4 SO:coordinate
@SQ SN:chr1 LN:248956422

but one with a header like this could not:

VN:1.6 SO:coordinate
SN:AAACCTGAGAGAACAG-1_contig_1 LN:306

The error was:

Unknown genome name: chr1
system /data/TRUST/TRUST4/bam-extractor -b data_input/10k_Pbmc.bam -t 1 -f hg38_bcrtcr.fa -o TRUST_10K_PBMC_toassemble failed: 256 at ./run-trust4 line 57.

I noticed that the sample data you provide is consistent with the former, so I would like to ask whether there is something wrong with the data format of the latter. In addition, I also ran into a problem when reading SRR data after converting it to BAM. The file content is as follows:

SRR28216602.1746353 393 chr1 14817 0 13M140N13M 0 0 TTCCCAGAGATGCCCTTGCGCCTCAT AAAAAEEEEEEEEEEEEEEEEEEEEE NH:i:8 HI:i:4 AS:i:26 nM:i:0
SRR28216602.1055944 393 chr1 20422 0 1S25M 0 0 CACACCTGGTTAGAAAACTGGGGCCA AAAAAEEEEEE<EEEEEEEEEEEEEE NH:i:9 HI:i:5 AS:i:24 nM:i:0

The error is:

system /data/TRUST4/bam-extractor -b /data/zhanqh/samtools-1.20/SRR_data/SRR28216602_Aligned.sortedByCoord.out.bam -t 1 -f hg38_bcrtcr.fa -o TRUST_SRR28216602_Aligned_toassemble failed: 136 at ./run-trust4 line 57.

I apologize for the sudden disturbance, and thank you for any response you can provide.
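A note on reading the two numbers in these messages. The `failed: N` value looks like a raw wait status, in the style of Perl's `$?` after `system`. That is an assumption based on the message format, not checked against the run-trust4 script. Under that convention the exit code sits in the high byte and a terminating signal in the low seven bits. A minimal Python sketch of the decoding follows; `decode_wait_status` is a hypothetical helper, not part of TRUST4.

```python
def decode_wait_status(status):
    """Split a raw 16-bit wait status into exit code, signal number and core-dump flag.

    Assumes the Perl/POSIX convention: exit code in the high byte,
    terminating signal in the low 7 bits, core-dump flag in bit 7.
    """
    return {
        "exit_code": status >> 8,            # value the program passed to exit()
        "signal": status & 0x7F,             # signal that ended the process, 0 if none
        "core_dumped": bool(status & 0x80),  # set when a core dump was produced
    }


# The two values reported in this issue.
for status in (256, 136):
    print(status, decode_wait_status(status))
```

Read this way, 256 is an ordinary exit with code 1, which fits bam-extractor printing "Unknown genome name" and stopping. 136 would mean the process was ended by signal 8 (SIGFPE on Linux) with a core dump, which suggests a crash rather than a reported error.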

mourisl commented 1 month ago

It seems the second BAM is not aligned to the human reference genome. I guess it is the BAM file from cellranger vdj, where the alignment is against the assembled contigs instead of the reference genome. If you want to apply TRUST4 to this vdj data, starting from the fastq files would probably be easier.
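One way to check this before running TRUST4 is to look at the `@SQ` reference names in the BAM header, for example the text printed by `samtools view -H file.bam`. Below is a minimal Python sketch, assuming the header text is already available as a tab-separated string. `reference_names` and `looks_like_genome_alignment` are hypothetical helpers, the chromosome-name test is only a rough heuristic, and the two sample headers are modeled on the ones quoted earlier in this thread.

```python
import re


def reference_names(header_text):
    """Return the SN values of all @SQ lines in a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def looks_like_genome_alignment(header_text):
    """Heuristic: True if any reference is named like a chromosome (chr1, 1, chrX, MT, ...)."""
    pattern = re.compile(r"^(chr)?([0-9]{1,2}|X|Y|M|MT)$")
    return any(pattern.match(name) for name in reference_names(header_text))


# Sample headers modeled on the two BAM files discussed above (tab-separated).
genome_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"
contig_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:AAACCTGAGAGAACAG-1_contig_1\tLN:306\n"

print(looks_like_genome_alignment(genome_header))  # True
print(looks_like_genome_alignment(contig_header))  # False
```

A header whose reference names are barcode-plus-contig identifiers rather than chromosomes points to a contig-level alignment, which is the situation described above.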

origami974 commented 1 month ago

Thank you for your guidance; it is very helpful for my learning. As you said, starting from the fastq files is indeed easy, but the output files are empty. What went wrong? TRUST4 reported no errors while running, and my _1.fq and _2.fq files also seem fine.

mourisl commented 1 month ago

What was your running command?

origami974 commented 1 month ago

Thank you very much for your patient guidance. The command I am using is:

./run-trust4 -f hg38_bcrtcr.fa --ref human_IMGT+C.fa -1 mydata_1.fq -2 mydata_2.fq -o TRUST_mydata

Previously, I used fastq-dump to process mydata.sra, which produced several files including mydata_1.fq and mydata_2.fq. Since run-trust4 itself ran without errors, I suspected a problem with the fq files. After processing mydata.sra again, I obtained a single-end fastq file and ran the command again:

./run-trust4 -f hg38_bcrtcr.fa --ref human_IMGT+C.fa -u mydata.fastq -o ...

The result is normal, so the problem was probably in my data preprocessing. Everything works now. Thank you again for your reply.

mourisl commented 1 month ago

Your data is 10X Genomics data. You should add the barcode information when running TRUST4, so that it tries to assemble the receptor sequences for each cell. You can refer to this part of the README: https://github.com/liulab-dfci/TRUST4?tab=readme-ov-file#10x-genomics-data-and-barcode-based-single-cell-data
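To illustrate what "barcode information" means here: in 10X libraries the cell barcode occupies a fixed range of bases in the barcode read, and per-cell assembly groups reads by that barcode. The Python sketch below assumes a 16-base barcode at positions 0 to 15 on the forward strand, which is an assumption about this dataset. `extract_barcode` and `group_by_barcode` are hypothetical helpers on toy reads and do not show how TRUST4 implements this; the linked README section describes the actual options.

```python
from collections import defaultdict

# Translation table for reverse-complementing a barcode on the minus strand.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def extract_barcode(seq, start=0, end=15, strand="+"):
    """Return bases start..end (inclusive) of seq; end=-1 means up to the last base."""
    stop = len(seq) if end == -1 else end + 1
    barcode = seq[start:stop]
    if strand == "-":
        barcode = barcode.translate(_COMPLEMENT)[::-1]
    return barcode


def group_by_barcode(read_pairs):
    """Map barcode -> list of cDNA reads; read_pairs yields (barcode_read, cdna_read)."""
    cells = defaultdict(list)
    for barcode_read, cdna_read in read_pairs:
        cells[extract_barcode(barcode_read)].append(cdna_read)
    return dict(cells)


# Toy data: a 16-base barcode followed by made-up extra bases on the barcode read.
pairs = [
    ("AAACCTGAGAGAACAG" + "TTTTGGGGCC", "CACACCTGGTTAGAAAACTGGGGCCA"),
    ("AAACCTGAGAGAACAG" + "ACGTACGTAC", "TTCCCAGAGATGCCCTTGCGCCTCAT"),
]
print(group_by_barcode(pairs))
```

Without the barcode, all reads are pooled as if they came from one sample, so per-cell receptor sequences cannot be separated.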