mourisl / T1K

T1K is a versatile method for genotyping highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS, or WES data.
MIT License

Found 0 read fragments when assigning stage #34

Closed: punching-samuel closed this issue 2 months ago

punching-samuel commented 2 months ago

Hello, I have used T1K to genotype our scRNA-seq BAM data with a barcode whitelist. It ran fine before, but now it finds no fragments at the assignment stage, which may be due to the format of my new data. The BAM files from both runs share the same GTF and index files, but the newer data yields no assigned fragments. The two datasets do come from different sequencing platforms.

My command is as follows: [Screenshot 2024-07-03 10 12 26]

The first BAM file looks like this:

[Screenshot 2024-06-27 10 50 58]

The second BAM file, which produces no results after analysis, looks like this:

[Screenshot 2024-06-27 10 51 26]

My log output is as follows:
[Fri Jun 28 19:24:57 2024] run-t1k begins.
[Fri Jun 28 19:24:57 2024] SYSTEM CALL:../T1K/bam-extractor -b ../BJZY0108L_name_sorted.bam -t 20 -f ..ref_genome/Kiridx_unzip_ver/_rna_coord.fa -o ../NoCB_BJZY0108L_candidate
[Fri Jun 28 19:24:57 2024] Start to extract candidate reads from bam file.
[Fri Jun 28 19:34:45 2024] Finish extracting reads.
[Fri Jun 28 19:34:45 2024] SYSTEM CALL:../T1K/genotyper -o ../NoCB_BJZY0108L -t 20 -f ../ref_genome/Kiridx_unzip_ver/_rna_seq.fa -u ../NoCB_BJZY0108L_candidate.fq
[Fri Jun 28 19:34:45 2024] Found 0 read fragments. Start read assignment.
[Fri Jun 28 19:34:45 2024] Finish read end assignments.
[Fri Jun 28 19:34:45 2024] Finish read fragment assignments. 0 read fragments can be assigned (average -nan alleles/read).
[Fri Jun 28 19:34:45 2024] Finish allele quantification in 2 EM iterations.
[Fri Jun 28 19:34:45 2024] Genotyping finishes.
[Fri Jun 28 19:34:45 2024] SYSTEM CALL: ../T1K/analyzer -o ../NoCB_BJZY0108L -t 20 -f ../ref_genome/Kiridx_unzip_ver/_rna_seq.fa -a../NoCB_BJZY0108L_allele.tsv -u ../NoCB_BJZY0108L_aligned.fa
[Fri Jun 28 19:34:45 2024] Found 0 read fragments. Start read assignment.
[Fri Jun 28 19:34:45 2024] Finish read end assignments.
[Fri Jun 28 19:34:45 2024] Finish read fragment assignments. 0 read fragments can be assigned (average -nan alleles/read).
[Fri Jun 28 19:34:45 2024] Finish allele quantification in 2 EM iterations.
[Fri Jun 28 19:34:45 2024] Post analysis finishes.
[Fri Jun 28 19:34:45 2024] Finish.

punching-samuel commented 2 months ago

I can only guess that a difference between the BAM files is the cause: with the same coordinate and reference files, the former worked while the latter gave no results. I inspected both files with pysam and could not see any difference in their structure.

Best wishes, Samuel.

mourisl commented 2 months ago

The second BAM file does not seem to be sorted by coordinate. Could you please check it? If it is not sorted, you can run samtools sort on it first.
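For anyone hitting the same symptom, here is a rough sketch of how the sort order could be checked before running T1K. It is not part of T1K; the function names `header_sort_order` and `is_coordinate_sorted` and the sample header are made up for illustration. It relies only on the SAM convention that the `@HD` header line may carry an `SO:` tag (`coordinate`, `queryname`, `unsorted`, or `unknown`), and on a direct scan of (reference index, position) pairs in case the header is missing or wrong. The input file name in the log above, `BJZY0108L_name_sorted.bam`, also hints at a name-sorted file.

```python
from typing import Iterable, Optional, Tuple


def header_sort_order(header_text: str) -> Optional[str]:
    """Return the SO: value from the @HD line of a SAM header, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None  # @HD line present but without an SO: tag
    return None  # no @HD line at all


def is_coordinate_sorted(records: Iterable[Tuple[int, int]]) -> bool:
    """Check that (reference_index, position) pairs never decrease.

    Reads without a reference (reference_index < 0) are expected at the end
    of a coordinate-sorted file, so they are treated as sorting last.
    """
    previous = None
    for ref_id, pos in records:
        key = (float("inf"), 0) if ref_id < 0 else (ref_id, pos)
        if previous is not None and key < previous:
            return False
        previous = key
    return True


if __name__ == "__main__":
    # Hypothetical header text, e.g. as printed by `samtools view -H file.bam`.
    header = "@HD\tVN:1.6\tSO:queryname\n@SQ\tSN:chr19\tLN:58617616"
    print(header_sort_order(header))
    # Made-up (reference_index, position) pairs standing in for real reads.
    print(is_coordinate_sorted([(0, 100), (0, 250), (1, 5), (-1, -1)]))
    print(is_coordinate_sorted([(0, 250), (0, 100)]))
```

With pysam, the header text should be obtainable via `str(pysam.AlignmentFile(path).header)`, and the pairs from each read's `reference_id` and `reference_start`. Scanning only the first few thousand reads is usually enough to spot a name-sorted file.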

punching-samuel commented 2 months ago

Thank you for your reply. I will try the suggestion now, although I had assumed that the sequencing platform had already sorted the BAM files before providing them.

With regards, Samuel.

punching-samuel commented 2 months ago

That truly helped. Thank you for your help!