Open CHoeltermann opened 1 year ago
scTE identifies UMI through UR/UB tags. The bam file you provided seems to have no UMI information, so you need to set the -UMI parameter to False.
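One way to see which optional tags a record actually carries is to inspect its SAM text (for example, as printed by `samtools view`). The following is a minimal sketch and not part of scTE: the helper name `tag_names` is hypothetical, and the sample record is a trimmed version of the first read posted in this thread, with SEQ, QUAL and the `on`/`op` tags left out for brevity.

```python
def tag_names(sam_line):
    """Return the set of optional tag names (e.g. 'CR', 'RX') in one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional fields follow as TAG:TYPE:VALUE.
    return {field.split(":", 1)[0] for field in fields[11:]}


# Hypothetical trimmed record: SEQ and QUAL are replaced by '*' for brevity.
record = "\t".join([
    "1", "4", "*", "0", "255", "*", "*", "0", "98", "*", "*",
    "RX:Z:TTGGGCGGTG", "QX:Z:JAJJJJJJ<J",
    "CR:Z:TNTGAGAAGACTGGGT", "CY:Z:A!AFFJFJJJAAJJJJ",
])

present = tag_names(record)
# Tags scTE is said (in this thread) to look for when identifying UMIs.
missing = {"UR", "UB"} - present
print("tags present:", sorted(present))
print("missing UMI tags:", sorted(missing))
```

For this record neither UR nor UB is present, which matches the observation that the file has no UMI information under the tag names scTE looks for.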
Sorry I forgot to mention!
Before analyzing with scTE, I used pysam (https://github.com/pysam-developers/pysam) to add a "UR:Z" tag (I basically copied the value of RX:Z, which is the UMI).
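The tag copy described here can be illustrated on SAM text records. This is only a sketch of the idea, not the pysam script that was actually used: it works on tab-separated SAM lines instead of going through pysam, and the function name `copy_tag` and the trimmed sample record (SEQ and QUAL replaced by `*`) are assumptions for illustration.

```python
def copy_tag(sam_line, src="RX", dst="UR"):
    """Return the SAM record with a dst:Z tag that duplicates the src:Z value."""
    line = sam_line.rstrip("\n")
    # Header lines start with '@' and carry no per-read tags.
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    tags = fields[11:]  # optional fields follow the 11 mandatory columns
    # Leave the record untouched if the destination tag already exists.
    if any(tag.startswith(dst + ":") for tag in tags):
        return line
    prefix = src + ":Z:"
    for tag in tags:
        if tag.startswith(prefix):
            fields.append(dst + ":Z:" + tag[len(prefix):])
            break
    return "\t".join(fields)


# Hypothetical trimmed record based on the first read posted in this thread.
record = "\t".join([
    "1", "4", "*", "0", "255", "*", "*", "0", "98", "*", "*",
    "RX:Z:TTGGGCGGTG", "CR:Z:TNTGAGAAGACTGGGT",
])
updated = copy_tag(record)
print(updated.split("\t")[-1])
```

Note that copying the tag only renames the UMI; it does not change the mapping fields of the record.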
Does scTE have some other requirements for the BAM file or some dependency on a certain format I don't recognize?
These published BAM files are supposed to be cellranger output; however, the file I downloaded does not match cellranger's BAM specification.
Thanks a lot & all the best, Charlotte
I have just noticed that the BAM records you pasted above have no chromosome or coordinate information (flag 4, reference name *, position 0), which means the reads are unmapped.
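A quick way to confirm this on SAM text is to look at the FLAG, RNAME and POS columns. The sketch below is illustrative only: the helper name `mapping_status` is hypothetical, and the sample record is a trimmed version of the first read posted in this thread (SEQ and QUAL replaced by `*`).

```python
def mapping_status(sam_line):
    """Return (is_unmapped, has_coordinates) for one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
    # In the SAM format, FLAG bit 0x4 marks an unmapped read.
    is_unmapped = bool(flag & 0x4)
    # RNAME '*' and POS 0 mean that no reference position is available.
    has_coordinates = rname != "*" and pos > 0
    return is_unmapped, has_coordinates


# Hypothetical trimmed record based on the first read posted in this thread.
record = "\t".join([
    "1", "4", "*", "0", "255", "*", "*", "0", "98", "*", "*",
    "RX:Z:TTGGGCGGTG", "CR:Z:TNTGAGAAGACTGGGT",
])
print(mapping_status(record))
```

A read without coordinates cannot be assigned to a gene or TE annotation, which would be consistent with the empty count table reported in this thread.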
True, thank you for pointing that out! I have reached out to the creators of the BAM files to try to solve this :)
Hi & thanks for your pipeline;
I currently have the problem that scTE returns only the header and no counts.
I run the tool as follows:
scTE -i some_sorted.bam -o out -x hg38.exclusive.idx -p 10 --hdf5 False -CB CR -UMI UR
BED files are being generated and everything, and I get this output:
INFO : Calculating expression... 2023-06-02 11:54:35
INFO : Detect 0 cells expressed at least 200 genes, results output to out.csv
INFO : Finished calculating expression 2023-06-02 11:54:35
INFO : Done with 0d 0h 24m 19s
A sample of my BAM file:
| 1 | 4 | * | 0 | 255 | * | * | 0 | 98 | NAAAGAAACAGCAAGAAGGATACGAATCAACAGACAAACACTGCGGCACAACGCATCAAAGAGGCGAGGGCCTTCCGGAGGACGAGGACAGAGTCTCC | !--7-7<----7A--7A--7---7-------<---<-<-A---7---7-F<--7--7A7-A---7-7-----7--777----7--7-7----7----7 | on:Z:HF1_25773:2:1101:1550:1367@1:N:0:NAAACCCT | op:Z:!--7-7<----7A--7A--7---7-------<---<-<-A---7---7-F<--7--7A7-A---7-7-----7--777----7--7-7----7----7 | RX:Z:TTGGGCGGTG | QX:Z:JAJJJJJJ<J | CR:Z:TNTGAGAAGACTGGGT | CY:Z:A!AFFJFJJJAAJJJJ |
| 1 | 4 | * | 0 | 255 | * | * | 0 | 98 | NCTGGCATTGCCCACAACGACCACTATGTCAAGCTCATTTCCTGGTCAGACAACGAAAATGACGACAGCAACAGGATGGAGGACGTCGTGGCCCACAA | !7-7-7-7A------7AAJJ--77--77--FJFF-7<7-<JA7<-7---7-<FAAFA--77-<-<J--7AAAFF<--7--<-777J--7-7-77AF<- | on:Z:HF1_25773:3:1101:1306:1367@1:N:0:NCCGTATG | op:Z:!7-7-7-7A------7AAJJ--77--77--FJFF-7<7-<JA7<-7---7-<FAAFA--77-<-<J--7AAAFF<--7--<-777J--7-7-77AF<- | RX:Z:TATACGCAGT | QX:Z:A<AA-FJ-7A | CR:Z:ANGGGAGGTTCGGGCT | CY:Z:<!AAFJJJFF<JJJJF |
| 1 | 4 | * | 0 | 255 | * | * | 0 | 98 | NTTAGCTGGTCAGAATGGTGCACACCAGTGGTCACAGCTAGCCGAGAGGCTGAGATGAGAGGATCGCTGGAGCGTAGGAGGGTGATGGTGCAGAGAGC | !7--7A<----7--<----77FAJ<F-<-7A<7-7A-77A-777-7-AAF-77A-AF-77A77-A-A-777-7----7--A-A<--7<-777A-7A-- | on:Z:HF1_25773:4:1101:29701:1332@1:N:0:NCTCAGTG | op:Z:!7--7A<----7--<----77FAJ<F-<-7A<7-7A-77A-777-7-AAF-77A-AF-77A77-A-A-777-7----7--A-A<--7<-777A-7A-- | RX:Z:GATTCCGTTG | QX:Z:JJFJ<7JFJ- | CR:Z:ANGGGTCGTGACGCCT | CY:Z:A!A<--FAF7AF<JFJ |
I don't know what's going wrong here, but since I am using a published dataset that should contain plenty of counts, I assume I am making some sort of mistake.
Cheers, Charlotte