Open WenyuLiang opened 11 months ago
Hi Wenyu,
I cannot check your code. The saturation curves require downsampling of the reads.
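To illustrate what downsampling means here, below is a minimal sketch rather than STARsolo's or Cell Ranger's actual implementation. It assumes each read has already been reduced to one key string (for example CB, UB and GX concatenated) and that saturation is defined as 1 - (unique keys / reads), which is how Cell Ranger describes it. The names `SaturationPoint` and `saturation_curve` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <unordered_set>
#include <vector>

// One point on the curve (hypothetical helper type).
struct SaturationPoint {
    std::size_t reads;   // number of reads in the subsample
    double saturation;   // 1 - unique keys / reads
};

// keys:  one string per read, assumed to be pre-filtered to valid reads.
// steps: number of evenly spaced depths to report.
std::vector<SaturationPoint> saturation_curve(std::vector<std::string> keys,
                                              std::size_t steps,
                                              unsigned seed = 1) {
    std::vector<SaturationPoint> curve;
    if (keys.empty() || steps == 0) return curve;

    // Shuffle so that every prefix is a random subsample of the reads.
    // Without this, a coordinate-sorted BAM would give prefixes that
    // cover only part of the genome.
    std::mt19937 rng(seed);
    std::shuffle(keys.begin(), keys.end(), rng);

    std::unordered_set<std::string> seen;
    std::size_t next_step = 1;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        seen.insert(keys[i]);
        const std::size_t n = i + 1;
        bool reached = false;
        // Advance past every checkpoint that this depth has reached.
        while (next_step <= steps && n >= keys.size() * next_step / steps) {
            reached = true;
            ++next_step;
        }
        if (reached) {
            curve.push_back({n, 1.0 - static_cast<double>(seen.size()) / n});
        }
    }
    return curve;
}
```

Calling `saturation_curve(keys, 10)` would return up to ten points at roughly 10%, 20%, ... 100% of the reads. The last point always uses all reads, so it can be compared with the saturation reported for the full data set.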
Thanks for your response. I'm wondering whether yessubWLmatch_UniqueFeature corresponds to reads with valid CB, UB, and GX tags in the BAM file?
Yes, that's correct (CB/UB/GX).
That makes sense. However, I counted the reads in Aligned.sortedByCoord.out.bam with CB != "-" && UB != "-" && GX != "-", and the total differs from yessubWLmatch_UniqueFeature in Gene/Features.stats.
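For reference, the filter described above can be sketched as follows. This is only an illustration: `TaggedRead`, `ValidCounts`, `tag_is_set` and `count_valid` are hypothetical names, and the records are assumed to have already been extracted from the BAM file into plain strings. The sketch reports both the number of BAM records and the number of distinct read names that pass the filter, since a read with more than one alignment occupies more than one BAM record. Whether that explains the difference from Features.stats is an assumption, not something confirmed in this thread.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical plain-data view of one BAM record.
// An empty string means the tag is absent from the record.
struct TaggedRead {
    std::string name;        // read name (QNAME)
    std::string cb, ub, gx;  // values of the CB, UB and GX tags
};

struct ValidCounts {
    std::size_t records = 0;         // BAM records passing the filter
    std::size_t distinct_reads = 0;  // distinct read names among them
};

// A tag counts as set when it is present and is not the "-" placeholder.
bool tag_is_set(const std::string& value) {
    return !value.empty() && value != "-";
}

ValidCounts count_valid(const std::vector<TaggedRead>& reads) {
    ValidCounts out;
    std::unordered_set<std::string> names;
    for (const auto& r : reads) {
        if (!(tag_is_set(r.cb) && tag_is_set(r.ub) && tag_is_set(r.gx))) continue;
        ++out.records;
        names.insert(r.name);
    }
    out.distinct_reads = names.size();
    return out;
}
```

If `records` and `distinct_reads` differ, the per-record count includes reads that appear on several BAM lines.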
Hi WenyuLiang, I have the same confusion when plotting saturation. Did you find a good answer to this question? Thank you very much.
Hi Alex,
Thanks for the great tool! I'd like to plot sequencing saturation the way Cell Ranger does, and I wonder whether I can compute the cumulative saturation from the BAM file. Following https://github.com/alexdobin/STAR/issues/1887#issuecomment-1611354693, I wrote the code below. I use sS instead of CB and UB, and check the sM tag as shown, because CB and UB are missing from the BAM file that maps to the transcriptome. Can you see any bugs other than using sS instead of CB and UB? The result is quite different from the correct one.
Best, Wenyu
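    while (sam_read1(in, header, record) >= 0) {
        // bam_aux_get returns a pointer to the tag data, or NULL if the tag is absent
        uint8_t *sS = bam_aux_get(record, "sS");
        uint8_t *gn = bam_aux_get(record, "GN");
        uint8_t *sm = bam_aux_get(record, "sM");
        if (sm) {
            int sM = bam_aux2i(sm);
            if (sM < 0 || sM > 1) continue;  // skip reads whose sM value is outside 0..1
        }
        if (gn && *gn == 'Z') gene_name = bam_aux2Z(gn);
        else continue;
        if (gene_name == "-") continue;
        if (!sS || *sS != 'Z') continue;  // the sS string is needed to build the key
        // key = gene name + sS (barcode and UMI sequence)
        gene_umi_cb = gene_name + std::string(bam_aux2Z(sS));
        auto result = uSeqs.insert(gene_umi_cb);
        // result.second is true when this key is seen for the first time
        is_uniq_vec.emplace_back(result.second);
    }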