interpret the results - Githubissues

Hello,

Thanks so much for developing this tool. I am disambiguating a PDX RNA sample, so I first ran STAR against both human and mouse. For human I used the latest T2T reference, and for mouse I used the mm39 reference. This gave me two BAM files, and the stats STAR reported for the two alignments are below:

human:
                          Number of input reads |   50096054
                      Average input read length |   202
                                    UNIQUE READS:
                   Uniquely mapped reads number |   46423380
                        Uniquely mapped reads % |   92.67%
                          Average mapped length |   201.13

mouse:

                          Number of input reads |   50096054
                      Average input read length |   202
                                    UNIQUE READS:
                   Uniquely mapped reads number |   7249617
                        Uniquely mapped reads % |   14.47%
                          Average mapped length |   183.05

Then I ran disambiguate, and the summary is as follows:

sample                   unique species A pairs  unique species B pairs  ambiguous pairs
PPTC-COG-N-471x-R-human  50042123                48215230                7631

Does that mean an additional ~40M reads are being uniquely assigned to mouse, given that only ~7M were uniquely mapped to mouse by STAR in the first place? If that's the case, does it suggest there are nearly 100M read pairs in total (the three columns sum to about 98M)? But isn't the total number of read pairs about 50M according to the STAR output?
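
To make the arithmetic behind this question explicit, here is a small Python sketch. The numbers are copied from the STAR logs and the disambiguate summary above. The variable names are my own, and it assumes that species A is human and species B is mouse, which the summary itself does not state.

```python
# Numbers copied from the STAR logs and the disambiguate summary above.
# Assumption: "species A" is human and "species B" is mouse.
star_input_reads = 50_096_054    # STAR "Number of input reads" (same for both runs)
star_mouse_unique = 7_249_617    # STAR "Uniquely mapped reads number" for mouse

unique_a_pairs = 50_042_123      # disambiguate "unique species A pairs"
unique_b_pairs = 48_215_230      # disambiguate "unique species B pairs"
ambiguous_pairs = 7_631          # disambiguate "ambiguous pairs"

# Sum of the three disambiguate columns.
total_reported = unique_a_pairs + unique_b_pairs + ambiguous_pairs

# How far species B exceeds what STAR uniquely mapped to mouse.
extra_species_b = unique_b_pairs - star_mouse_unique

# How far the disambiguate total exceeds the STAR input count.
excess_over_input = total_reported - star_input_reads

print(f"total reported by disambiguate: {total_reported:,}")
print(f"species B minus STAR mouse unique: {extra_species_b:,}")
print(f"excess over STAR input reads: {excess_over_input:,}")
```

So the three columns sum to roughly twice the number of input reads STAR reported, which is the mismatch I am trying to understand.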

Thanks a lot in advance, Frank

AstraZeneca-NGS / disambiguate

interpret the results #20