Open raybueno opened 5 years ago
@raybueno:
Without a reproducible example, I'm not sure I understand your issue, but if I do, my guess is as follows: you have a fragment length distribution that makes it (numerically) impossible to ever see the 3' end of a transcript with a read coming from the 5' end of any fragment:
>##...#>    transcript
--...-      (3' most, shortest possible) fragment
>==...=>    (5', longest possible) read from that fragment
xx...x      'unreachable' part of the transcript

>#######################################################################>
                                     ------------------------------------
                                     >=========>
                                                xxxxxxxxxxxxxxxxxxxxxxxxx
Also, did you read the paper? Did you read the documentation? Which parts of those that are relevant to your issue are not clear enough? Maybe you could help improve the documentation once you solve your problem?
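To make the guess above concrete, here is a minimal Python sketch of the geometry in the diagram. It is a simplified model, not kallisto's implementation: the function names, the 0-based coordinates, the assumption that a single-end read marks the 5' start of its fragment, and all the numbers are hypothetical and chosen only for illustration.

```python
def unreachable_tail(shortest_fragment: int, read_len: int) -> int:
    """Number of bases at the 3' end of a transcript that no read can cover,
    assuming each read starts at the 5' end of its fragment and every
    fragment must lie entirely inside the transcript."""
    return max(shortest_fragment - read_len, 0)


def fragment_overhangs(read_start: int, fragment_len: int, transcript_len: int) -> bool:
    """True if a fragment that starts at read_start (0-based) and has the
    given length would extend past the 3' end of the transcript."""
    return read_start + fragment_len > transcript_len


# Hypothetical values: 1000 bp transcript, 200 bp fragments, 75 bp reads.
transcript_len, fragment_len, read_len = 1000, 200, 75

print(unreachable_tail(fragment_len, read_len))                 # 125
print(fragment_overhangs(800, fragment_len, transcript_len))    # False
print(fragment_overhangs(900, fragment_len, transcript_len))    # True
```

Under this model, a read whose implied fragment would run off the end of the transcript is the kind of read that would be dropped by default, which is consistent with the behavior reported in this issue.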
@mschilli87:
Thank you for the reply, and I apologize for the lack of clarity. Here is an example:
Transcript:
GeneA1 ATCAGTCTCCGTGTGTGGATTTATGTCTACAGAGAGCATGGACGTTTTATGCTCACGTCACAACCTCCCGTTCTT
Read: @ReadGeneA1:1:75_1 ATCAGTCTCCGTGTGTGGATTTATGTCTACAGAGAGCATGGACGTTTTATGCTCACGTCACAACCTCCCGTTCTT
In this example, we have a 75 bp transcript and a 75 bp read. As you can see, the read should align to the reference because it is the same sequence. However, when running kallisto quant without --single-overhang, the read does not pseudoalign. Interestingly, if you add the --single-overhang option, the read does pseudoalign. This can occur with a reference transcript of any length. For example, with a 1500 bp transcript, a 75 bp read that starts at position 1425 and ends at position 1500 will not pseudoalign unless the --single-overhang option is used.
I have read both the paper and the manual. I couldn't find any reason why kallisto is unable to pseudoalign reads that are positioned at the end of a transcript. Thank you for your help.
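For anyone who wants to reproduce this, the example above could be turned into input files with a short Python script along these lines. This is only a sketch: the file names, the placeholder quality string, and the `-l 200 -s 20` fragment length settings are assumptions (the thread does not say which values were used), and the script only prints the kallisto commands rather than running them.

```python
from pathlib import Path

# Sequence copied from the example above; transcript and read are identical.
SEQ = "ATCAGTCTCCGTGTGTGGATTTATGTCTACAGAGAGCATGGACGTTTTATGCTCACGTCACAACCTCCCGTTCTT"


def write_repro(workdir: str = "repro") -> dict:
    """Write a one-transcript FASTA and a one-read FASTQ, and return the
    kallisto command lines for both runs (with and without --single-overhang)."""
    out = Path(workdir)
    out.mkdir(parents=True, exist_ok=True)
    fasta = out / "transcripts.fa"
    fastq = out / "reads.fq"
    index = out / "GeneA1.idx"

    fasta.write_text(f">GeneA1\n{SEQ}\n")
    # Placeholder qualities: the thread does not show the quality line.
    fastq.write_text(f"@ReadGeneA1:1:75_1\n{SEQ}\n+\n{'I' * len(SEQ)}\n")

    # Assumed single-end settings; -l and -s are required with --single.
    quant = ["kallisto", "quant", "-i", str(index), "--single", "-l", "200", "-s", "20"]
    return {
        "index": ["kallisto", "index", "-i", str(index), str(fasta)],
        "default": quant + ["-o", str(out / "default"), str(fastq)],
        "overhang": quant + ["--single-overhang", "-o", str(out / "overhang"), str(fastq)],
    }


commands = write_repro()
for name, cmd in commands.items():
    print(f"{name}: {' '.join(cmd)}")
```

Running the three printed commands in order and comparing the two output directories should show whether the read is counted only in the `--single-overhang` run.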
Hello,
When I try to align a read to the end of a gene, kallisto is unable to pseudoalign it. However, when I use the --single-overhang option, pseudoalignment occurs.
My question is: why can't kallisto pseudoalign sequences that come from the end of a gene, and how does the --single-overhang option work?