ViennaRNA / forgi

An RNA manipulation library.
GNU General Public License v3.0
52 stars 31 forks source link

Fix `get_position_in_element` #16

Closed Bernhard10 closed 7 years ago

Bernhard10 commented 8 years ago

From the tests:

    db = '(((((...))....)))'
    #     12345678901234567
    bg = fgb.BulgeGraph()
    bg.from_dotbracket(db)
    p,l = bg.get_position_in_element(1)
    self.assertEqual(p, 0)
    self.assertEqual(l, 2)
    p,l = bg.get_position_in_element(12)
    self.assertEqual(p, 2)
    self.assertEqual(l, 5)

But shouldn't it be:

    db = '(((((...))....)))'
    #     12345678901234567
    bg = fgb.BulgeGraph()
    bg.from_dotbracket(db)
    p,l = bg.get_position_in_element(1)
    self.assertEqual(p, 0)
    self.assertEqual(l, 3) #<---- The first stem has 3 closing parenthesis
    p,l = bg.get_position_in_element(12)
    self.assertEqual(p, 2)
    self.assertEqual(l, 4) #<---- The interior loop has only 4 nucleotides

assuming that the second return value is the length of the element.

pkerpedjiev commented 8 years ago

Yes, you're right. Those are bugs.

Bernhard10 commented 7 years ago

get_position_in_element is used by visualize_cg to localize a nucleotide along the axis of the cg-element. This is related (but different) to ernwin's aminor.

The current version returns (0-based index, len-1) for stems. It is intended to be used as a progress along the stem's vector. That makes sence: p/l==1 for the last nt of a stem and p/l==0 for the first, which corresponds to cg.coords[elem][1] and cg.coords[elem][0]

For loops, the first nt cannot lie directly on the cg-elements coords, because that is where the stem's nt is! Thus we return (1-based index, len+1). So for the first nt p/l>0 and for the last p/l<1, which is what we want.

So it was not a bug after all!