ViennaRNA / forgi

An RNA manipulation library.
GNU General Public License v3.0

GraphConstructionError: Cannot set sequence. Illegal character in string when setting BulgeGraph sequence #31

Status: Closed (dmnfarrell closed this issue 5 years ago)

dmnfarrell commented 6 years ago

Hi

I formerly used the following code and it worked fine:

import forgi.graph.bulge_graph as cgb
seq='TTGTCAGTTAAAAAACCACCATGCGGGGAGTTCCCTGATGGCCTGGTGGTTAGGATTCGGTGCTCTCCCTGCCATGGC'
struct = '..................((((((((((((..(.(((((((((....)))))....)))).)..))))))).))))).'
bg = cgb.BulgeGraph()
bg.from_dotbracket(struct)
bg.seq = seq

However, I now get this error when I try to set the sequence:

----> 7 bg.seq = seq

/usr/local/lib/python2.7/dist-packages/forgi/graph/bulge_graph.pyc in seq(self, value)
    499             yield self.seq_ids[i]
    500 
--> 501     def create_bulge_graph(self, stems, bulges):
    502         '''
    503         Find out which stems connect to which bulges

GraphConstructionError: Cannot set sequence. Illegal character in string 'TTGTCAGTTAAAAAACCACCATGCGGGGAGTTCCCTGATGGCCTGGTGGTTAGGATTCGGTGCTCTCCCTGCCATGGC'

This occurs regardless of whether the struct is provided, and it is the same in Python 2 and 3. Is there another way to set the sequence?

dmnfarrell commented 6 years ago

Note that this does work in version 0.20

Bernhard10 commented 6 years ago

I guess this is due to the use of T instead of U (we model RNA). I will probably change this to allow T again in version 2.0.

dmnfarrell commented 6 years ago

Ok thanks. I reverted to version 0.20 to get the previous behaviour so it's not urgent.

joelwebb commented 5 years ago

Running Ubuntu Bionic on Windows with ViennaRNA 2.4.11, I am running into the same issue. Does 2.4.11 also allow use of T?

Bernhard10 commented 5 years ago

@joelwebb For questions about the Vienna RNA package (ViennaRNA 2.4.11), please ask at https://github.com/ViennaRNA/ViennaRNA. But you can always try to replace the Ts with Us.
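
The T-to-U replacement suggested above can be sketched in plain Python. The helper name dna_to_rna is hypothetical and not part of forgi or ViennaRNA. Whether the converted string is then accepted by the affected forgi versions is not confirmed in this thread; the sketch only shows the string conversion itself.

```python
def dna_to_rna(seq):
    """Return seq with every T/t replaced by U/u; all other characters are kept."""
    # Hypothetical helper for illustration; it is not a forgi or ViennaRNA function.
    return seq.replace("T", "U").replace("t", "u")


seq = 'TTGTCAGTTAAAAAACCACCATGCGGGGAGTTCCCTGATGGCCTGGTGGTTAGGATTCGGTGCTCTCCCTGCCATGGC'
rna_seq = dna_to_rna(seq)

# The conversion is one-to-one, so the sequence still lines up with the
# dot-bracket structure string position by position.
print(rna_seq)
```

Because the length is unchanged, rna_seq can be used wherever seq was paired with the struct string in the snippets in this thread.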

As for the forgi 2.0 library, see below

Bernhard10 commented 5 years ago

On forgi 2.0, arbitrary characters are allowed for the sequence again. BulgeGraph creation has changed, so the code by @dmnfarrell would look like this:

import forgi.graph.bulge_graph as fgb
seq=     'TTGTCAGTTAAAAAACCACCATGCGGGGAGTTCCCTGATGGCCTGGTGGTTAGGATTCGGTGCTCTCCCTGCCATGGC'
struct = '..................((((((((((((..(.(((((((((....)))))....)))).)..))))))).))))).'
bg = fgb.BulgeGraph.from_dotbracket(struct, seq)

Bernhard10 commented 5 years ago

The Sequence class has an is_valid function that checks for 'AUCG', but this is not called by default.
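
Since that check is not run by default, a stand-alone check against the same 'AUCG' alphabet could look like the sketch below. This is not forgi's is_valid implementation. The function name illegal_characters and the case-sensitive comparison are assumptions made for illustration.

```python
RNA_ALPHABET = frozenset("AUCG")  # the alphabet named above; uppercase only (assumption)


def illegal_characters(seq, alphabet=RNA_ALPHABET):
    """Return a sorted list of the distinct characters in seq that are not in alphabet."""
    # Hypothetical helper for illustration; it does not call into forgi.
    return sorted(set(seq) - alphabet)


seq = 'TTGTCAGTTAAAAAACCACCATGCGGGGAGTTCCCTGATGGCCTGGTGGTTAGGATTCGGTGCTCTCCCTGCCATGGC'
bad = illegal_characters(seq)
if bad:
    print("Characters outside AUCG:", bad)
```

Returning the offending characters rather than a plain boolean makes it easier to see why a sequence such as the one in this issue is rejected.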