russelllab / kinaseResistance

A method to predict activating, deactivating and resistance mutations in kinases
http://activark.russelllab.org
GNU General Public License v3.0

Highlight domain regions on the alignment #52

Closed gurdeep330 closed 1 year ago

gurdeep330 commented 1 year ago

@tschmenger

We have to highlight the domain regions on the alignment, remember? Like the "features" thing you used to show before. How would you like the input dictionary to be structured? I will prepare it that way.

tschmenger commented 1 year ago

@gurdeep330 I think the last time we tried to do this we used a file like this:

Create_SVG/Version_K(inases)/Features_Infofile.txt

The dictionary then looked like this: {'p-loop': ['70', '78'], 'activation-loop': ['208', '234'], 'HrD-motif': ['188', '190']}

This was many weeks ago, so I am not certain, but I think the positions here refer to alignment positions. Would that work? Otherwise we could make a dictionary similar to the one that holds the positional information.

gurdeep330 commented 1 year ago

@tschmenger

Here is the file: data/ss.tsv. The first column contains the name of the region, the second column contains the start-end of the corresponding region in the Pkinase alignment from Pfam, and the third column contains the start-end of the corresponding region in OUR alignment.

The 1st and 3rd columns are the ones you need. I would HIGHLY recommend that you read this file on the fly, because it may change several times in the next few days. If you read directly from the file, I only have to update it and the changes will automatically be reflected in the alignment.
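A minimal sketch of reading the region file at draw time, assuming a tab-separated file whose third column holds `start-end` with integer positions. The function name `read_regions` and the sample rows are hypothetical (the positions in our alignment are borrowed from the dictionary quoted earlier in this thread, the Pfam positions are made up), and the real data/ss.tsv may have a header or a different layout.

```python
import csv
import io


def read_regions(handle):
    """Return {region_name: (start, end)} from the 1st and 3rd columns."""
    regions = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # blank or incomplete line
        name = row[0].strip()
        start, sep, end = row[2].strip().partition("-")
        if not sep:
            continue  # no "start-end" pair in the third column
        try:
            regions[name] = (int(start), int(end))
        except ValueError:
            continue  # e.g. a header row such as "name / pfam / ours"
    return regions


# Made-up sample rows standing in for the real file.
sample = "p-loop\t10-18\t70-78\nHrD-motif\t120-122\t188-190\n"
regions = read_regions(io.StringIO(sample))
print(regions)
```

Calling something like `read_regions(open("data/ss.tsv", encoding="utf-8", newline=""))` each time the SVG is generated, rather than caching the result, would pick up any edits to the file automatically.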

gurdeep330 commented 1 year ago

@tschmenger

PS: THIS CAN WAIT UNTIL YOU ARE BACK

Thanks! It works fine. But..... there are overlaps :-) Check this out. This is because the DFG motif lies within the A-loop, so there are 2 annotations for the same region. Could you fix this? There may be more overlaps elsewhere, e.g. here, so maybe a general approach would be better?
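One possible general approach, sketched under assumptions: treat each region as an inclusive interval and greedily assign it to the first "lane" (row of annotation) where it does not collide with anything already placed. This is not necessarily what the create_svg script does; `assign_lanes` is a hypothetical helper and the DFG-motif positions below are invented for illustration.

```python
def assign_lanes(regions):
    """Map each region name to a lane index so that no two regions
    in the same lane share an alignment position (inclusive ends)."""
    lane_ends = []  # last occupied position in each lane
    placement = {}
    # Left to right; among regions with the same start, longer ones first.
    ordered = sorted(regions.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    for name, (start, end) in ordered:
        for lane, last_end in enumerate(lane_ends):
            if start > last_end:  # fits after everything in this lane
                lane_ends[lane] = end
                placement[name] = lane
                break
        else:
            lane_ends.append(end)  # collides everywhere: open a new lane
            placement[name] = len(lane_ends) - 1
    return placement


regions = {
    "p-loop": (70, 78),
    "activation-loop": (208, 234),
    "DFG-motif": (208, 210),  # hypothetical positions inside the A-loop
}
placement = assign_lanes(regions)
print(placement)
```

Each lane could then be drawn at its own vertical offset above the alignment, so a nested motif such as DFG appears on a second row instead of on top of the A-loop label.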

Another note: if you look at the ss.tsv file, there are now Greek letters in addition to alphanumeric characters. Python (sometimes) has trouble reading them, because the default encoding depends on the platform. You can get around that by specifying the encoding when opening the file. I already did that for the activark script and also in the latest create_svg script. Just wanted to give you a heads-up...
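A small illustration of the encoding point, assuming the file is saved as UTF-8 (which is an assumption, not something stated here). The region names and positions are invented; the example writes a temporary file and reads it back with an explicit `encoding` argument so that the result does not depend on the platform default.

```python
import os
import tempfile

# Made-up rows with Greek letters in the region names.
sample = "αC-helix\t30-45\t95-110\nβ1-strand\t1-8\t60-67\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ss.tsv")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(sample)
    # Explicit encoding: without it, open() falls back to the locale's
    # preferred encoding, which may not be UTF-8 on every machine.
    with open(path, encoding="utf-8") as fh:
        names = [line.split("\t")[0] for line in fh]

print(names)
```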

Thanks

tschmenger commented 1 year ago

Hey @gurdeep330, please see the create_svg script and the images corresponding to your examples above.

MAP2K1 RAF1

These should work better?