Closed tschmenger closed 1 year ago
Let's also add a static conservation barplot (it is currently dynamic, based on the shown sequences) to this.
I managed to do it for both the conservation part and the heatmap. Re: conservation, I inverted what I had before, meaning that the boxes now go from bottom to top, which I think makes it easier to immediately understand what it means. It's a minor thing that we could also easily revert.
Attached you will find two files where I checked MAP2K3 A84T and, based on this alignment, CDK8 K41, which is at the same alignment position. If you look carefully at the heatmap and the conservation, you will notice that the respective positions show the same values in both plots; the only thing that differs is the set of proteins shown, which affects some of the columns shown.
Perhaps @gurdeep330 you can also have a look if you have time and close the issue if you think it's okay ;)
Hey, you introduced a new parameter, overallconservation, for the main function, and that is why it throws an error: I still call the old main, which didn't have this parameter.
I see that you get the value of overallconservation from the command line. Can you share that file with me as well? Or just read it directly within the main function, so that the function's parameters are the same as before.
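A minimal sketch of the second option, in case it helps. Everything here is an assumption made for illustration: the file format (one "alignment_position value" pair per line), the helper name load_conservation, the conservation_file parameter, and the stand-in main, which is not the real function and only returns the dictionary so the behaviour can be checked. The idea is that overallconservation becomes optional, so old callers that omit it keep working.

```python
# Hypothetical default path; the real file name and location may differ.
DEFAULT_CONSERVATION_FILE = "Conservation_Dictionary_20230504.txt"


def load_conservation(path):
    """Read 'alignment_position value' lines into a dict (assumed format)."""
    conservation = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            position, value = line.split()
            conservation[int(position)] = float(value)
    return conservation


def main(sequence_id, position, *other_args,
         overallconservation=None,
         conservation_file=DEFAULT_CONSERVATION_FILE):
    """Stand-in for the real main; only the optional-parameter pattern matters."""
    # Old callers do not pass overallconservation, so load it here instead.
    if overallconservation is None:
        overallconservation = load_conservation(conservation_file)
    # ... the real drawing code would go here ...
    return overallconservation
```

With this shape, a caller that already has the dictionary can still pass it in, and a caller that does not simply leaves it out.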
Right! I added the dictionary to Vlatest.
Call it like so: python3 create_svg_20230505_kinases_GS.py P46734 84 30 11 humanKinasesTrimmed.clustal Mutational_Infofile_Kinases.txt Conservation_Dictionary_20230504.txt
Thanks @tschmenger, it works: http://activark.russelllab.org/ The labels still need some placement adjustments.
I have added a sample dic file for you here: Create_SVG/Vlatest/sample_dic_mutation_info.txt
Hopefully, this should give you an idea....
Okay. I saw that there is an error with BRAF V600E. Perhaps it is better to go back to the previous version of the script while I keep figuring things out.
OK, just make sure you see the same error when you run it on your system via the command line. Sometimes errors come up because of something I wrote in my scripts.
I think with this week's updates we can close this issue.
@gurdeep330 told me that Rob "wants the ADR heatmap to NOT change i.e. it should be done for the entire alignment (>400 kinases) even if the user is looking at top 10 or 20, say"
The most efficient solution I could think of immediately would be to do the counts for the 'heatmap' separately, once, and read them from a dictionary (keyed by alignment positions rather than protein positions, I think), rather than going through each letter of the whole alignment to sum up the values each time we recalculate/redraw the alignment.
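A rough sketch of what I mean, not the actual script. It assumes the alignment is available as {sequence_name: gapped_sequence} and the mutation info as {sequence_name: {protein_position: label}}; the function name column_counts, the labels "A" and "D", and the toy data are made up for illustration. Protein positions are converted to alignment columns once, and the counts are stored per column.

```python
from collections import Counter, defaultdict


def column_counts(alignment, annotations):
    """Map each alignment column (1-based) to a Counter of annotation labels.

    alignment:   {sequence_name: gapped_sequence}, all of equal length
    annotations: {sequence_name: {protein_position (1-based): label}}
    """
    counts = defaultdict(Counter)
    for name, gapped in alignment.items():
        labels = annotations.get(name, {})
        protein_pos = 0
        for column, residue in enumerate(gapped, start=1):
            if residue == "-":
                continue  # gaps do not advance the protein position
            protein_pos += 1
            if protein_pos in labels:
                counts[column][labels[protein_pos]] += 1
    return dict(counts)


# Toy data, invented for illustration only.
alignment = {
    "kinase_1": "MK-TA",
    "kinase_2": "M-RTA",
    "kinase_3": "MKRT-",
}
annotations = {
    "kinase_1": {3: "A"},  # T at protein position 3 sits in column 4
    "kinase_2": {3: "D"},  # T at protein position 3 sits in column 4
    "kinase_3": {2: "A"},  # K at protein position 2 sits in column 2
}

# Computed once over the full alignment, then reused on every redraw.
HEATMAP_COUNTS = column_counts(alignment, annotations)
```

When only the top 10 or 20 sequences are shown, the drawing code would look up HEATMAP_COUNTS for each column instead of recounting, so the heatmap stays the same regardless of which sequences are displayed.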