Open TobiasPol opened 1 year ago
Hi @TobiasPol
Any luck? I would be interested in this as well.
Hey @celalp, not yet unfortunately. Do you have any idea for the solution or do you know if it is even possible?
I'm not sure we can do this with certainty. I have a tiny PDB that I've been using to understand the code in greater depth, and it seems the reduction in the number of vertices from the original MSMS output happens in the fixmesh function. There is a whole bunch of joining and collapsing happening in there. This seems to discard the original vertex information and create new vertices and faces, and nothing tracks which old vertex maps to which new one, neither in the code nor on the pymesh side (pymesh doesn't care).
There are a couple of things I'm looking into now:
1) In compute_charges we use the 4 closest charges from the original vertices, found with a k-d tree. This seems a bit backwards: we had this information, we lost it, and now we approximate it because we actually need it. Can we assume that the 4 closest vertices are the actual original vertices from before fixmesh? I don't think so, at least not all the time (see below).
2) Doing something similar to compute_charges and getting the coords myself. If I'm not mistaken this is done in Euclidean space, not geodesic, so the same issues as above apply. There is also no guarantee that we will get the correct number of vertices; for charges it may not matter because it's a simplification anyway.
3) I looked into whether I can map some of the vertices back to the original mesh, and the answer is kind of no, again because of what fixmesh does.
4) There are some functions to get patches that are not used in the calculations and are there either for PyMOL or for debugging/sanity checks. If we can get the coords of the patch, we can get the closest atoms and therefore the amino acids. This is more or less the same thing as 2 and is not exact. Also, the number of vertices and faces in the old and new meshes are not the same. For visualization this does not matter; for figuring out the amino acids it might, I don't really know. It probably depends on how steep the curvature of the surface is. As long as the geodesic distance is close to the Euclidean one, the result should be pretty much the same.
I think I'm going to go with 2 and just say: hey, these are the closest ones. We are dealing with atom coordinates, and the Cα atoms of different amino acids are pretty far from one another, so I'm hoping the result will be right or very close to right.
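For reference, option 2 could be sketched roughly like this. It is a minimal, standard-library-only illustration and not MaSIF code: the `Atom` record, `nearest_atom`, `patch_residues`, and the toy coordinates are all hypothetical, and the brute-force search is only there for clarity (on a real mesh a k-d tree, like the one compute_charges already uses, would be the sensible choice).

```python
import math
from collections import namedtuple

# Hypothetical record type; in practice these fields would come from a PDB parser.
Atom = namedtuple("Atom", ["chain", "resnum", "resname", "coord"])


def nearest_atom(vertex, atoms):
    """Return the atom whose coordinates are closest (Euclidean) to the vertex."""
    return min(atoms, key=lambda a: math.dist(vertex, a.coord))


def patch_residues(patch_vertices, atoms, cutoff=None):
    """Map each patch vertex to its nearest atom and collect the residues hit.

    cutoff (in angstroms) optionally drops vertices whose nearest atom is
    farther away than that distance.
    """
    residues = set()
    for v in patch_vertices:
        atom = nearest_atom(v, atoms)
        if cutoff is None or math.dist(v, atom.coord) <= cutoff:
            residues.add((atom.chain, atom.resnum, atom.resname))
    return residues


# Toy data, made up for illustration only.
atoms = [
    Atom("A", 34, "GLY", (0.0, 0.0, 0.0)),
    Atom("A", 35, "SER", (3.8, 0.0, 0.0)),
    Atom("A", 36, "LEU", (7.6, 0.0, 0.0)),
]
patch = [(0.5, 1.0, 0.0), (3.5, 1.2, 0.3), (4.0, -1.0, 0.0)]
print(sorted(patch_residues(patch, atoms)))
```

The distances here are Euclidean, so the caveat from 2 and 4 still applies: on strongly curved parts of the surface the nearest atom in space is not guaranteed to be the one a geodesic walk would reach.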
To clarify, this needs to be done on the vertex coordinates before you convert them to polar. I'm thinking about writing a couple of functions that do the above calculations and save the results in a class of sorts. For I/O, I think an HDF5 file per protein/chain, while a bit expensive, might be the simplest to implement.
This is my interpretation of the code, and considering the sparse commenting, don't take this as "legal advice". If you have any ideas I would love to hear them.
Honestly, all I want to do at the moment is compare protein surfaces: I just want to see if patch x from protein a is similar to patch y from protein b, and then get the amino acids involved. When I first started I was so sure that I would find something simple enough, but I guess the coordinate and rotation invariance requirement makes it challenging.
I'm hesitant to use learned features because I will not be working with human/mouse/even animal proteins for this specific project.
Sorry for the wall of text and rant.
Thank you for your detailed answer! That helps me a lot. I would prefer to have an output containing the two comparative patches with the coordinates and the respective residue numbers. Something like this:

| query | match | match_cen | match_patch | query_cen | query_patch |
| -- | -- | -- | -- | -- | -- |
| 7mjr_A | 7ml9_A_152 | [-7.37 76.34 82.97] | {'34', '48', '47', '35', '49', '37', '36', '51'} | [ 9.6537145 95.1163015 74.047997 ] | {'504', '512', '518', '519', '520', '503', '516', '517', '506', '505'} |
I am currently working with MaSIF-search and am looking for a way to extract the residue numbers of a patch to validate the results. Does anyone know if this is possible?