Closed DTRademaker closed 1 year ago
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
Hi Cunliang, I do not think we need an order parameter, right? I will accept this change.
Hi all, I’m not sure I would change the way get returns the values. The behavior described here was precisely what was intended. Since the issue is only in structure similarity I would add the sorting option there and not in get.
Agree with Nico. That's also why I suggested adding an option to get
itself.
Adding the order
argument to get is a solution but it makes get more complicated ... I would really simply fix that in StructureSimilarity and not touch get
. The get
method is central to all the data pipelines we have created (iscore, deeprank, deeprank GNN) so modifying it is not a trivial decision. Adding an order option would also make get do two things (SQL query + sorting) and that's 1 thing too many :)
Hi Nico, there seems no other way to fix this bug without changing the get function. We cannot fix StructureSimilarity downstream...
Hi Li, I'm not sure what the issue is, but the ordering can surely be done outside of the get function by simply getting the chainID, resSeq, name, x,y,z and ordering the resulting array according to the chain,res,name values. Or am i missing something ?
In any case changing the default behavior of get may break the data processing pipeline of all the codes that rely on it (iscore, deeprank, ... ). So I would be careful about it :)
Hi Li, I'm not sure what the issue is, but the ordering can surely be done outside of the get function by simply getting the chainID, resSeq, name, x,y,z and ordering the resulting array according to the chain,res,name values. Or am i missing something ?
In any case changing the default behavior of get may break the data processing pipeline of all the codes that rely on it (iscore, deeprank, ... ). So I would be careful about it :)
The 'get' function is called 12 times in the 'StructureSimilarity' script for retrieving xyz coordinates. These should all be adjusted then. Yes of course we can do it by retrieving 'chainID, resSeq, name, x,y,z', sorting it and extract the xyz and that times 12. But that would miss the point of having the get function in the first place where you specify what you need. Especially since this could also be done, as implemented by my suggestion, directly in the sql command of the 'get' function. Which I might add is computationally also faster. I understand the concern that it 'might' impact other software. But I could also argue that it 'might' actually improve the other software since the atom-order is currently completely dependent on the pdb-atom-order which can sometimes differ from pdb to pdb. And to be honest I can not imagine any scenario where having the residue atoms in a consistent order would disturb the other code... but that could of course also be my lack of imagination. The code I am suggesting passed all unit-tests, and should therefore also work fine for the other software right?
I didn't wrote the other code, so I can not be sure, but I would suggest to use the extra argument for the 'get' function if you really want to play it safe. But I think the inconsistency problem is a more general problem that should be solved by hardcoding it in the get function, resulting that the rest of the code should be tested and adapted if needed.
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
Hey @DTRademaker @LilySnow @CunliangGeng, should we merge that ?
Hey @DTRademaker @LilySnow @CunliangGeng, should we merge that ?
Fine with me :)
@LilySnow if you also agree can you merge the PR ?
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
Add sorting of the values to the 'get' function. This to fix a bug that sometimes occurs when comparing the atoms between multiple pdbs ( e.g. when using StructureSimilarity script). The bug is caused by pdbs having not the same atom-name (e.g. CA C CB O N...) order.