Open aya9aladdin opened 2 years ago
Thank you for opening this issue. Could you please:
Just to clarify. When you write:
Guessing is not an automatic process; by default, guessing is off unless the user chose to guess which property.
This is the ideal scenario but we cannot have that until version 3.0 as it is a breaking change. You do address this, but it is worth repeating.
Because no default guessing is a desirable, it would be good to have a way to request it. It would also be useful to prepare a transition with future versions of mdanalysis.
Different guessing methods will through warnings/error messages depending on the results it got, the output messages should precisely describe the universe updates with a warning about failed processes.
Where do you plan the output for successful guessing to be? Do we have anything comparable at the moment?
The mission of deciding how an attribute will be guessed will be carried out by the corresponding attribute guesser method related to each class. I think in this way the user doesn’t have to bother about how a guesser should work, and in the spirit of implementing a context-specific guesser, we have an abstraction power by having aware and smart guessers that know how exactly any attribute should be guessed for a specific environment (for example guessing mass for PDB is related to the element property, while for Martini is more related to bead type (atom type in MDAnalysis).
It might be really cool to have some dependency diagrams for this! Both for documentation, and for clarity in the project. For example, as you point out with PDB guessing, guessing the mass depends on knowing the element. In turn, guessing the element depends largely on knowing the atom name and residue name. Having this clarity gives us room for adding future guessers and features more easily. For example, with the PDB, it could be possible to guess the residue name from elements + bonds (which is likely out of scope for the current project). We could maybe look into decorators to make sure that the attribute dependencies are either already present or guessed first.
Edit: and the diagrams would look great in a blog post.
The mission of deciding how an attribute will be guessed will be carried out by the corresponding attribute guesser method related to each class. I think in this way the user doesn’t have to bother about how a guesser should work, and in the spirit of implementing a context-specific guesser, we have an abstraction power by having aware and smart guessers that know how exactly any attribute should be guessed for a specific environment (for example guessing mass for PDB is related to the element property, while for Martini is more related to bead type (atom type in MDAnalysis).
It might be really cool to have some dependency diagrams for this! Both for documentation, and for clarity in the project. For example, as you point out with PDB guessing, guessing the mass depends on knowing the element. In turn, guessing the element depends largely on knowing the atom name and residue name. Having this clarity gives us room for adding future guessers and features more easily. For example, with the PDB, it could be possible to guess the residue name from elements + bonds (which is likely out of scope for the current project). We could maybe look into decorators to make sure that the attribute dependencies are either already present or guessed first.
Edit: and the diagrams would look great in a blog post. the diagrams idea is really great, I will work on finding how to deliver it in the most descriptive way then adding it to my blog
Edit: and the diagrams would look great in a blog post. the diagrams idea is really great, I will work on finding how to deliver it in the most descriptive way then adding it to my blog
@aya9aladdin I use draw.io (now https://app.diagrams.net). It's a very nice tool with lots of cool options to design custom diagrams. I used it for general stuff and also used them for docs.
It might be really cool to have some dependency diagrams for this!
100%!
I always found the name "guesser" somewhat confusing, because some attributes are not technically guessed but they are read from a table once other (guessed) attributes are available. A diagram would clarify this confusion by showing what is "guessed" (i.e. inferred with some predefined rules, which might not be perfect) first, and what it is simply set based on the guessed attributes later on.
I think it would be great to color-code (or distinguish in other ways) which attributes are guessed and which are set on the diagram.
I have added a dependency diagrams to the issue for declaration, you can also see my project updates on my personal blog https://sites.google.com/pharma.asu.edu.eg/aya-gsoc/home
Thank you @aya9aladdin. Is your diagram reflecting what you plan on going, or is it what we currently have in the code?
Thank you @aya9aladdin. Is your diagram reflecting what you plan on going, or is it what we currently have in the code?
it's more of a reflection of what I'm going to build in both PDB and Martini guessers
Thank you @aya9aladdin. Is your diagram reflecting what you plan on going, or is it what we currently have in the code?
it's more of a reflection of what I'm going to build in both PDB and Martini guessers
In that case, could you do the same diagram but with the current state of the code. It is what we had in mind when asking for the diagram. Looking at this, will require you to find the various places where the guessing is done, and you may realise oddities. For instance, that some guessing functions take arguments which you will need to consider when writing your code.
Is your feature request related to a problem?
The current
Guesser
class used by the users or inside parsers hava a downside of being generic, which makes it doesn't fit all topologies and force fields . This result in various errors while guessing different atrributes (#2348 # 3218 #2331 #2265). That’s why we need a guesser class that is aware of the context of the universe of which it is a part.Describe the solution you'd like
I got inspired by @lilyminium (#2630), @jbarnoud , and @mnmelo (#598) proposals, so I mixed them with my vision to get the following methodology: • Guessing is not an automatic process; by default, guessing is off unless the user chose to guess which property. • Guesser should raise warning/messages when succeeding or failing in guessing a property (one single message for the whole guessed attribute). • Guessed properties should be easily modifiable. • Passing the context to the universe should be available at and after the initiation level. • Modifications and maintenance of guessers should be convenient.
Implementation:
BaseGuesser
class will be the parent class for each new context-specific guesser, it will hold no guessing methods; to make space for implementing customized child guessers, yet it will define the behavior by which child guessers should behave for organizing the guesser structure.universe
initiation level, the context will pass as an argument (either as a name or a guesser object), in addition, ato_guess
list will pass the desired attributes to be guessed.guesser
class could be called after the universe's creation with theguess_topology_attr()
API with the same spirit as the above implementation.If the user didn’t specify a context to the universe, a
DefaultGuesser
will be called, this guesser class will have the current generic guess methods, so we don't not break any existing code.The
universe
then will check the validity of the passed arguments and raise the appropriate errors accordingly.(white is given, orange is guessed)
Current default behavior
PDB
Martini