MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.29k stars 646 forks source link

Deprecate `verbose` option in initialization of `AnalysisBase` #3651

Open PicoCentauri opened 2 years ago

PicoCentauri commented 2 years ago

Is your feature request related to a problem?

The verbose option for Analysis classes appears in the init function and as well in the run method. The first one seems to be not necessary for instance creation.

Describe the solution you'd like

Deprecate and remove verbose from the init method. I already have a suggestion in PR #3644

Describe alternatives you've considered

Leave everything as it is.

Additional context

Removing verbose from the init also would allow removing the **kwargs argument in many classes inheriting from the base class and closing #3431

richardjgowers commented 2 years ago

Makes more sense to remove from run() rather than __init__ imo

IAlibay commented 2 years ago

I disagree, verbosity is a direct attribute of run imho, plus sometimes I'll dry run with verbosity and re-run without.

PicoCentauri commented 2 years ago

I think there is a deeper question: We should decide what our idea behind the class methods are: Especially __init__ and _prepare.

Currently, we do 'things' in __init__ and inside _prepare which makes the purpose of the _prepare method a bit unclear. A new developer of an analysis class could ask: Why should I use the _prepare method if I can do everything inside init?

In my view, the __init__ method is only for storing the class-specific parsed arguments as class attributes and calls super() for the base class. For everything else like initialization of arrays, checks, etc. there exists the _prepare method. The latter is called within the run method and knows about the verbose option as well as the trajectory slicing. If we follow this idea it is consistent to put verbose into the run method.

richardjgowers commented 2 years ago

Yeah the lines between prepare and init are very blurred. I think one differentiator could be parallelism, which iirc does a prepare on each worker which then gets merged, whereas the structures from init are created once.

On Thu, Apr 21, 2022 at 10:01, Philip Loche @.***> wrote:

I think there is a deeper question: We should decide what our idea behind the class methods are: Especially init and _prepare.

Currently, we do 'things' in init and inside the _prepare which makes the purpose of _prepare method a bit unclear. A new developer of an analysis class could ask: Why should I use the _prepare method if I can do everything inside init?

In my view, the init method is only for storing the class-specific parsed arguments as class attributes and calls super() for the base class. For everything else like initialization of arrays, checks, etc. there exists the _prepare method. The latter is called within the run method and knows about the verbose option as well as the trajectory slicing. If we follow this idea it is consistent to put verbose into the run method.

— Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/3651#issuecomment-1104912015, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACGSGB7LXQ5QDYFY5N6NEZLVGEKPXANCNFSM5T4F2V5A . You are receiving this because you commented.Message ID: @.***>

PicoCentauri commented 2 years ago

Ahh parallelism.. I didn't thought about that. So __init__ should consist of main/global checks & initializations whereas prepare should only consists of temporal attributes and checks for each worker? This idea is not documented and parallelism is as well not part of the main library, right?

IAlibay commented 2 years ago

To be honest we need to rethink how we do verbosity completely with parallelism, iirc tqdm as used currently behaves badly. (there was an issue about this somewhere)

richardjgowers commented 2 years ago

Yeah it’s not actually well defined admittedly, this is my memory of how pmda works. I think one of the main goals of the structure of AnalysisBase is to make the parallelism easy one day.

On Thu, Apr 21, 2022 at 10:58, Philip Loche @.***> wrote:

Ahh parallelism.. I didn't thought about that. So init should consist of main/global checks & initializations whereas prepare should only consists of temporal attributes and checks for each worker? This idea is not documented and parallelism is as well not part of the main library, right?

— Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/3651#issuecomment-1104989390, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACGSGB3ZQUAI3624WZSQHELVGERD7ANCNFSM5T4F2V5A . You are receiving this because you commented.Message ID: @.***>

orbeckst commented 2 years ago

I generally agree with https://github.com/MDAnalysis/mdanalysis/issues/3651#issuecomment-1104912015 : For normal use cases, you should not have to write a lot of __init__; for some reason we structured AnalysisBase so that you have to almost always do argument wrangling to pass the trajectory to super().__init__ but everything else should be _prepare(). (EDIT: It would have been nice if creating a new analysis never had to actually do its own __init__/super() and only had to add _prepare() , _single_frame(), and _conclude().)

On the parallelism side, PMDA needs to be updated to make use of MDA 2.x stuff. At the moment it cannot directly use AnalysisBase because of the design decision in _single_frame() in PMDA to return values vs. modifying state in MDA. However, with the support for serializing universes, it might be possible to deliver on the promise to "magically add parallelism to existing analysis". PMDA was an attempt to do parallelism in a consistent fashion and it might still be useful if someone spends some time on it. However, I am not going to be upset if we come up with another way to do parallelism with AnalysisBase.

orbeckst commented 2 years ago

And to be explicit: I am also in favor of deprecating verbose in AnalysisBase.__init__.