ukaea / PROCESS

PROCESS is a systems code at UKAEA that calculates in a self-consistent manner the parameters of a fusion power plant with a specified performance, ensuring that its operating limits are not violated, and with the option to optimise to a given function of these parameters.
https://ukaea.github.io/PROCESS/
MIT License
34 stars 11 forks source link

Resolve "Python "module" variables in dicts" - [merged] #2463

Closed jonmaddock closed 1 year ago

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 23, 2021, 15:21

Merges 1465-python-module-variables-in-dicts -> develop

Closes #1465

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 24, 2021, 10:19

added 1 commit

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 24, 2021, 16:20

added 4 commits

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 25, 2021, 09:32

added 1 commit

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 25, 2021, 09:44

added 1 commit

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 30, 2021, 15:28

unmarked as a Work In Progress

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Nov 30, 2021, 15:30

@jonmaddock

We may want to discuss some aspects, e.g. the DocableVariable name tomorrow.

I also need to document the way the DocableVariable works and include some documentation including reasoning etc

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Dec 2, 2021, 12:15

added 27 commits

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Dec 2, 2021, 15:06

@jonmaddock

Please find the rendered documentation here: http://tnunn.gitpages.ccfe.ac.uk/process/development/annotated-variable/

jonmaddock commented 2 years ago

This is a good solution!

jonmaddock commented 2 years ago

Could you put docstrings and comments throughout please; good naming, but this is a confusing topic

jonmaddock commented 2 years ago

I'm guessing this is a converter between the DocableVariable()-produced object is and the object create_dicts.py would like?

jonmaddock commented 2 years ago

I guess this is because you can't access the _Variable class inside variables.DocableVariable(), so this is the only way to check if it's this class or not? Could you comment this please?

jonmaddock commented 2 years ago

So it returns a list of all the DocableVariables in the Python (converted to AnnotatedVariables) to create_dicts.py for adding into the dictionaries? Great!

jonmaddock commented 2 years ago

Can you update the docs just to add that the dicts now depend on the Python source too? This explains why there's a Python source dependency, and why they need to rebuild.

jonmaddock commented 2 years ago

First of all, great, and well remembered! Otherwise I guess our dicts would be out-of-date if there were any Python changes. Nice!

Why PREPROCESS_TARGET_NAMES instead of PROCESS_SRC_PATHS?

jonmaddock commented 2 years ago

Looks excellent!

jonmaddock commented 2 years ago

Ok, so as the dicts now rely on inspecting the process Python package, only create them dynamically, rather than at import to avoid a cyclic dependency. Can you make sure this is documented somewhere (you may already have done in a later commit, this is a reminder for me!)?

jonmaddock commented 2 years ago

Missing ()?

jonmaddock commented 2 years ago

I might be wrong, but if you install the process package before generating the dicts, won't this cause the python_fortran_dicts.json package data to be missing/not updated? The Python side of things will be fine as that's all generated dynamically, but the .json that's made in the DICTS_NAME target won't be in the installed package, I think. Could you put my mind at rest?

Perhaps only the Fortran dicts need to be created before installation, as they're the only ones that make up the python_fortran_dicts.json. The Python-dependent dicts could still be generated dynamically. Or perhaps I've got in a tangle!

jonmaddock commented 2 years ago

Isn't _object an instance of a phys/eng model class?

jonmaddock commented 2 years ago

So obj is an AnnotatedVariable, i.e. instance attribute, and _object is the class instance itself?

jonmaddock commented 2 years ago

Great docstrings here! Ignore earlier comment for this module.

jonmaddock commented 2 years ago

Will it? That's a bit of a drawback. The use of these AnnotatedVariables for the dicts is just a stepping stone really until the dicts are retired; the real use is going to be for holding info related to a physics/engineering var together throughout an entire Process run, and being used e.g. in writing output.

Why can this sort of thing be lost? How could this be prevented? I doubt this is a deal-breaker for this MR, but maybe we change AnnotatedVariable in the future to get around this limitation.

jonmaddock commented 2 years ago

Ah, I see! Firstly, great for documenting it! How strange; a float operation appears to "re-float-ise" it. How annoying. Stupid question: is type(x.cost) == float now?

jonmaddock commented 2 years ago

Excellent docs!

jonmaddock commented 2 years ago

I would change this slightly: these AnnotatedVariables are only for physics and engineering variables; things that have physical meaning (and most of the time, units). Not every instance variable in Python will be an AnnotatedVariable: small point, but maybe a name change to reflect this?

jonmaddock commented 2 years ago

The point of putting this _Variable class inside this wrapper function is to allow you to inherit from a variable class (e.g. float), right? This makes it difficult to compare the type of this class later in python_dicts.py, but is probably worth it. I guess the only alternative is to write out separate classes for each of the types you wish to inherit from; whilst more explicit, this upsets my DRY sensibilities. I can see arguments either way.

jonmaddock commented 2 years ago

There's a bit of tacit stuff going on here, but it's really well explained in your .md docs as well as here. It's my understanding that class attributes are usually reserved for sensible defaults; I think this is being slightly misused here by setting them to variables in the local scope, which are definitely unique to an instance rather than a default. Should these go in the __init__() as instance attributes?

If you override the __init__(), to me this makes the class behaviour much clearer. I believe you could then call super.__init__(*args, **kwargs) to make it explicit that you're calling the constructor of the parent class and with what arguments, rather than _Variable(*args, **{**kwargs, **__kwargs})'s arguments all tacitly going to the __init__() of tp.

This obviously achieves the same thing; this is purely opinionated style. I'm happy if you are!

jonmaddock commented 2 years ago

As usual Tim, top stuff! This is a really great solution that allows us to maintain the dicts even as we convert Fortran module variables into Python instance attributes. A few minor concerns in the comments, but altogether brilliant and great documentation, which is key for something as confusing as this!

The __doc__ and __units__ being lost sometimes bothers me a bit. It's definitely worth talking to James Cook in the Blueprint group about this; I was talking to him about it the other day and he mentioned they use something called wrapt (https://pypi.org/project/wrapt/). Perhaps decorators are a more Pythonic way of adding on (and persisting) attributes to pre-existing built-in types and numpy arrays? I'd talk to him; I believe Blueprint is already successfully doing exactly what we're after.

It may be worth getting this merged in with perhaps some minor changes now however; I don't want to creep the scope!

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 08:59

Commented on scripts/python_dicts.py line 9

It must be quite annoying when reviewing, but I often write docstrings and such in later commits. This is done in later commits and I will review and add some more on this reading.

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:04

Commented on scripts/python_dicts.py line 33

This is (somewhat oddly) commented above: I will also include relevant comments here.

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:10

Commented on cmake/ford.cmake line 43

Good point - ford does not just operate on wrapped sources

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:31

Commented on process/io/in_dat.py line 114

I have added these as notes to the header of the affected files

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:33

Commented on process/io/mfile.py line 484

@jonmaddock im a little confused what you mean here, are you referring to the fact that process_dicts is not a function? I have defined it a few lines above as a variable

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:35

Commented on CMakeLists.txt line 199

I have this here to indicate that the DICTS job should run. In a later commit to ford.cmake, install_process is a dependency of the dicts command.

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:44

Commented on documentation/proc-pages/development/annotated-variable.md line 61

@jonmaddock completely agree. Once we are ready to retire the dicts we will want to find a more permanent solution. I'm sure there is a lot of Python best-practices we can follow once we are doing things in a more Pythonic way

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:52

Commented on documentation/proc-pages/development/annotated-variable.md line 94

Yes it does - I have included this in the docs as I feel it may help get the point across

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:55

Commented on process/variables.py line 32

Absolutely - as discussed above, this is just to get the damn thing working. Once Process is more pythonic we can use less hacky best-practices to handle this stuff.

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 09:59

Commented on process/variables.py line 33

I opted to not do this in the constructor as I am not calling the constructor with these variables. By doing it this way, these are not instance attributes, but rather class attributes (as a normal doc string would be).

As you say, either way works but I believe keeping it as this reflects that I am not constructing the class with these attributes but am creating a class with these attributes that I am then constructing.

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:03

Agreed - this is a "messy" and "hacky" way of achieving the dicts as we want them. Once all of our variables are in Python we can look to use wrapt or something similar. If Blueprint is successfully doing this we should definitely talk to James once Process is in Python (as much as it can be).

Tim

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:07

Commented on scripts/python_dicts.py line 75

changed this line in version 6 of the diff

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:07

Commented on documentation/proc-pages/development/annotated-variable.md line 61

changed this line in version 6 of the diff

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:07

Commented on mkdocs.yml line 15

changed this line in version 6 of the diff

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:07

added 3 commits

Compare with previous version

jonmaddock commented 2 years ago

In GitLab by @timothy-nunn on Jan 19, 2022, 10:24

Commented on scripts/python_dicts.py line 82

obj is a member of _object which is the class instance itself. obj is later used as a pointer to the _Variable. Hopefully this is obvious now I have documented _object as the correct type.

jonmaddock commented 2 years ago

No problem Tim! I often review commit-by-commit chronologically, especially with more complicated stuff as I find I understand the author better (and it's less to hold in my head). I sometimes put reminders in and forget to close them if they're resolved by a subsequent commit. Either approach is fine with me!

jonmaddock commented 2 years ago

I've clearly lost the plot! You're right.

jonmaddock commented 2 years ago

Not quite sure I follow here, can you help me understand? You've removed DICTS_NAME as a dependency of install_process. In ford.cmake, DICTS_NAME now depends on install_process. So now the dicts will be created after the process package is installed. Doesn't that mean that you'll be missing the updated python_fortran_dicts.json in the installed process package, as it's only created after the installation?

jonmaddock commented 2 years ago

Well reasoned, I am happy! I still think that in the case of, say, float it is fine for its docstrings to be class attributes; they're obviously always the same and shared across all float instances. But in this case the docstrings are unique to the instance, and therefore would be suited to being instance attributes. Whilst all instances of _Variable have a docstring attribute, no two will have the same value.

Got me thinking, I am glad for the discussion and can agree to disagree on a minor style opinion! This is a clever solution to our problem.

jonmaddock commented 2 years ago

Nearly there! Thanks for your patience Tim.