mochell / PiCLES

Particle-In-CelL for Efficient Swell - An efficient surface wave model for Earth System Models
Apache License 2.0
8 stars 0 forks source link

Introduce namelists #32

Open mochell opened 2 months ago

mochell commented 2 months ago

Many parameters are currently hard-coded in the run file or the object definition. parameters should be set through namelists.yml and then changed if needed

glwagner commented 2 months ago

True, although I strongly recommend implementing a programmatic interface like Oceananigans has!

mochell commented 2 months ago

I skimmed through Oceananigans and through ClimaAtmos. I could really find how you manage your parameters. can you point me to how its done?

glwagner commented 2 months ago

Parameters are managed locally by the objects they pertain to, for example

struct FPlane{FT} <: AbstractRotation
    f :: FT
end

represents Coriolis forces on an f-plane with Coriolis parameter f. This is the source code:

https://github.com/CliMA/Oceananigans.jl/blob/main/src/Coriolis/f_plane.jl

The user must supply f in their script:

https://github.com/CliMA/Oceananigans.jl/blob/553b14c23120c10540c7e8e7d3feb64b0e48cff4/examples/langmuir_turbulence.jl#L116

In other words, there is no designated "parameter file" --- all parameters that are invoked in a numerical experiment should be available in the user's script.

There is a piece of software that can be used to manage parameters in a more "global" sense:

https://github.com/CliMA/ClimaParams.jl

However I think it makes sense to use ClimaParams to extract and set parameters within a user-type script rather than to bake it directly into a model constructor.

I built up an example of what it might look like to use ClimaParams along with Oceananigans code to provide an interface for building models in a more "namelist-y" way. Note, we still don't use namelists, julia scripts are always better than non-code.

https://github.com/CliMA/ClimaParams.jl/tree/main/examples/OceananigansSingleColumnSimulation

glwagner commented 2 months ago

This is a different user interface paradigm than traditional software.

mochell commented 2 months ago

Yes indeed it is a little different. Thanks for providing the minimal example.

So what I take from this is:

I like that. The problem for just the development stage remains that if you want to change default parameters throughout all test and benchmarks (i.e. globally within the model package) they will be hard-coded in all files; or all files are feeding from the same namelist. So its just the question how to deal with that when the defaults parameters in functions like single_column_simulation() are not clear yet.

glwagner commented 2 months ago

We do provide some defaults for convenience, eg

https://github.com/CliMA/Oceananigans.jl/blob/ba11d2bd3ffe2bff72c3f3514c2f3c53fd48ae16/src/BuoyancyModels/seawater_buoyancy.jl#L107

I don't totally understand what you mean by needing to change all parameters globally in a development stage. Development should generally proceed along these lines:

  1. Define the user interface
  2. Implement functionality
  3. Implement tests of the functionality, using the user interface

It's crucial to actually invoke the user interface in the tests. The user interface is not just "sugar" on top of the package, its actually the end goal itself. The internal software design should be, in some sense, subservient to the user interface. The tests must use the user interface, because one of the hardest challenges of long term software is the maintainence burden imposed by testing. The goal of the user interface is to make it as easy as possible to invoke the functionality of the package, and one of its most important applications is in the writing of tests.

Perhaps more broadly there shouldn't be any challenge in running tests with different parameter values if parameters are not hardcoded (but are rather passed into functions / types when they are constructed), right?

glwagner commented 2 months ago

If you would like to implement a test and use it with multiple parameter values then you can

  1. Write a function that builds the test "simulation" or model, which takes parameters as input
  2. Invoke that function with different parameter values for two different tests.
glwagner commented 2 months ago

Here's an example of a function that tests whether time-stepping runs without an error, taking in the "turbulence closure" as an argument:

https://github.com/CliMA/Oceananigans.jl/blob/ba11d2bd3ffe2bff72c3f3514c2f3c53fd48ae16/test/test_time_stepping.jl#L40

mochell commented 2 months ago

https://github.com/CliMA/Oceananigans.jl/blob/ba11d2bd3ffe2bff72c3f3514c2f3c53fd48ae16/src/BuoyancyModels/seawater_buoyancy.jl#L107

so there are some constants like 'g_Earth' that are defined in the module namespace. https://github.com/CliMA/Oceananigans.jl/blob/ba11d2bd3ffe2bff72c3f3514c2f3c53fd48ae16/src/BuoyancyModels/BuoyancyModels.jl#19 seams to be hard coded in the BouyancyModel.jl module ;) I may have a handful of those. I mean one could provide a .yml with such basic parameters in the module itself, but then one had to decide what is basic and what not. It goes back to how much convenience you want for the user to change things or to break things..

  1. Define the user interface
    1. Implement functionality
    2. Implement tests of the functionality, using the user interface

Well my user interface wasn't completely defined, but I stared testing of course. I just have to rewrite them. This will remain a problem since the model will grow in complexity with time.

glwagner commented 2 months ago

so there are some constants like 'g_Earth' that are defined in the module namespace. https://github.com/CliMA/Oceananigans.jl/blob/ba11d2bd3ffe2bff72c3f3514c2f3c53fd48ae16/src/BuoyancyModels/BuoyancyModels.jl#19 seams to be hard coded in the BouyancyModel.jl module ;)

Not sure I deserve to endure such a savage wink... I disagree that the constant is hard coded. It is only the default value that is hard coded. Any place where gravitational acceleration is used, it is drawn from the type and not the constant.

If a user needs to change this value, they don't change source code. They have to provide a new value when they build the type in question (here SeawaterBuoyancy)

Please let me know that you understand this point without winking...

I mean one could provide a .yml with such basic parameters in the module itself

I strongly suggest not doing this

mochell commented 2 months ago

fine, I go your point. I just like savage winks. The separation between what is a default variable in the code and what is provided through the user interface remains fuzzy. The user can change both, it just depends on what convenience one wants to provide.

mochell commented 2 months ago

Thanks for clarifying, that was very helpful!

glwagner commented 2 months ago

Yes agreed there is fuzziness there. Our approach is that Oceananigans is fundamentally a library for building models rather than a model itself. Therefore, it functionally should have no defaults.

People who provide models -- like the Clima Params example, or ClimaOcean, are the people who have to introduce defaults. Those actors introduce an additional interface like the one in the example.

I think for this code you should think of it like Oceananigans, as a library. In that sense there should not be defaults. Develop a user interface where specifying parameters is effortless. Then, you don't have to worry about the "extra work" of writing parameters -- it's easy enough. Later, you will implement an actual model using this code. When you get to that, develop a layer on top of this library that provides default parameters.

glwagner commented 2 months ago

This approach has the huge advantage that embedding or linking this code with an Oceananigans model will be super easy because the interface will be ready for it. If you require users to fiddle with a yml file in order to use th code it's going to make things annoying down the road. Imagine if all codes did that -- we'd end up with a jungle of yml.

mochell commented 2 months ago

Okey, that is indeed very helpful. So the applied wave model should be its own library seeds from PiCLES & Oceananigans. PiCLES it self depends on some Oceananigans stuff. It will be some work to enabling PiCLES on the different grid types, but we can worry about that later ..

glwagner commented 2 months ago

What are the parameters that you need to specify?

mochell commented 2 months ago

I think there are about 3 groups:

most of then are at the interface level already, or defaults when the struct is initialized

https://github.com/mochell/PiCLES/blob/734cf9053941a93461cf2c33f0cf1deda0bfc8fb/src/Models/WaveGrowthModels2D.jl#L124-L138

https://github.com/mochell/PiCLES/blob/734cf9053941a93461cf2c33f0cf1deda0bfc8fb/src/ParticleSystems/particle_waves_v5.jl#L34-L50

glwagner commented 2 months ago

Parameters for the WaveModel equations (fetch constants, gravity.. )

Have you designed types / structs to represent the different terms in the model equations? Those structs would be the right place to store those parameters. Examples in Oceananigns are FPlane(f=1e-4) (representing the Coriolis term formulated on the f-plane) and ScalarDiffusivity(ν=1e-2) (representing diffusion / viscosity with a scalar coefficient).

Parameters that control ODE integration

We call this the "time-stepper", and dedicate a struct for that.

Parameters that change the behavior of the model (initialization, threshold, limits)

Initialization should use a design like Oceananigans with its set! function. Not sure what you mean by threshold and limits

glwagner commented 2 months ago

Try not to group too many parameters in structs with generic names --- use small, specific structs with informative names. For example wind_min_squared is not an appropriate parameter for a struct called ODESettings, because it doesn't have to do with timestepping exactly.

You may also want to use callbacks to implement some features, rather than baking them into the time integration. That will allow them to be changed more easily by users.

mochell commented 2 months ago

thanks. there are inconsistencies, but I knew I'll have to re-order them at some point, that's why this issue.

I don't really understand why we need to have a struct for a single variable like FPlane(f=1e-4). I've seen this in several julia codes that basically wrap a struct around one thing. It feels like introducing extra layers. Does this go back to have custom types and then multiple dispatch from that?

mochell commented 2 months ago

ah, If I have a struct for the variable I can store the default parameter there rather than at the initialization of the model. It gets more modular.

glwagner commented 2 months ago

Any equation of motion has one Coriolis term, model.coriolis. The user determines what that teerm is --- is it FPlane? or BetaPlane. We also suppose things like NonTraditionalFPlane and even HydrostaticSphericalCoriolis which represents Coriolis forces on the sphere, making the hydrostatic (traditional) approximation.

For BetaPlane, there are two parameters: f0 and beta. For FPlane, there is only one parameter, f.

By designing the model this way, coriolis only has exactly the parameters that are actually being used in the simulation.

Internally, those types are used via multiple dispatch to add the correct contribution to the tendency of the equation of motion. For example:

https://github.com/CliMA/Oceananigans.jl/blob/29c8115a2e3edbad89c2fbdc436d5afc331a9bb2/src/Models/NonhydrostaticModels/nonhydrostatic_tendency_kernel_functions.jl#L47-L81

glwagner commented 2 months ago

It might be a good idea to become familiar with the Oceananigans user interface and try to set up a few problems using it. That will help you grok the philosophy behind the design --- the philosophy is non-trivial and super important.