MRC-CSO-SPHSU / LoneParentsModel.jl

An initial implementation of an ABM for social and child care
0 stars 4 forks source link

Model within mainHelper.jl #146

Closed AtiyahElsheikh closed 1 year ago

AtiyahElsheikh commented 1 year ago

propose to place it within a module.

AtiyahElsheikh commented 1 year ago

I also don't see the point of adding the various accessor functions (e.g. population, data, etc.). Model as well as Parameters or Data (not convinced about that one either) are flat, primitive data types (if structs weren't more convenient we could just as well use named tuples there), that are created ad-hoc for this particular model.

What we do need a solution for is how to organise the various global bits of the model, i.e. parameters, input data and population and world state. The most pragmatic solution would be to drop all of them into the Model struct and be done with it. It feels a bit wrong, I think, but maybe we should just do that and change it when we come up with a nicer solution.

The point for me was to enable me to access the same interface with different implementation of accessory functions. Though I am not really keen to use the API with model as an argument.

mhinsch commented 1 year ago

propose to place it within a module.

Why?

mhinsch commented 1 year ago

I also don't see the point of adding the various accessor functions (e.g. population, data, etc.). Model as well as Parameters or Data (not convinced about that one either) are flat, primitive data types (if structs weren't more convenient we could just as well use named tuples there), that are created ad-hoc for this particular model. What we do need a solution for is how to organise the various global bits of the model, i.e. parameters, input data and population and world state. The most pragmatic solution would be to drop all of them into the Model struct and be done with it. It feels a bit wrong, I think, but maybe we should just do that and change it when we come up with a nicer solution.

The point for me was to enable me to access the same interface with different implementation of accessory functions. Though I am not really keen to use the API with model as an argument.

But that simply adds a level of indirection. Instead of defining our (internal) API as e.g. providing functions getPopulation, getDemoData, etc., we can just as well define it as the existence of an object (of undefined type!) that provides population, demoData, etc. as components. Since Julia allows us not to commit to a specific type this would be just as generic as requiring accessor functions. And even with accessor functions you would still have to know which object contains your components (population, demoData, etc.).

I'm not against accessor functions (or modules or separate types) per se, but I think any added complexity of the code needs a concrete justification (otherwise we end up in the realm of cargo cult software engineering (see a lot of Java code for some really bad cases of that...)).

AtiyahElsheikh commented 1 year ago

propose to place it within a module.

Why?

mainHelpers.jl has now couple of unrelated functionalities including I/O , model construction and configuration , etc. If it is already suggested that model-related data to be placed in a module, and since model parameters are placed in a module, then definitely, the model itself need to be logically placed in a module.

Also think about the need to have couple of models (corresponding to different case studies, perspectives ,etc.)

mhinsch commented 1 year ago

propose to place it within a module.

Why?

mainHelpers.jl has now couple of unrelated functionalities including I/O , model construction and configuration , etc.

Yes, it's a bit of a collection ATM, so it might make sense to split it. But again, the question is - what do we gain by doing that?

If it is already suggested that model-related data to be placed in a module, and since model parameters are placed in a module, then definitely, the model itself need to be logically placed in a module.

I would actually rather remove the parameter module than put data and model into one. As I see it, the point of modules is to hide away internal functionality that is never needed outside of the module and thus present a simplified interface to some (more) complex functionality to the world and prevent accidental tinkering with internals by the user. In addition modules also provide namespaces (and therefore prevent name clashes).

None of these are needed for parameters, data or model.

Also think about the need to have couple of models (corresponding to different case studies, perspectives ,etc.)

To be honest, I think a cleaner, safer and more usable solution in that case would be to either copy the entire code base or have separate directories with their own mode, etc for each model.

I know that all of this goes against all software engineering instincts one usually acquires as a programmer and it took me quite a while to arrive at this point as well, but in my experience treating the semantic part of a simulation like regular code is not helpful. A simulation model is not a program. It's an engine that happens to be configured in code.

AtiyahElsheikh commented 1 year ago

To my understanding from the book Design Patterns and Best practices with Julia, Tom Kwong, the creation of submodules is not only for avoiding name clashes or hiding internal functionalities but it is also a way to structure complex code and divide large modules into smaller ones. Of course there is no enforcement in following a certain book or design principles etc.

Practically, I can say that since the removal of explicit using statements as well as stuffs not placed within submodules and I do suffer finding the right locations / files of some functionalities when I need to access / overview their implementation. Modules and submodules were a way to simplify this process especially when 1-1 naming conventions equivalent to the directory structures are used. Placing every thing in a file or at top level in a flat manner is not good for someone who wants to use these functionalities to develop something or extend them when needed.

mhinsch commented 1 year ago

To my understanding from the book Design Patterns and Best practices with Julia, Tom Kwong, the creation of submodules is not only for avoiding name clashes or hiding internal functionalities but it is also a way to structure complex code and divide large modules into smaller ones. Of course there is no enforcement in following a certain book or design principles etc.

I see the point, but I would argue that files and directories are completely sufficient for that. As long as the other functions of modules that I mentioned, are not required, adding modules provides no additional value in terms of code structuring (I sometimes suspect that people who claim that it does are subconsciously influenced by languages like Java that have a 1:1 correspondence between files and language constructs).

Practically, I can say that since the removal of explicit using statements as well as stuffs not placed within submodules and I do suffer finding the right locations / files of some functionalities when I need to access / overview their implementation. Modules and submodules were a way to simplify this process especially when 1-1 naming conventions equivalent to the directory structures are used. Placing every thing in a file or at top level in a flat manner is not good for someone who wants to use these functionalities to develop something or extend them when needed.

I agree that explicit using statements are very helpful in that respect (but not enough so that it balances their annoyances...). But without them I don't see how modules help, to be honest. For me, whether I have to remember that function xyz is defined in module ABC or in file abc.jl really doesn't make a difference. The most important thing here is to keep stuff structured logically, I think, so that for any given type or function it's reasonably obvious where it has to be defined. Admittedly the situation is not optimal at the moment with all the various util, helper, handle etc. files.

On the other hand I would argue that the "deep" structure we have at the moment makes it unnecessarily hard to understand which part of the code depends on which other part of the code. For example, in order to find out if main.jl uses dependencies.jl someone unfamiliar with the code base would have to look at files in this sequence: main.jl, lpm.jl, mainHelpers.jl, src/LPM.jl, src/lpm/Demography.jl, src/lpm/demography/Simulate.jl.

AtiyahElsheikh commented 1 year ago

adding modules provides no additional value in terms of code structuring (I sometimes suspect that people who claim that it does are subconsciously influenced by languages like Java that have a 1:1 correspondence between files and language constructs).

I did not work with Java since 2008.

At the moment, and it will be increasingly the case, file and function names are not sufficient to locate where is what. Stuffs that I managed to identify their locations today, in one week from now I will again use linux commands etc to find where they are and consume time overhead for that.

On the other hand I would argue that the "deep" structure we have at the moment makes it unnecessarily hard to understand which part of the code depends on which other part of the code.

This speaks for explicit using statements.

And if it is annoying due to the time it takes for pre-compilation, making use of Revise or other similar technologies would help not to wait for the pre-compilation phase.

mhinsch commented 1 year ago

Ok, to recap, there are two somewhat independent issues we have been discussing:

finding the place of definition for a function or type

things that would help:

finding out which pieces of code a given file/module depends on

things that would help:

As I said before, in my opinion explicit imports have too serious practical and conceptual issues to be useful (mainly they make refactoring a PITA and they break module encapsulation). Without that modules don't contribute to the solution of the first problem. I admit that for dependency tracking (problem 2) "everything is a module" would be a solution, but I would still prefer a more lightweight approach, especially since even with "full modularisation" a deep nested hierarchy is still quite annoying to work with.

AtiyahElsheikh commented 1 year ago

Proposed solution

Having worked a bit now on the calibration project, I like the module LPMLib which provides me a way to structure stuffs (though this not something that I have now invested time for). This should not be restrictive as it is just a view (which can be replicated, improved and modified etc.). I believe there is a way to combine both worlds (less hierarchies of modules and no explicit using statements). But still maximal reuse of common code needs some flexibility in allowing separation of different entities in the same source file (the current case with the Model definition). If there is a restriction to do this, it is easier to address why.

AtiyahElsheikh commented 1 year ago

Also forgot to say it is well-integrated with github

On Sat, 12 Nov 2022, 20:38 atiyah.elsheikh, @.***> wrote:

-

VSCode is very good in automatic code completion ( and thus having no obstacle in clear variable / function names). Also one can recognise the same identifier in the same source file). I am not sure if this would be possible to identify where implementation of a function is etc but in this case I use grep. I did not try this before because I really prefer REPL. But I tried to see if vscode can run / debug the code) but apparently not the case. I did not try JUNO either.

Having fewer and longer source files (I understand your perspective and I see some of the constraints why it is so) and I usually myself tends not to separate files till

  • I feel there is a necessity to (including source files gets bigger and bigger and clear patterns of different stuffs start to emerge).
    • Or there is a need to re-use somewhere else
  • But, I believe this would make things more difficult in certain situations concerning usability (remember that we want things to converge to a usable library). At the moment this makes usability more difficult as well as other situations (so I am currently for instance using your Model definition which needs accessory functions to be used here and there.

Proposed solution

Having worked a bit now on the calibration project, I like the module LPMLib which provides me a way to structure stuffs (though this not something that I have now invested time for). This should not be restrictive as it is just a view (which can be replicated, improved and modified etc.). I believe there is a way to combine both worlds (less hierarchies of modules and no explicit using statements). But still maximal reuse of common code needs some flexibility in allowing separation of different entities in the same source file (the current case with the Model definition). If there is a restriction to do this, it is easier to address why.

— Reply to this email directly, view it on GitHub https://github.com/MRC-CSO-SPHSU/LoneParentsModel.jl/issues/146#issuecomment-1312557454, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFN4A3FIYKTSNUPVHNY3TTWH7WZ3ANCNFSM6AAAAAARS2VZD4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>

mhinsch commented 1 year ago

Also forgot to say it is well-integrated with github

What is?

mhinsch commented 1 year ago

Proposed solution

Having worked a bit now on the calibration project, I like the module LPMLib which provides me a way to structure stuffs (though this not something that I have now invested time for). This should not be restrictive as it is just a view (which can be replicated, improved and modified etc.). I believe there is a way to combine both worlds (less hierarchies of modules and no explicit using statements). But still maximal reuse of common code needs some flexibility in allowing separation of different entities in the same source file (the current case with the Model definition). If there is a restriction to do this, it is easier to address why.

I didn't understand this, can you elaborate on your proposed solution?

AtiyahElsheikh commented 1 year ago

module LPMLib here

mhinsch commented 1 year ago

module LPMLib here

Yes, I have seen LPMlib in your calibration repo, but I still don't understand how that relates to the matter at hand and your proposed solution.

AtiyahElsheikh commented 1 year ago

Here I have control to

  1. provide artificial structure / hierarchies of modules
  2. provide explicit using statements clarifying what comes from where
mhinsch commented 1 year ago

Here I have control to

1. provide artificial structure / hierarchies of modules

2. provide explicit using statements clarifying what comes from where

That's great, but how does that solve the question of how to structure the code in LoneParentsModel.jl?

AtiyahElsheikh commented 1 year ago

You need to re-read carefully again to find out the solution

On Thu, 17 Nov 2022, 11:08 Martin Hinsch, @.***> wrote:

Here I have control to

  1. provide artificial structure / hierarchies of modules

  2. provide explicit using statements clarifying what comes from where

That's great, but how does that solve the question of how to structure the code in LoneParentsModel.jl?

— Reply to this email directly, view it on GitHub https://github.com/MRC-CSO-SPHSU/LoneParentsModel.jl/issues/146#issuecomment-1318392274, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFN4AZUNQ32K3FANJYEQMDWIX7YXANCNFSM6AAAAAARS2VZD4 . You are receiving this because you authored the thread.Message ID: @.***>

mhinsch commented 1 year ago

You need to re-read carefully again to find out the solution

I'm sorry, I don't see it. Could you please explain it to me.