@maceligueta can you point us to some example to better understand what this conversion implies? On the topic of using classes as the main script, I think it is a good idea, but I'd prefer to avoid doing this work twice (now and again once the "stages", or whatever is implemented in their place, come along).
Here is a pseudo example:
Original main_script.py is:
import A
import B

def FunctionA():
    do_task_in_function_A

def FunctionB():
    do_task_in_function_B

def Print():
    do_task_in_print_function

do_task_1
do_task_2
FunctionA()
do_task_3
do_task_4
do_task_5
Print()
FunctionB()
do_task_6
do_task_7
Would be converted into:
import A
import B

class Solution(object):

    def __init__(self):
        do_task_1
        do_task_2

    def Run(self):
        self.Initialize()
        self.RunMainTemporalLoop()
        self.Finalize()

    def Initialize(self):
        self.FunctionA()
        do_task_3
        do_task_4

    def RunMainTemporalLoop(self):
        do_task_5
        self.Print()

    def Finalize(self):
        self.FunctionB()
        do_task_6
        do_task_7

    def FunctionA(self):
        do_task_in_function_A

    def FunctionB(self):
        do_task_in_function_B

    def Print(self):
        do_task_in_print_function

if __name__ == "__main__":
    Solution().Run()
Then, I can derive the class 'Solution' to avoid printing results, in a new python file called 'main_script_ready_for_coupling.py':
import main_script

class Solution(main_script.Solution):

    def Print(self):
        pass
I can also call methods from outside, like Initialize and Finalize, and skip the RunMainTemporalLoop. The good thing is that I would not duplicate any code, and my script would always stay up to date with the changes introduced by the developer of the original application.
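For instance, a hypothetical coupling driver could reuse the derived class like this (a minimal sketch; module and method names are the ones from the example above, and the custom loop is only indicated by comments):

    import main_script_ready_for_coupling

    solution = main_script_ready_for_coupling.Solution()
    solution.Initialize()
    # a custom coupling loop would go here, replacing
    # solution.RunMainTemporalLoop()
    solution.Finalize()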
I discussed this briefly with @adityaghantasala and @msandre and we also like the idea. Especially from the developers' point of view it would be convenient to work with. We would, however, divide the script into more separate functions (e.g. like the processes) in order to be more flexible.
I don't know if you do this already in pfem, but I suggest that we use the main script also for running the tests, see e.g. [what is already done in StructuralMechanics](https://github.com/KratosMultiphysics/Kratos/blob/master/applications/StructuralMechanicsApplication/tests/Kratos_Execute_Structural_Test.py). This way it would be ensured that the main script is constantly updated and working.
On the other hand, we should keep in mind that it makes the code a bit more verbose, especially if one is new to Kratos.
The tests are indeed a good example of a new benefit, and we are using this feature in the DEM team, too [we were tired of breaking the 'smoke tests' whenever a small change was introduced in the main script]. About the verbosity, it is true that it gets a bit longer, but IMO it is worth paying the price, especially considering that the derived scripts are usually really simple and short. Concerning the number of functions and their names, it is something to be worked out. They should be enough to provide the needed flexibility for everyone.
For a first attempt we could adapt the functions the processes have, i.e.:
ExecuteInitialize
ExecuteBeforeSolutionLoop
ExecuteInitializeSolutionStep
ExecuteFinalizeSolutionStep
ExecuteBeforeOutputStep
ExecuteAfterOutputStep
ExecuteFinalize
Maybe it would make sense to divide the ExecuteInitialize into more separate functions (a sketch of a main-script class using these names follows below).
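Here is a minimal sketch (not actual Kratos code) of a main-script class adopting those function names; the time bookkeeping and the empty hooks are placeholders:

    class Solution(object):

        def __init__(self):
            self.time = 0.0
            self.end_time = 1.0    # placeholder termination criterion
            self.delta_time = 0.1  # placeholder time step

        def Run(self):
            self.ExecuteInitialize()
            self.ExecuteBeforeSolutionLoop()
            while self.time < self.end_time:
                self.time += self.delta_time
                self.ExecuteInitializeSolutionStep()
                # the solver would advance the solution of this step here
                self.ExecuteFinalizeSolutionStep()
                self.ExecuteBeforeOutputStep()
                # results would be written here
                self.ExecuteAfterOutputStep()
            self.ExecuteFinalize()

        # empty hooks, one per process interface function
        def ExecuteInitialize(self): pass
        def ExecuteBeforeSolutionLoop(self): pass
        def ExecuteInitializeSolutionStep(self): pass
        def ExecuteFinalizeSolutionStep(self): pass
        def ExecuteBeforeOutputStep(self): pass
        def ExecuteAfterOutputStep(self): pass
        def ExecuteFinalize(self): pass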
Just to add that I agree with the idea of @maceligueta, but I also support the comment of @jcotela about having a forward-looking vision for the stages.
The discussion at the @KratosMultiphysics/technical-committee raised the following question: considering all the comments in this issue, should the main script and the python solver converge to a unique file? Apparently, some applications have been passing more and more work to the solver (even the modelpart reading!), so one possibility is to extend this transfer of work until the main script practically vanishes.
I don't think so. As @maceligueta pointed out, I think it is not the place to read the model part, nor the place to read materials and perform other operations that have no relationship with the solution. The solver must concern itself only with strategies, schemes and related objects... I don't know why some applications concentrated everything there. I am sure that they have their reasons; maybe they can explain them here.
I think that this idea could be extended to have some default workflows defined in the applications, e.g.:
class StaticStructuralWorkflow(object):

    def __init__(self, ProjectParameters):
        self.ProjectParameters = ProjectParameters

    def Run(self):
        # read project parameters
        # import modelparts
        # import materials
        # create the solution object
        # ...
        solution.Run()
        # ...
        # finalize
Like this, MainKratos.py could boil down to very few lines for a default case...
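For example, assuming a hypothetical module exposing the class sketched above, the default MainKratos.py could reduce to:

    # minimal sketch of a boiled-down MainKratos.py; the module name is made up
    from static_structural_workflow import StaticStructuralWorkflow

    with open("ProjectParameters.json", "r") as parameter_file:
        project_parameters = parameter_file.read()

    StaticStructuralWorkflow(project_parameters).Run()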
I wanted to add an implementation detail to @maceligueta's comment, and meanwhile I saw the comment of @armingeiser (thanks!). The first idea was to add a Stage class which has a Run method doing the time iteration. But then the question becomes whether we can add such a method to the solver.
I see the point of @josep-m-carbonell that it is quite rare to ask a solver to read a file, but I should say that the solver is the only one who knows how many model parts are needed and which of them should be read from the input. Also, a solver can directly set the output to be generated.
So, in my opinion, if the workflow has to know about the solver and work in a pair with it, this adds another layer of complexity and mutual dependency without much improvement in coherence.
+1 for pooyan's summary.
The problem I foresee is deciding which variables should be allocated and which applications need to be loaded, since this is to be done at the beginning of the overall script.
Also, just to remark that stages should be designed to allow the easy implementation of optimization loops, so I guess @armingeiser will have a voice.
@pooyan-dadvand, I don't see what you mean: why is the solver the only one who knows how many model parts are needed? The solver sets the scheme, the builder and the strategy, and then you set the model part it has to use.
Think of the FSI solver, for example.
This is not so clear: what you call an FSI solver could be an FSI main script, calling two different solvers, but taking care of reading the model parts and everything else the main scripts of the other two applications do as well. IMO, we should focus on the number of layers that are really needed. Should the object which reads model parts and JSONs, prints to GiD, cleans memory, etc. be the same object as the solver (one-layer option)? Or should they be separate objects (two-layer option)? And why? I don't have a strong preference; however, it's a good opportunity for me to learn about the Object Oriented philosophy and how to deal with these issues. I would love to hear good arguments from the gurus.
@pooyan-dadvand @RiccardoRossi, I think I agree with @josep-m-carbonell here: why should the solver decide what model_parts are to be read? It might require a list of model_parts as input, but it should not care how those were constructed...
The way I see it, the python solver used to be a relatively thin wrapper around a C++ Strategy (which is the class that actually does the solution), which knew how to set up the strategy (add the right time scheme, builder and solver...) and the model part (add nodal variables and DOFs).
In the last few years we have started to give it more responsibilities: first processing the JSON from the GiD interface, then managing the model part, deciding what and when to print (GiD output/restart), calling auxiliary utilities and processes... This has been the case in all applications I use regularly; I don't know if you have the same impression on the DEM or the structure side.
Anyway, all these things used to be in the main script, and we have been removing them from there and placing them into the solver to standardize the solvers and turn them into "black boxes". If the solver is a real black box, I don't see the need for the extra "main script class" layer; I would just finish moving everything into the solver.
Conversely, if the solver is a thin layer around the C++ Strategy, and we want to have a big main solver class that knows how to do everything, maybe we don't need the solver: the main script can set up the required C++ classes by itself.
I think we are all going in the same direction: one big class that does the main loop. Some call it "solver" and some call it "main script", but in the end it looks like the same thing to me.
If this is the direction we want to follow, I don't see it as a problem. In terms of object orientation, I assume this main script would delegate specific, well defined tasks to auxiliary classes. Reading the model part is a good example: this class would generate an "input reader" auxiliary class (or maybe one for each of the model parts to be read?) and let it construct the model part. This would allow us to have different "mdpa reader", "hdf5 reader", "restart file reader" classes and so on to transparently manage different kinds of input format. Same goes for writing the output.
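A minimal sketch of that delegation (the reader classes are made-up names, not existing Kratos classes): the main class only talks to a common Read interface and never needs to know the input format.

    class MdpaReader(object):
        def Read(self, model_part, file_name):
            # parse the .mdpa file and fill the model part
            pass

    class Hdf5Reader(object):
        def Read(self, model_part, file_name):
            # fill the model part from an HDF5 file
            pass

    class RestartReader(object):
        def Read(self, model_part, file_name):
            # load a serialized model part from a restart file
            pass

    def CreateReader(input_type):
        """Map the input settings to a concrete reader."""
        readers = {"mdpa": MdpaReader, "hdf5": Hdf5Reader, "rest": RestartReader}
        return readers[input_type]()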
Maybe we could have an "old style" very minimal solver class to manage setting up the strategy only, if you want to keep the two levels, but I don't really know how interchangeable the solvers would be in this context. I would go for two levels only if the second level is really modular: if one "main script" always works with the same "solver" and that solver only works with that main script, there is no real reason to separate them.
(Sorry for the wall of text)
Thanks for the clear explanation, @jcotela. It is true that the python solver is usually a thin wrapper for the C++ strategy (except for those who have extended its capabilities). Now: is this a sufficient reason to keep it? Should this wrapper exist as an independent object just because it is a wrapper? For the @KratosMultiphysics/dem team, it is so thin that we could easily merge it with the main script. But I don't know about the others; maybe this wrapper (the python solver) does operations that must necessarily be in the wrapper.
My two cents:
I agree with @jcotela. In my opinion we can extend the solvers, add some new methods wrapping other methods.
We need to differentiate between the developers and the users. For the users, the unified class will be enough because they don't need anything else, but for developers it would be a complete loss to stop using the current main scripts, because they would lose the possibility of "playing" with the model and seeing what is happening or how Kratos works.
We are not very well documented; if we add a new black-box level we will become even more obscure, and that would be a big loss.
My preference would be for the one-layer design unless we find a serious advantage in the two-layer one.
> however, it's a good opportunity for me to learn about the Object Oriented philosophy and how to deal with these issues.
In theory, in OO design we should try to find the most self-contained objects with the least interdependency between them. In practice this is done by trial and error (refactoring).
Nevertheless, in our python scripting we have used a functional approach rather than an OO approach until now (passing model parts to all solvers and processes). This blurs the vision for the first iteration and would also impose strong backward compatibility problems. For this reason my suggestion would be to start by adding Run to the solver, which already has the IO. I should mention that starting with two classes is also a valid option, but I don't like the added layer.
Finally, even though I don't like the original idea of the solver as a factory layer, for that use I would make it really an abstract factory with very clear behavior, without mixing it with other steps (like redistancing) done in the middle.
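A minimal sketch of such a pure factory (the settings key and module layout are assumptions, not the existing Kratos API): its only job is to map the input settings to a concrete solver, with no side tasks mixed in.

    import importlib

    def CreateSolver(model, custom_settings):
        """Construct the solver named in the settings and nothing else."""
        solver_module = importlib.import_module(custom_settings["solver_type"])
        return solver_module.CreateSolver(model, custom_settings)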
Just to keep a record of it all, I would like to write some notes about the talks I had with some Kratos developers and the @KratosMultiphysics/technical-committee. Apparently, there's quite a bit of consensus about converting the main script into a class as a first step towards bigger changes. This first step is useful because it clearly prevents duplication of code when coupling different applications and writing tests. The biggest discussion is about whether or not to merge the main script with the solver.
Reasons heard for merging (A):
Reasons heard for keeping both objects (B):
Many of these bullet points can be refuted, either with a contrary opinion or with counter-examples, but still, they have to be considered and prioritized. My personal opinion goes for B, especially because of B1: I am used to this structure and I don't see a real need to make bigger changes now. Option A is cleaner, probably a better design, but it would lead to huge files with many heterogeneous operations, and it would be a bigger effort for the developers.
Hi @maceligueta, thanks for summarizing the discussion so far. I understand (but please correct me if I'm wrong) that, when you speak of solvers, you are thinking more along the lines of the "classical" solvers, not the new ones as in the FluidDynamics or the Structural application that have expanded responsibilities. Personally, given the current state of the code, I'd go for a single layer (A), because I can't see a clear division between the two objects, but I can see the benefit of (B) if the solver is as minimal as possible: then the "solver" is just an auxiliary class to set up and run an underlying C++ strategy.
One thing I believe will weigh in an eventual decision is to clarify a bit what would go where. For example, when you speak of coupling, are you thinking of coupling as a solver (possibly combining two "base" solvers: FSI solver, DEM + Fluid solver) or as a main script?
Thanks for your help, @jcotela. When I spoke of coupling, I meant two main scripts. That's what I would do (and did) when coupling different applications in the past. But it is true that some teams chose to insert the coupling within a new compound python solver, mainly because the main script was considered something more rigid and the solver a big object. So the discussion can be extended to 'where should the coupling between two applications live?'. I unconsciously chose the combination of main scripts, and wrote it. @RiccardoRossi would surely write a different thing.
+1 for the comments of @maceligueta @josep-m-carbonell @jcotela ... about delegating the read/write of the model part to a separate class. This also came up in the discussion at TUM. It would be beneficial to allow model part IO to vary independently of the solver/main script, IMO. The solvers would still need to say which variables/DOFs are needed. It would also be a move closer to the single responsibility principle.
I don't have a strong opinion about 1 layer vs 2 layers, except that it might be nice to have one (and only one) main script per application, but to allow more than one solver.
Last comment: it might be worth having all application main scripts derive from an abstract base class using the python module abc, to try to maintain a common interface for core functions which should be part of every main script.
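A minimal sketch of that base class, using the standard abc module (the class name is made up; the method names are borrowed from @maceligueta's example above):

    from abc import ABCMeta, abstractmethod

    class MainScriptBase(metaclass=ABCMeta):
        """Common interface every application main script would implement."""

        def Run(self):
            # the overall workflow is fixed here; applications fill in the pieces
            self.Initialize()
            self.RunMainTemporalLoop()
            self.Finalize()

        @abstractmethod
        def Initialize(self):
            pass

        @abstractmethod
        def RunMainTemporalLoop(self):
            pass

        @abstractmethod
        def Finalize(self):
            pass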
Hi guys, here go my personal observations:
I do believe that a two-object design with a "solver" and a "stage" is a better option. In my opinion the "solver" should be in charge of the solution OF A SINGLE TIME STEP, while the stage should be in charge of orchestrating the time looping etc. My point is that it should be easy to compose objects that know how to solve a time step, while I don't see a straightforward way of composing functions that internally do all the time looping.
Solver interface (as of now):

Solver:
    __init__(Model, RelevantParameters) -- should NOT allocate anything; only store Model and RelevantParameters and do the necessary validation
    AddVariables
    AddDofs
    GetMinimumBufferSize
    GetComputingModelPart
    ComputeDeltaTime
    ImportModelPart -- reads in the model
    Initialize -- leaves the strategy ready to be used - designed to be called ONCE
    Finalize -- terminates the strategy - designed to be called ONCE
    Clear -- wipes out the memory (for example resets the systems to zero)
    Solve -- internally calls the following functions (which can also be called separately; see the sketch below):
        Predict
        InitializeSolutionStep
        SolveSolutionStep
        FinalizeSolutionStep
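In python, that composition could look like this minimal sketch (placeholder bodies, not the actual Kratos implementation), keeping both the one-shot Solve and the individual substeps callable:

    class Solver(object):

        def Solve(self):
            # one-shot call composed from the four substeps below
            self.Predict()
            self.InitializeSolutionStep()
            self.SolveSolutionStep()
            self.FinalizeSolutionStep()

        def Predict(self): pass
        def InitializeSolutionStep(self): pass
        def SolveSolutionStep(self): pass
        def FinalizeSolutionStep(self): pass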
In my opinion the Stage interface should be something like:
Stage:
    __init__(Model, RelevantParameters) -- should NOT allocate anything; only store Model and RelevantParameters and do the necessary validation
    LoadRequiredApplications
    Initialize
    Run
    Finalize
    Restart
That is... the Stage is essentially a main. However, I foresee the main to look something like:
Main.py

    import KratosMultiphysics

    ## construct all the stages
    Model = KratosMultiphysics.Kernel().GetModel()  # if we want it to be somewhere else, more than fine
    list_of_stages = GenerateStages(Model, "ProjectParameters.json")  # calls internally the "LoadRequiredApplications"

    for stage in list_of_stages:
        stage.Initialize()
        stage.Run()
        stage.Finalize()
The grace of this design is that a given solver, say FluidDynamics (either FractionalStep or Monolithic), can be reused in defining all of the different stages. Of course this can only be done if the stage does not need to know internal details of the solver. This design forces such separation.
As a refinement of this design, we could define an object "Project" designed to live through the whole simulation. Model would then be an inner class of Project. The nice thing is that Project could store data, so that for example one could dynamically create stages in an optimization process and use the Project to store the optimization progress. In this sense I see an optimization loop as something like:
    import KratosMultiphysics

    ## construct all the stages
    Project = KratosMultiphysics.Project()

    opt = Optimizer(Project)  # let's imagine this exists

    while opt.IsConverged() == False:
        # here we define an "optimization epoch"
        opt_stages = opt.GenerateStages()  # note that Project is available to everyone

        for stage in opt_stages:  # here we run, possibly in parallel, through the different optimization scenarios
            stage.Initialize()
            stage.Run()
            stage.Finalize()

        opt.PostProcess()

    opt.Finalize()
Now note that this is not very different from @jcotela's proposal. The idea in that case is to merge stage and solver together, so that "Run" becomes a new method of the solver itself. My problem with this design is: done this way, one may want to only implement the "Run" method, without "Solve" and its substeps. If that happens, two solvers are definitely not composable...
Hi everyone. First of all, thanks @RiccardoRossi for your explanation of the Stages concept. We discussed it at the chair and like it!
We were having a discussion yesterday here at the chair about the original topic of this issue, and this is the summarized outcome (adding to @msandre's comment): in a nutshell, we would like to have the main script as a class, too. We would reuse our (already existing) testing script for this purpose, see commit 9512b531c96100875dfd22828496766617af5de6 for a draft. At the moment we already use something like this for FSI and for optimization, so having a main class would really assist us in our workflow.
Bringing it into the more global scope: what we mean when we talk about the Main-Class (the main script wrapped in a class) is in my eyes already what @RiccardoRossi means with a Stage!
Our draft already contains the functions that a Stage would also have; the same goes for the initial suggestion of @maceligueta. Furthermore, for coupling applications, one can use these scripts as a "black box" and call the corresponding functions. Therefore I think it makes absolute sense to wrap the main script in a class. This can be done by every application separately, as long as a common interface is followed (e.g. we could have the functions we use in our draft as a base class in the core, and every application derives from it). This also makes sense because different applications do it differently at the moment, see above, and I guess it would be difficult to find a common ground that easily. (=> one or two layers)
From the implementational point of view, I think it makes sense to also use this class for the execution of the tests. This way it would be ensured that it is always working and up to date.
Hi @philbucher, to my understanding what you did is very much in line with the suggestion of @maceligueta. To further refine your proposal, can I ask you to prefix the methods which are not intended to be public with an underscore?
For example, a public method would be:

    def Run(self):
        ...

but a private one (note that it begins with an underscore):

    def _RunMainTemporalLoop(self):
        ...
A short question: in the two-layer approach, where does the coupled composition go? Solver or stage?
For @RiccardoRossi, the coupling goes into a compound solver, and the stage does other things. As for me, I have been doing the coupling in a new type of stage (a combination of two main scripts or stages). The new type of stage calls methods of the solvers for a correct coupling.
Thanks @maceligueta for clarifying. IMO this would be a drawback of the two-layer approach, if people start mixing these compositions in different layers.
Solver. I don't see how you could compose two stages into one...
@RiccardoRossi I renamed the functions
@RiccardoRossi: we are using stage and main_script to talk about the same object. Obviously, you can compose two stages of the same kind (both dynamic, both static), understanding stages as 'main scripts' coming from different apps.
@miguelangel you can use two stages one after the other, but you cannot compose them to make an interaction solver, for example.
It would be an interaction stage, actually. I am not getting you...
@RiccardoRossi I also don't get it. How would you do e.g. FSI (or any other coupled simulation) in your approach?
@maceligueta think of the following situation: how would you implement the Run for all the possible combinations needed to build an FSI solver? (Imagine that the coupling is to be done identically for all of the solvers.) My proposal is to define a coupled solver, which constructs a fluid_solver and a structural_solver and does not know the details of how they work internally. Then the coupled solver itself defines a function "Solve", which is then used by a dynamic_stage which is external to the solver itself.
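A minimal sketch of such a coupled solver (the class name, constructor arguments and the fixed-point loop are illustrative, and the interface data mapping is only indicated by comments):

    class CoupledSolver(object):

        def __init__(self, fluid_solver, structural_solver):
            self.fluid_solver = fluid_solver
            self.structural_solver = structural_solver
            self.max_coupling_iterations = 10  # placeholder

        def InitializeSolutionStep(self):
            self.fluid_solver.InitializeSolutionStep()
            self.structural_solver.InitializeSolutionStep()

        def SolveSolutionStep(self):
            # the same loop works for any pair of solvers exposing the substep
            # interface; no internal details of either solver are needed
            for _ in range(self.max_coupling_iterations):
                self.structural_solver.SolveSolutionStep()
                # map interface displacements to the fluid side here
                self.fluid_solver.SolveSolutionStep()
                # map interface loads back to the structural side here

        def FinalizeSolutionStep(self):
            self.fluid_solver.FinalizeSolutionStep()
            self.structural_solver.FinalizeSolutionStep()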
@philbucher what I wrote hopefully also answers you...
Just to report what I understand as @jcotela's idea (Jordi, correct me if I got it wrong):
The Solver uses the "strategy" etc. as building blocks. It differs from the strategy in that it knows the physics.
Actually, if we import the applications here, the loading is correctly handled, so we don't need the LoadApplications method.
Solver:
    __init__(Model, RelevantParameters) -- should NOT allocate anything; only store Model and RelevantParameters and do the necessary validation
        - io --> to be implemented in the main class and eventually customized
        - time stepping --> also implemented in the base class and eventually customized
        - list of INPUT-DEFINED processes --> to be constructed from the input
        - other processes --> to be initialized internally as needed
    AddVariables
    AddDofs
    GetMinimumBufferSize
    GetComputingModelPart
    ComputeDeltaTime
    ImportModelPart -- reads in the model
    Initialize -- leaves the strategy ready to be used - designed to be called ONCE; here we leave everything ready for running, calling as needed:
        AddVariables
        ImportModelPart
        AddDofs
    Finalize -- terminates the strategy - designed to be called ONCE; here we wipe out the memory (for example reset the systems to zero)
    SolveStep -- actually it could be a good idea to take out the function Solve and only leave the inner functions:
        Predict
        InitializeSolutionStep
        SolveSolutionStep
        FinalizeSolutionStep
    Run -- solves the complete analysis
Then the main would look something like:
Main.py

    import KratosMultiphysics

    ## construct all the solvers
    Model = KratosMultiphysics.Kernel().GetModel()  # if we want it to be somewhere else, more than fine
    list_of_solvers = GenerateSolvers(Model, "ProjectParameters.json")  # internally loads the applications needed

    for solver in list_of_solvers:
        solver.Initialize()
        solver.Run()
        solver.Finalize()
@RiccardoRossi, I don't really see the problem with the coupling stage. You can always do:
    def Run(self):
        while time < final_time:
            structuresolver.Predict()
            fluidsolver.Predict()
            structuresolver.InitializeSolutionStep()
            fluidsolver.InitializeSolutionStep()
            structuresolver.SolveSolutionStep()
            # etc...
These operations, I presume, would be written in your coupling solver, as far as I understand. You have to call them somewhere, in the correct order. I don't get why you find it correct to put them in a 'coupling solver' but incorrect in a 'coupling stage'. For me they are equivalent. We are actually using the first approach currently.
Just a short comment in the line of this discussion. As @philbucher already mentioned, as far as we understood the stages concept, we liked it very much. However, what I personally found more and more confusing is the terminology.
Assume I am a user writing a main script where I want to define a process composed of a few different analyses (structure, fluid, geometry...) depending on and feeding each other in some arbitrary order. Following the discussion above, the different analyses within this main script would be objects within a "list_of_solvers". So effectively these analyses would be called solvers.
At the same time, every analysis would contain its own python solver, which controls the actual physical solution (say, e.g., the structural_mechanics_static_solver), for which it uses a solving strategy based on e.g. a ResidualBasedBlockBuilderAndSolver. So we have another two objects called solver.
On top of this, every physical solver probably requires an algorithm for the solution of the linear system, which within Kratos is again another instance called solver, i.e. the linear solver.
Finally, if the process which I as a user want to build up is e.g. representing an experimental FSI analysis, I would be inclined to call my entire process an FSI solver.
So eventually, I am left with four to five levels of solvers which really doesn't help the understanding, application or development of the code.
Having said this, I just feel it might help to be a bit more strict with the terminology and responsibility of the different layers. So if, on top of the linear solver and the physical solver, there is now something that runs the physical analysis (which I guess would correspond exactly to a stage), let's name it accordingly, e.g. StructuralAnalysis or StructuralAnalysisStage rather than StructureSolver. This makes it far easier to read and presents the code structure in a much more logical way.
Thanks, @dbaumgaertner for pointing this out. You are right that the word 'solver' is used for many different things. So, here is a list of how I see the terminology of the python files and objects:
Launcher script - Contains very few lines, and basically calls a Stage or a list of Stages. It might contain loops for launching cases from a list or iteratively. This Launcher does not exist in most of the applications, because the Stage is called directly.
Stage (or Main Script) - Contains the general operations of the case. It usually reads/writes, creates and calls processes, creates and calls the Strategies, contains the temporal loop, etc...
Strategy (commonly written as Solver) - A python object that wraps the C++ Strategy. Solves the equations. Lately, it has been taking on more operations, like reading the ModelPart.
The proposal of this issue was to convert all Main Scripts from all applications from plain scripts to classes. The discussion, however, has moved to other interesting topics.
I like that idea about the launcher script. It should be the one stored in the GUI, so the app developers who change python scripts would not need to update the GUI script.
First of all, I should say that I agree with the point @dbaumgaertner mentioned. Having many layers of solvers is very confusing, especially for beginners who start to use Kratos via python. So a better naming and distribution of tasks would help here. (Another option would be to reduce them.)
About the layers, I saw them slightly differently from @maceligueta: the stage handling operations like ApplyBoundaryCondition, MeshMovement, etc., and the solver wrapping the Scheme and BuilderAndSolver.
For me, a black-box coupling would go in the stage and a more cohesive one can be written as a solver, but I don't have a strong argument on this.
Hi @pooyan-dadvand @maceligueta, I have been thinking quite a lot about this. I am now swinging towards merging "Stage" and "Solver", enforcing however some kind of internal interface.
The rationale is that, for example, I have no way of making substepping possible through an API without the Stage needing to know the internals of the "solver".
At this point we could mount a hierarchy like the following (I'll call the merged class "StageSolver", even though I'll admit it is not a good name, as @dbaumgaertner observed). The idea is that this way we could define hierarchically an interface that fits the needs of the problem, like ApplyBoundaryConditions or whatever we consider needed.
This subdivision is of course arbitrary. What I am trying to describe is that nested levels will have to employ the interface required by the parent levels, as a sort of "protected API" (I would employ the abstractmethod decorator to enforce this in python). My idea is thus that the FEMSolver will define its IO in a certain way (which I hope will match for example for Fluid and Structure, but may be different for DEM).
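A minimal sketch of that nested "protected API" (illustrative names, not the final interface): each level fixes part of the workflow and declares, via abstractmethod, the hooks the next level must provide.

    from abc import ABCMeta, abstractmethod

    class StageSolver(metaclass=ABCMeta):

        def Run(self):  # public entry point, fixed at the top level
            self._ImportModelPart()
            self._SolveLoop()

        @abstractmethod
        def _ImportModelPart(self):
            pass

        @abstractmethod
        def _SolveLoop(self):
            pass

    class FEMSolver(StageSolver):

        def _ImportModelPart(self):
            # FEM-style IO defined once here, shared e.g. by Fluid and Structure
            pass

    class FluidSolver(FEMSolver):

        def _SolveLoop(self):
            # the concrete fluid time loop would go here
            pass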
More discussion about the names of the functions in #1711
PR for the BaseClass see #1780
An agreement has been reached. Closing.
In the last year, the @KratosMultiphysics/dem team and the @KratosMultiphysics/pfem team have converted their main scripts from plain scripts to python classes. The reason behind this is the possibility of deriving from the main script to modify the behavior of parts of it when coupling applications, avoiding any duplication of code; for instance, removing the GiD prints, or changing any functionality that is annoying, useless, or needs to be different when coupling with other applications. I opened this issue to encourage the rest of the @KratosMultiphysics/team-maintainers to do it as well, so others can use their scripts for coupling (recently I had to do it for the Fluid Dynamics application, but I made my own design for that). Maybe @RiccardoRossi would like to wait until the design of the 'computation stages' is ready, so this is open for discussion, but we are already coupling applications every day, and deriving the main scripts is a powerful tool that should be considered.