erdc / proteus

A computational methods and simulation toolkit
http://proteustoolkit.org
MIT License
88 stars 56 forks source link

What's the future of SpatialTools etc.? #428

Closed nerisaurus closed 6 years ago

nerisaurus commented 8 years ago

@cekees @tridelat @adimako

As I've been planning out updates to the air-water-vv repository, I've noticed a lot of new developments in:

interfaces (I've especially been paying attention to developments in #397, mostly because the new documentation is really helpful).

Along with some minor questions about the current implementation, my main ill-defined question is what do we want the code-flow of an ideal proteus problem to look like while it's using these tools and how can I help out? The air-water-vv repository I'm working with has a whole lot of code following dozens of different paradigms in proteus design code-flows. Most of them are rather ad-hoc (for understandable reasons), leading to tests which quickly become obsoleted and break down (see erdc-cm/air-water-vv#105 for illustration).

I have a variety of aforementioned questions and suggestions which I'll put in individual comments below for easier discussion. My suggestions are a lot of work that I don't want to force other people to do, but I'm mentioning them instead of working on them because I don't want to unilaterally make major changes to the functionality while there's so much simultaneous development of the tools, and while other people are still working out how to use the tools in projects.

nerisaurus commented 8 years ago

setDimensions seems to be doing too much

In individual Shape classes (in both proteus/SpatialTools and proteus/mprans/SpatialTools) like Rectangle or Tank3D, the convention appears to be that there is one core class method called setDimensions.

Aside from taking an input and setting it into self.dim for the shape, the method also creates all the shape's geometry: vertices, segments, facets, regions, boundary tags...

I feel this is rather unintuitive. The name setDimensions doesn't suggest that it will do this, and it ends up getting called by tons of methods in the class as they all rebuild the geometry after setting up their new parameters, with each one having to reset the dimensions when they do so (leading to the code repeatedly doing, basically, self.dim = dim = self.dim).

I think some of the calls to rebuild the geometry may be a bit excessive, but I can understand that the redundancy makes it easier for users to change things and not have to worry about whether or not they reconstructed the shape - and the construction even of a complicated shape is probably not expensive compared to what will happen afterwards.

If we did want to change the convention so that the geometry isn't rebuilt, I think it's a lot safer to move that logic out of the setDimensions method (I'd suggest constructShape) to make the process more intuitive (the user sets dimensions, sets all these other factors, then calls constructShape whenever they might need a finished geometry). Even if we keep the rebuilding, I think moving this logic to its own method would be valuable for later code comprehension.

In fact, I think it would even be advisable to break up the geometry logic into multiple sub-methods (vectors, regions, holes, etc.) in order to make testing changes and replacing methods in new subclasses easier. Then the constructShape or similar function

The way the code is currently set up doesn't break up easily, but the logic can be rewritten so that it does.

nerisaurus commented 8 years ago

Sponge Zones

A quick question - the sponge zones extend out of the tank now and that's final now, right?

nerisaurus commented 8 years ago

Odd Assertion Checks

Why does _checkNd (in the core proteus/SpatialTools shape) check if the array is self.nd or 3? This means that putting in a 2D array before correctly setting up the dimension of a 2D shape will throw an error, but putting in a 3D array before correcting setting up the dimension of a 3D shape will not. It also means that _checkListOfLists could throw an error if given a list starting with an array that _checkNd accepted - this feels wrong, unless there's a deeper reason for the check.

On a similar note, _checkListOfLists actually only checks if the first list in the list has the correct dimension. I'd prefer if it looped through the whole list, but maybe that could end up being too expensive? If so, I think the behavior of the check was mentioned in the docstring.

nerisaurus commented 8 years ago

Some Extensions

For the air-water-vv problems I've been working on, I've been making and planning to make a few subclasses with new features. I hope to make a series of subclasses of 2D tank for tanks with obstacles or weirs of (mostly) arbitrary shape, with some added functionality for:

Does anyone else have uses for these (and/or ideas of how they could be done much easier already)?

nerisaurus commented 8 years ago

BoundaryConditions from a Shape

What's the advised convention for using boundary conditions from a shape/domain (as set with BoundaryConditions) into _p files?

tridelat commented 8 years ago

@nerisaurus thanks for the feedback!

setDimensions seems to be doing too much

setDimensions only changes the coordinates of the vertices for simple shapes like Rectangle or Cuboid. For the tank classes, it will also change the segments/facets/regions/flags because those depend on whether or not sponge layers were defined before calling setDimensions. We could move the logic outside of the setDimensions and have a separated constructShape function if it helps.

SpongeZones

Yes, they extend out of the tank now and it should be final as it seems more intuitive to use and makes the implementation logic clearer.

Odd Assertion Checks

I think the idea behind the assertion of self.nd in _CheckNd was that it would be possible to use a shape classed as 2D (like a surface) in a 3D domain, but I have never used this. Also, even a surface in a 3D domain should be considered a 3D shape as it would still need (x, y, z) for vertices, so I think it is safe to remove len(array) == 3. For _checkListofLists, we could actually just check the shape of the numpy array instead of iterating through the whole list.

Some Extensions

A simpler way to set up Gauges through SpatialTools (calling Gauges.py) would be nice. New boundary conditions can be added to mprans/BoundaryConditions.pyx. Those modules are work in progress, so additions/improvements are welcome!

BoundaryConditions from a Shape

I will upload a test case using the latest pyx version of BoundaryContions on the air-water-vv repository when I get access to my proteus installation (later today hopefully)

tridelat commented 8 years ago

Concerning the general idea behind SpatialTools and BoundaryConditions is that they are being used to make it more intuitive to set up, read, and manage a test case (in the context of two-phase flow for the mprans versions). Also, boundary conditions that are commonly used in air-water-vv can be stored in mprans.BoundaryConditions (e.g. no-slip, free-slip) in order to be sure to use them consistently in the various air-water-vv test cases. It also reduces the need to go through all the *_p.py files when changing boundary conditions (which can become a complex process when many different BCs are defined).

Another addition that would be welcome at a later stage perhaps is grouping the numerical options per themes in a class for easier management, without altering the functionality and flexibility of Proteus. All of the parameters would still be modifiable through the Context module.

A very rough idea could be to have something like this in the end (where the squares could be a class): image

where, for example, global physical constants (gravity, viscosity, etc..) could be stored in a class that would be accessible from BoundaryConditions or other modules...

tridelat commented 8 years ago

I pushed a new branch with an example of a generic tank using SpatialTools and the latest BoundaryConditions.pyx there: https://github.com/erdc-cm/air-water-vv/tree/tridelat/pyx_wave_tank/2d/wave_tank

Another example using a tank and a floating body can be found here (not using the .pyx versions, works with current proteus master branch): https://github.com/erdc-cm/air-water-vv/tree/tridelat/caisson_oscillation/2d/floating_caisson

nerisaurus commented 8 years ago

Thanks for the responses, @tridelat .

setDimensions

I do think separating out the logic to constructShape would be a good idea. I don't have a really solid argument for why you should do that bit of extra work, but I think it makes the code flow more intuitively and it could be easier to change how the code works out later on (maybe sometimes we need to set or reset self.dim and we don't want to rebuild all the segments each time - or perhaps we want the user to be able to constructShape from outside the object without having to pull out the object's dimensions to re-pass into it - or we just might run into shapes where we don't want to construct the shape or set the dimensions more times than we have to because they become a little expensive).

Extensions

I'll work on a Gauge setup for my problem specific tank class at the moment, and if I can get it working I'll bring the methods up to ShapeRANS (well, it might require me to start from there too, as I think I'd want to crack into the function assembleAuxiliaryVariables a little bit)

When I mention special boundary conditions, I mean being able to specify particular segments as having a different boundary condition than normally would be applied (e.g. an air vent is on a surface that would usually get the "y-" treatment - probably free slip - but instead gets some sort of incoming flow) - I haven't run into any issues where I'd need more than is already in BoundaryConditions yet.

BoundaryCondition examples

Thanks for the examples - I think I understand it now. This is just piped through from proteus.default_p import *? And is the only difference for the cython ones the.init_cython()` at the end of each of the calls?

I notice that these examples, like my case, have redist_p and ls_p which aren't covered by the setup in BoundaryConditions. While your examples have trivial boundary conditions there, I at least have a simple but not trivial boundary condition in ls_p (specifically:

def getDBC_ls(x,flag):
    if flag == boundaryTags['right']:
        return lambda x,t: x[1] - waterLine_z
    elif flag == boundaryTags['left']:
        return lambda x,t: x[1] - waterLine_z

) Are these properties too niche or specific for inclusion in BoundaryConditions, or just not a priority at the moment since they remain fairly simple?

Closing thoughts:

I think you have a good thing going in the new plans - a lot of the issues I've been finding in the old style of things get addressed very nicely. I'll (probably very slowly) be poking around at adding a few additions here and there as I get them working in my specific cases. Probably Gauges, maybe some regional boundaries, although that has other issues. I can see about some of the other class structures you mention above if I run into good uses for them, although as someone with little prior experience in CFD I'm not always very confident that my selections of variables are complete and comprehensive for a particular concept (like picking only and all potentially relevant physical constants) - and if you have any more suggestions about things that could be built up easily I'd be glad to take on distractions from my main tasks (it keeps me from getting frustrated when the main task isn't working).

Footnote: Regional Boundaries

"regional boundaries" is perhaps where the diagram above breaks down a little - at least from my current plans. The idea of this system is that the user sometimes wants to designate specific areas to get different treatment.

One such treatment is sponge layers, which are already taken care of here. Another use that I have from old air-water-vv tests is a variable meshing scheme, where they want to put a more refined mesh in a region of interest.

This requires me to be able to specifically tag regions based on nonphysical boundaries, which may or may not intersect with various physical boundaries. I have an algorithm for this (at least in 1 dimensional region splitting - 2D or 3D regions would bring in much more complicated algorithms because polygon overlap is nontrivial), but it has forced me to abandon the Many Shapes / One Domain principle that the above diagram uses (replacing it with One Shape / One Domain).

Thus my Weir Tank class is based around making several different distinct geometries within one shape, because I need access to all of them to build in regional boundaries that don't have segments crossing any shapes (because I don't think the triangulation can handle that responsibly).

I feel this is a little suboptimal, as just building from simple shapes seems really intuitive, but I think to protect the triangulation (I'm also worried about putting shapes which share a side) any fix would have to be in the Domain setup, and would probably involve polygon intersection (since we'd be trying to see where several at-least-2D shapes touched). Maybe I'm missing an easier technique, or maybe I'm too worried about the robustness of the triangulation code.

tridelat commented 8 years ago

For the pyx version of BoundaryConditions, the following loop was added in the _so file because some boundary conditions rely on numerical parameters set in the main (context) file:

for bc in ct.domain.bc:
    bc.getContext()

(just realised that maybe it could be a class method instead a loop here, but it is called only once so it is not costly) For example, relaxation zones require a variable from the context called he (characteristic mesh element size around the free surface) and ecH (called epsFact_consrv_heaviside in most test cases). This is kind of a hack for now, it would be nicer to have a numerical parameters dictionary consistent across all two-phase flow test cases. Also, with the current pyx version, if you want to set a custom boundary condition function, it is done by setting the uOfXT variable as follows (in the main .py file):

myShape.BC['mytag1'].u_diffusive.uOfXT = lambda x, t: 0.
myShape.BC['mytag1'].vof_dirichlet.uOfXT = lambda x, t: 1.

redist and ls could be added to the BC_RANS class for convenience if they need to be set for some test cases.

Concerning SpatialTools, duplicate vertices are removed automatically when calling assembleDomain, but not duplicate segments. It can still be used to make a domain with closed geometry by not defining a shared segment of one the shapes, like in @Giovanni-Cozzuto-1989 case here (even though there is a modified version of SpatialTools in this folder, this functionality is available in the Proteus SpatialTools) https://github.com/erdc-cm/air-water-vv/blob/46d5e221de3929b49621987d8a6e88764bcc3cdd/2d/Waves_movingCaisson/tank.py (the vertices have to be exactly the same, so it is something to look out for when they come from different calculations). Removing duplicate segments could be a nice addition, but might be costly (maybe there are external python packages we could use to do such things?)

The several shape approach is useful when you have one or more separated bodies (like a caisson or several floating bodies imported from STL files for example), and can be used to assign specific roles to specific shapes (like “converting” it to a rigid body). For tanks that have a rather unique geometry, you can use the CustomShape class (if you define only one shape, it is essentially your domain), and use BoundaryConditions through it.

About the regional boundaries, is there an example of what you are trying to achieve? Gmsh is very nice for refining regions of the domain, and its implementation within Proteus should be finalised soon.

nerisaurus commented 8 years ago

An example of these regional boundaries, and the one that I've specifically start to try to convert and generalize, is in 2D/broad_crested_weir_VM .

It manually defines 4 regions and adjusts the regionConstraints in each region to be different (so that it can have a very fine mesh over the weir itself). The broad_crested_weir_AV case also does this, although it primarily adds in additional regions so as to add in sponge layers. It seems like it tested putting in different mesh size constraints over different regions (not all the boundaries are for sponge layers), but decided against it later.

That sort of illustrates why defining it manually (in the new setup it would save some work by calling CustomShape after setting up its form) isn't a great solution. Whenever a small detail is changed, all the vertex, segment, and region logic must be rewritten and hand-calculated - and might require changing the choice of shape class entirely.

I want to be able to do this automatically, so I'm writing code into my new Shape based class that lets the user give any reasonable (i.e. within tank bounds) list of x coordinates between regions and has the method add in vertices and segments to separate these regions (I can only do this in a single dimension at once for now because 2D shape collision is orders of magnitude more complicated). When the imaginary region boundary line would intersect a segment, it places a vertex at he intersection point and replaces the segment with new connections around it. It then builds segments inside the used area of the domain (and not in the empty open shapes, because that could make it wrongfully remove holes) and is able to place regional attributes accordingly.

The problem is that it now needs to know all the segments in the domain which it could run into, which for my current design means the shape has to contain all shapes (and I have ways to add in relatively complicated geometry fairly easily, but it's duplicating what could be done with multiple shapes - and my problems don't care about moving obstacles, which could make it more complicated).

So the major issue is the problem of intersecting segments, specifically between the "real" geometry of the tank/obstacles and the "imaginary" geometry that divides the domain into different regions. My technique to solve this problem needs to know more information than one shape alone could. (but maybe the Gmsh changes will fix this - I'll look more into that now)

tridelat commented 8 years ago

@nerisaurus could the intersection points / new segments be added after (or in) the assembleDomain function (or _assembleGeometry)? Once this function is called, the geometrical information of all shapes is contained within the domain class instance. In any case, gmsh would probably be more convenient for refining regions as it does not need to have new segments dividing the domain. You can for example define a box with (x, y, z) coordinates where the mesh will be refined, and this box can be overlapping different regions/segments without problems. Functions can also be used to make the refinement transition smoother

Here is an example of mesh refinement using gmsh with 3 regions, and a floating body. image

nerisaurus commented 8 years ago

@tridelat , theoretically the intersections/segments could be added directly "inside" the domain's logic itself. Several issues came up, though. First, carrying over that information into the domain is a tiny bit more complicated [1].

As a larger issue, I was using various tricks to identify particular segments in the grid [2]. Looking back at this again after your comment, I notice that due to other reasons I had to abandon most of those tricks and generalize the algorithm... so it probably now wouldn't be a problem.

I also was also aiming to avoid making any major changes directly to SpatialTools... but again, other parts of the problem (specifically, Gauges) forced me to abandon that, so that's not a problem.

There still would be an issue of carrying the information into assembleDomain. It must be called after or inside of _assembleGeometry, so either:

  1. The information about these regional boundaries is sneaked inside the domain beforehand and stored away somewhere (including potentially being added as a "shape" with some special marker).
  2. The information is passed into the assembleDomain function (as an optional argument).
  3. assembleDomain is called twice, with a region constructor function in the middle.

I think 2 is the best of those (3. is cumbersome [3] and 1. feels less intuitive to me). However, Gmsh sounds like it might make that decision irrelevant. Do we have ways to use it from functions and/or do we consider it easy enough to use that we don't need to support shapes editable by context options (because the user should just build their own in Gmsh and then just put in the name of the file to import as the context option), though?

I think my current plan, because the re-segmenting code is tricky to replant and this subproject has taken awhile already, will be to push through with my complicated special shape class until it's fully presentable and push the associated changes into master (both in proteus and air-water-vv) so that it can get merged into all the other changes - then I'll return to move the region logic into the domain (or to nowhere, if Gmsh can handle everything).

[1] More checks (or rather, more checks that aren't already built and used for something else) needed to make sure things match up.

[2] And I would have lost the contextual information needed to detect these segments after I dumped all the shapes into the domain.

[3] And unlike most of the repeated function calls I'm uncomfortable with in these modules, this one might actually be a little bit computationally costly.

tridelat commented 8 years ago

Options 2 seems best. But yes, gmsh allows to make it in a much easier way.

A lot can be done in gmsh but it might not be ideal for building complicated geometries and the Context options to modify shapes are a must I think, as it makes it easy to modify a test case. Maybe the best would be to have both options implemented (importing a geo file, or creating one through proteus). Also, keeping a python/proteus interface is convenient for auxiliary variables like relaxation zones or rigid bodies (that will most probably be done through another library: Chrono) and even boundary conditions that depend on flags. By importing a geo file directly, it might become cumbersome to link the facets/segments to the auxiliary variables, especially if there are major changes possible within a test case (such as adding/removing an absorption zone).

I was thinking about having mesh options that could be called through functions like setRefinementBox, setRefinementFunction, refineFacets, etc. and that would write the relevant lines in the geo file. I have used such functions (an old (unclean) version here https://github.com/erdc-cm/air-water-vv/blob/tridelat/gmsh_mesh_control/2d/gmsh_absorption_coarsening/tank2D.py) and it is easy to modify through the context options without having to manually modify the geo file.

I am not entirely sure about what is the route to take on the gmsh/proteus/python interface implementation, maybe @cekees can advise?

nerisaurus commented 8 years ago

If refineBox (and setRefinementFunction too) really works and handles intersecting polygons well, then I'm all for using Gmsh for that application. As far as I understand it, the process would be:

Python Pipeline

  1. Collect the user's input from Context options.
  2. Use SpatialTools and Domain to collect all the information and intelligently attach tags for non-mesh related information (boundary conditions, gauges, relaxation zones, and so on - tags we want to have around to detect later)
  3. Apply further tags for Gmsh support using functions like refineBox
  4. Call Gmsh to build the domain.
  5. All the rest of Proteus.

Semi-Python Pipeline

  1. Build a custom domain with Gmsh (however the user prefers to do that)
  2. Potentially add some useful tags (for boundary conditions etc.) along the way while using Gmsh or using some other pre-processing utility.
  3. Collect the user's file from Context options (which would override the normal generation and ignore all the mesh based commands in the options)
  4. Potentially detect and place some tags with code that reads bits of the mesh.
  5. Apply further gmsh related tags (like refineBox) and call gmsh to adapt the mesh if needed.
  6. All the rest of Proteus.

I didn't put question marks in those lists, but there's definitely a lot of points which seem a bit up in the air. It does seem like something to discuss with @cekees and get the design of the pipelines settled on. Maybe something to bring up at the next meeting?

(by the way, this Chrono? Or some other one?)

tridelat commented 8 years ago

refineBox and setRefinementFunction really works, I have tested it today by refining around the free surface and a floating caisson, converted the .msh files to Triangle files, and Proteus happily ran the simulation.

Python Pipeline

This is pretty much implemented now, and it works the way you described it. I pushed an early version (https://github.com/erdc-cm/proteus/pull/432). The only thing we would need to think about is how to build the interface for refinement options. I find this option more attractive as it allows the user to stay in the main file: the more pythonic, the better! Also, a set of generic functions could be used for refinement using different meshing software (gmsh, tetgen, triangle, etc.) when writing in the relevant files (.geo, .poly, etc.)

Semi-Python Pipeline

So here, assuming the user defined all its geometry in gmsh, the (ASCII) .geo file could be read through python and imported into a Domain class instance. The latest versions of gmsh allow the naming of "Physical Entities" like this (http://gmsh.info/doc/texinfo/gmsh.html#t1_002egeo):

Physical Line("my_tag (1)") = {3, 4, 5}  # flag is between brackets

BoundaryCondition instances could still be generated by reading the file (1 per unique flag). I see 2 possibilities from here:

A last option would be to define the geometry through python with Domain/SpatialTools, a .geo file is generated with all the spatial information. The user could then manually appends its refinement options to the .geo file. (this appoach is like the pythonic approach but without an interface for refinement and would still suffer from geometrical changes).

nerisaurus commented 8 years ago

So for the semi-python pipeline, I think having named physical entities is probably the most Gmsh-ic way to do things. To avoid forcing the user to guess tag names, since they're going to have to touch python anyway, I think it might be best to let the user set labels.

A user could then name a physical entity "RigidBoddy" and then in their main code tell a hypothetical gmsh enabled Domain (or potentially Shape, depending on which is better to extend to pick up such an object) object to setRigidBody(entity_name = 'RigidBoddy'). The user then defines all the tag names, so they aren't stuck trying to figure out the magic name we gave it, and all the python code stays in the main file with short, straightforward functions.

Since it does fail on typos, it should probably collect all the physical entity names in the .geo file. Then it can check if the entered name is in the list and raise an error if it isn't. The error message can include the list of all known physical entity names, so that the user isn't lost as to where they made the typo.

The fully python pipeline is a lot nicer, I'll agree, and should be the suggested pipeline. I'll build around that as the goal.

tridelat commented 6 years ago

Closing this issue, see #801 and #806 for the implementation of a 2 phase flow problem class that implements a similar approach as discussed above (similar to the diagram drawn above)