Open johnomotani opened 4 years ago
Help wanted!
Unfortunately I don't see myself having any time to work on this any time soon. I'd be very happy to review some PRs though!
I'm definitely in favour of trying to reduce the amount of code in Coordinates
-- it's quite complicated now!
I'm broadly in favour of your proposal, though I wonder if it might make sense to have an option to fall back to copying the last interior grid points into the boundaries? This would let you do something with a grid file without boundaries.
If the tool can take a grid file that works with BOUT++ v4.x and make it work with v5.x with minimal fuss, something like hypnotoad --to-v5 old_grid.nc
, then that makes it much less painful for users.
Essentially just moving the code to hypnotoad
seems like a good first step that does that conversion. I also don't have much time, but I will see if I can bash that out at some point.
it might make sense to have an option to fall back to copying the last interior grid points into the boundaries? This would let you do something with a grid file without boundaries.
Sounds like a good idea, especially for compatibility with existing tests/examples. I'd be in favour of the default option being to raise an error, so the user has to explicitly set the boundary cells this way.
Agree the construction of Coordinates needs tidying up. One worry is that PR #2025 (3D metrics) also makes changes to Coordinates, so merging might be tricky...
To avoid breaking too many things, perhaps we can make all members of Coordinates
const
, so they can't be changed in inconsistent ways. This also means that the constructor can't really do anything except initialise the values.
To create a Coordinates, we could have classes for the separate pieces. It would be nice if these were grouped together in ways which corresponded to mathematical objects:
MeshSpacing
with dx
, dy
and dz
, along with their derivatives (d1_dx, d1_dy)ContravariantMetric
with g11, g22, ...CovariantMetric
with g_11, g_22, ...DerivedMetrics
or maybe ConnectionCoefficients
with G1_11
, ..., G1
, G2
J
and Bxy
. Perhaps also in DerivedMetrics
.This breaks up the construction, which is currently one giant function, into separate pieces. Different combinations of these could be passed to the Coordinates
constructor, e.g. if CovariantMetric
isn't given then construct it from ContravariantMetric
.
Merging #2025 would solve that issue :-)
For the coordinates it might be easier to move everything to a pre-processing script, that writes out all the required quantities.
That way BOUT++ can concentrate on solving the differential equations, and the preprocessing tool can guess what the user might want to use - or maybe even ask in an interactive fashion. Also it wouldn't need to work in parallel, which would simplify some things (communicate before the mesh is initialised is as far as I remember a bit tricky)
I'm very much on-board with splitting it up into smaller sub-objects.
Rather than making the members const
, I'd prefer to have getters/setters that can make sure everything is consistent. So calling coords->setContravariantMetric(contravariant_meric)
would recalculate the covariant metric. const
members make objects not behave terribly well in std containers for instance.
Getters/setters also allow us to put off calculation of some of the geometry until the first time we need it. For instance, G1_11
etc are not needed in many simulations, so we could just not calculate them until the first coords->getConnectionCoefficients()
A preprocessing tool would be nice, but not as a replacement for doing it all inside BOUT++.
const
members would break things like assignment operators. It would avoid breaking code that uses coords->g_22
but I suppose if we're breaking things then getters would be ok. Calculating things only on demand is a nice benefit.coords->setContravariantMetric(contravariant_meric)
or
coords = coords->withContravariantMetric(contravariant_meric)
?This Coordinates
stuff is painful. As a plan of attack, how about:
next
(PR #2167) next
with the new tests into 3D coordinates branch (PR #2025), and make sure that we're not breaking anything. If all is ok, add more tests and merge #2025 into next
. Coordinates
into separate objects
To normalise the metrics and derived quantities I think only needs:
The normalisation of lengthThe normalisation of dx, dy and dz
Just a comment, I suspect it shouldn't make any difference in the end, but in STORM we actually normalise the metric not just with lengths - we normalise 'x' with something like B0/(rho_s**2)
and then that feeds through the metrics... I'm not opposed to converting to just using lengths (that could be more natural anyway), but to keep doing what we're currently doing we'd need the option to do a more general normalisation.
Yes I think that's quite common, to have dx
a poloidal flux. I think that could be handled ok. We just need to be able to derive the normalisation of metrics. We know that
length^2 = g_ij dx^i dx^j
so by giving the normalisation of length, together with the normalisation of dx, dy, dz, all other things can (I think) be normalised consistently. Typically dy and dz are not normalised (since they're angles) so the normalisation might be:
coords->applyNormalisation(rho_s, B0 * rho_s**2, 1.0, 1.0) // length, dx, dy, dz
Running into issues around staggered coordinates, and calls to Mesh::recalculateStaggeredCoordinates()
segfaulting (coordinates null in Mesh
). Rather than having Mesh
holding the different staggered Coordinates, and having to have this callback to recalculate, would it make sense to have the staggered versions of the metrics inside Coordinates
, so they can be invalidated and recalculated?
Coordinates
from a Mesh
(or Options
), if data for the staggered metrics is included, then create them. If data is not included, then only calculate them on request.metric = coords->getCovariantMetric(CELL_YLOW);
Rather than having Mesh holding the different staggered Coordinates, and having to have this callback to recalculate, would it make sense to have the staggered versions of the metrics inside Coordinates, so they can be invalidated and recalculated?
I think the original motivation for the callback was that depending on what the user is doing we might or might not need to invalidate the staggered coordinates: if I have only CELL_CENTRE metrics in the grid file, then it makes sense to normalise them and then (re-)calculate everything else; on the other hand if I had the staggered metrics, then I probably want to normalise both the CELL_CENTRE and staggered coordinates, so don't want the staggered ones to be re-calculated.
If moving the staggered coordinates inside the 'standard' CELL_CENTRE ones simplifies this logic then it could be a good idea. Possible con is it adds a layer to the object hierarchy (mesh->coords_centre->staggered_coords instead of just mesh->coords). If we go in the end with the design I suggested at the top, where staggered and unstaggered are both always read from the grid then staggered Coordinates
don't depend on the unstaggered ones - we get rid of all the code that interpolates/extrapolates - and then the current std::vector<Coordinates>
in Mesh
would seem like the natural thing to do. @bendudson is the refactoring you're doing urgent, or could we do something like: 1) merge 3d metrics; 2) simplify the Coordinates
constructor by requiring more stuff to be calculated/saved by the grid generator; 3) refactor into more modular classes?
Hi @johnomotani , no it's not really urgent. While fixing code for J < 0
coordinates the urge to tidy became irresistible. I think I agree with your proposal (and @dschwoerer about pre-processing): If staggered grids are used, then the grid file input should have them. The more I look at the possible inter-dependencies and special cases, the more tricky it looks.
Some questions:
Certainly merging 3d coordinates should be first, or merging later will be awful.
Definitely agree with simplifying the constructor. I'd like to have ways of making coordinates without a grid file, if the user of the code (or unit test) passes in the pieces needed.
Ooh, or to simplify Mesh
as much as possible, we could have a very lightweight class to group together the staggered and unstaggered Coordinates
, like a CoordinatesContainer
. I'm thinking of a case where we don't allow interpolating staggered from unstaggered, so basically the only things that need doing are getting the Coordinates
(at a given location) or normalising the coordinates. So the CoordinatesContainer
could provide a getter and a method that applies normalisation to all the Coordinates
. Something like:
class Mesh {
...
Coordinates& getCoordinates(CELL_LOC location = CELL_CENTRE) {
return coords_container.get(location);
}
void normaliseCoordinates(BoutReal length_norm, BoutReal dx_norm = 1.0, BoutReal dy_norm = 1.0, BoutReal dz_norm = 1.0) {
coords_container.normalise(length_norm, dx_norm, dy_norm, dz_norm);
}
CoordinatesContainer coords_container;
}
class CoordinatesContainer{
public:
CoordinatesContainer(Options& grid_options) {
addCoordinates(CELL_CENTRE, grid_options);
addCoordinates(CELL_XLOW, grid_options);
addCoordinates(CELL_YLOW, grid_options);
addCoordinates(CELL_ZLOW, grid_options);
}
void normalise(BoutReal length_norm, BoutReal dx_norm, BoutReal dy_norm, BoutReal dz_norm) {
for (auto &coords : coords_map) {
coords.normalise(length_norm, dx_norm, dy_norm, dz_norm);
}
}
Coordinates& get(CELL_LOC location) {
# if CHECK > 1
return coords_map.at(location);
# else
return coords_map[location];
# endif
}
private:
/// Constructs and adds Coordinates at location if variables for location are
/// present in grid_options.
addCoordinates(CELL_LOC location, Options &grid_options);
std::map<CELL_LOC, Coordinates> coords_map;
}
I'd like to have ways of making coordinates without a grid file, if the user of the code (or unit test) passes in the pieces needed.
If we do like @ZedThree suggested and open grid files as an Options
, then it'd be easy to get them from the input file instead. Keep a separate constructor to set all the fields explicitly for unit tests, like Coordinates
already have.
If the Coordinates(Options&)
constructor just pulls out the relevant bits and delegates to the constructor that takes everything, this allows us to keep just one "main" constructor.
I'd probably recommend dropping Mesh::normaliseCoordinates
from that design, but otherwise sounds pretty sensible to me.
It would be really nice to be able to get rid of all the interpolation/extrapolation stuff, but I do have a small worry that removing it would make using staggered grids significantly harder. What if we pulled it out of Coordinates
into a free function? Perhaps also only allowing going from CELL_CENTRE
to staggered, and not vice versa?
It would be really nice to be able to get rid of all the interpolation/extrapolation stuff, but I do have a small worry that removing it would make using staggered grids significantly harder. What if we pulled it out of Coordinates into a free function? Perhaps also only allowing going from CELL_CENTRE to staggered, and not vice versa?
:+1:
I was thinking that with either hypnotoad2 grid files (it'll be easy to add CELL_XLOW output) or input file expressions (which can already be evaluated at any location) there's no problem with support, but I'd agree that a free function that interpolates/extrapolates from CELL_CENTRE coordinates to staggered would be nice to have. If it's just something users can call if they want it, we keep all the complexity out of the constructors, and I think the free function should be simpler that what we currently have because it'll know exactly what it needs to do (so few branches to consider).
I'd probably recommend dropping Mesh::normaliseCoordinates from that design, but otherwise sounds pretty sensible to me.
How would you handle normalising Coordinates
objects at all locations? I'm trying to avoid the need for user code like
mesh->getCoordinates(CELL_CENTRE).normalise(rho_s, B0/SQ(rho_s));
mesh->getCoordinates(CELL_YLOW).normalise(rho_s, B0/SQ(rho_s));
mesh->getCoordinates(CELL_XLOW).normalise(rho_s, B0/SQ(rho_s));
mesh->getCoordinates(CELL_ZLOW).normalise(rho_s, B0/SQ(rho_s));
We could add a getter for coords_container
, but if normalise()
is the only method we ever need to call through it, it seems simpler to add Mesh::normaliseCoordinates()
instead of the getter. It's true that would require us to be sure CoordinatesContainer
never gets more functionality - I wouldn't want to add 5 new methods to Mesh
!
Another thought - I put member variables and references rather than shared/unique pointers above, but that doesn't allow for staggered Coordinates
not being created if they're not needed, so I think I'd want CoordinatesContainer::coords_map
to be a std::map<CELL_LOC, std::unique_ptr<Coordinates>>
instead (unique_ptr
because the CoordinatesContainer
should always own the Coordinates
that it contains).
If the Coordinates(Options&) constructor just pulls out the relevant bits and delegates to the constructor that takes everything, this allows us to keep just one "main" constructor.
That 'takes everything' constructor would have to have a lot of arguments (4x as many as the Coordinates
one, to have a field for each location)! Maybe an alternative would be to delegate everything to the Coordinates
constructors. We could have a constructor like the following to allow constructing the Coordinates
explicitly and then pass them into the CoordinatesContainer
CoordinatesContainer(std::unique_ptr<Coordinates> centre, std::unique_ptr<Coordinates> xlow, std::unique_ptr<Coordinates> ylow, std::unique_ptr<Coordinates> zlow) {
if (centre != nullptr) {
coords_map[CELL_CENTRE] = centre;
}
if (xlow != nullptr) {
coords_map[CELL_XLOW] = xlow;
}
if (ylow != nullptr) {
coords_map[CELL_YLOW] = ylow;
}
if (zlow != nullptr) {
coords_map[CELL_ZLOW] = zlow;
}
}
How would you handle normalising
Coordinates
objects at all locations? I'm trying to avoid the need for user code likemesh->getCoordinates(CELL_CENTRE).normalise(rho_s, B0/SQ(rho_s)); mesh->getCoordinates(CELL_YLOW).normalise(rho_s, B0/SQ(rho_s)); mesh->getCoordinates(CELL_XLOW).normalise(rho_s, B0/SQ(rho_s)); mesh->getCoordinates(CELL_ZLOW).normalise(rho_s, B0/SQ(rho_s));
We could add a getter for
coords_container
, but ifnormalise()
is the only method we ever need to call through it, it seems simpler to addMesh::normaliseCoordinates()
instead of the getter. It's true that would require us to be sureCoordinatesContainer
never gets more functionality - I wouldn't want to add 5 new methods toMesh
!
Good point! I'm just wary about adding methods to Mesh
, everything tends to end up in there :) Without Mesh::normaliseCoordinates
, we'd still need Mesh::getCoordinatesContainer
, so no gain there -- unless Mesh::coordinates_container
was just a public member?
Another option would be for Coordinates
to have a pointer to its parent container, and have Coordinates::normaliseAll
or something. I'm not sure that's a better option!
CoordinatesContainer
having a map is a good idea -- maybe even just std::array
and indexed with the enum? Not 100% sure that's any better either.
unique_ptr
is probably not needed, just store a value? CoordinatesContainer
then still owns them and clean them up after, and we'd need to either move or copy the Coordinates
in, which is not a massive burden.
unique_ptr
would be useful if we want to store derived classes and have CoordinatesContainer
own them
Some thoughts on constructing Coordinates
: we can't get around the fact that calculating the connection coefficients requires taking derivatives of other Coordinates
quantities, so we have some inescapable self-reference/circular dependency. I can see three different ways of dealing with this:
Two-phase construction. After 3D metrics goes in, this is what we'll have. Calculate the troublesome quantities in a separate function call after the main constructor. This makes constructing a Coordinates
a little difficult, or at least has the potential for bugs. We can also mitigate this somewhat by providing a function to create a Coordinates
, currently done with Mesh::getCoordinates/getCoordinatesSmart
. We could also provide a free function for the most typical use-case of constructing tokamak/BOUT++ coordinates.
Shims/proxies. This is what we currently use, until 3D metrics goes in. We reimplement the functions that would call getCoordinates
during the Coordinates
constructor as member functions of Coordinates
. Lots of code duplication.
Caching results in getters. We only calculate things on the first call to the getter, and deallocate dependencies on calls to setters. This avoids the issue of needing to call multiple functions to get Coordinates
in a usable state, at the cost of needing mutable
members in order to set them in the const
getters.
The last option appeals to me the most, so that Coordinates
is a "normal" type that can be constructed and be in a usable state immediately. On the other hand, we don't really expect users to be calling the constructor themselves, and we already have the wrapper function, so option 1 is definitely not unreasonable.
Another design choice is to do with coordinate values. For non-uniform boundary conditions, as well as interfacing with external codes, it will be very useful to have fields with the x, y, z coordinates. For the boundaries, the coordinates in the same coordinate system should be sufficient, but for external codes we may prefer to have lab-space/Cartesian coordinates as well.
We've talked several times about caching some derived quantities like 1. / dx
, I've hacked this in and looked just at blob2d: 100 timesteps on 16 cores we could save 5-10%. There are also further quantities like sqrt(g*)
that are used in a fair few places that might also be worth caching. This is probably enough savings that it's worth investigating further.
Here's my proposal for a Coordinates
redesign:
class Coordinates {
struct GridSpacing {
Field2D dx, dy; ///< Mesh spacing in x and y
BoutReal dz; ///< Mesh spacing in Z
};
struct InverseGridSpacing {
Field2D one_over_dx, one_over_dy;
BoutReal one_over_dz;
};
/// Also sets inverse grid spacing
void setGridSpacing(const GridSpacing& spacing);
const GridSpacing& getGridSpacing() const;
const InverseGridSpacing& getInverseGridSpacing() const;
/// Contravariant metric tensor (g^{ij})
struct ContravariantMetric {
Field2D g11, g22, g33, g12, g13, g23;
};
/// Covariant metric tensor
struct CovariantMetric {
Field2D g_11, g_22, g_33, g_12, g_13, g_23;
};
/// Choice of either immediately calculating the other metric,
/// or invalidating it to be calculated and cached by getter.
///
/// Always invalidate the Christoffel symbols
void setContraMetric(const ContravariantMetric& metric);
void setCoMetric(const CovariantMetric& metric);
/// Set both metrics immediately
void setAllMetric(const ContravariantMetric& contra_metric_in,
const CovariantMetric& co_metric_in);
const ContravariantMetric& getContraMetric() const;
const CovariantMetric& getCoMetric() const;
/// Christoffel symbol of the second kind (connection coefficients)
struct Christoffel {
Field2D G1_11, G1_22, G1_33, G1_12, G1_13, G1_23;
Field2D G2_11, G2_22, G2_33, G2_12, G2_13, G2_23;
Field2D G3_11, G3_22, G3_33, G3_12, G3_13, G3_23;
Field2D G1, G2, G3;
};
/// Return the Christoffel symbols, calculating them if they don't exist
const Christoffel& getChristoffel() const {
if (not christoffel) {
calcChristoffel();
}
return *christoffel;
}
private:
GridSpacing spacing;
InverseGridSpacing inverse_spacing;
ContravariantMetric contra_metric;
CovariantMetric co_metric;
/// Christoffel symbol of the second kind (connection coefficients)
///
/// mutable so that we can calculate it on demand in getter,
/// unique_ptr so we can tell if its value is up-to-date.
///
/// Other options instead of unique_ptr:
/// - std::optional -- needs C++17
/// - just use value and check `G11.isAllocated()` -- but no way to deallocate?
mutable std::unique_ptr<Christoffel> christoffel{nullptr};
/// Calculate the Christoffel symbols from the metric tensor
///
/// Note: marked const so that it can be called from
/// `getChristoffel`, but this does modify `christoffel`
void calcChristoffel() const;
};
Upgrading code to use this isn't too hard, "just" stick in the appropriate getter, e.g.:
// If global flag all_terms are set (true by default)
if (all_terms) {
- coef4 = coords->G1(x,y); // X 1st derivative
- coef5 = coords->G3(x,y); // Z 1st derivative
+ coef4 = coords->getChristoffel().G1(x,y); // X 1st derivative
+ coef5 = coords->getChristoffel().G3(x,y); // Z 1st derivative
ASSERT3(finite(coef4));
and similarly for all the getters. That's basically s/->G\(\d\)/getChristoffel().G\1/
which shouldn't be too hard to automate.
This also allows us to drop the DDX/Y/Z
reimplementations/shims in Coordinates
.
User code that sets grid spacing and metric tensors would probably look like:
+ Coordinates::ContravariantMetric contra_metric;
- coords->g11 = ...
+ contra_metric->g11 = ...
+ coords->setContraMetric(contra_metric);
Less easy to automate, but probably still possible?
It looks like, of the examples that set the metric, most use tokamak coordinates, and we could provide a helper function:
setTokamakCoordinates(coords, Rxy, Bxy, Bpxy, Btxy, hthe, I);
At the moment the
Coordinates::Coordinates()
constructor (plus some utility methods/functions) does a lot of interpolation, extrapolation and checking. This was useful to support new features, but has become more complicated and hard to maintain. It is also difficult for users to find and fix problems with grids if they arise. Extrapolation (especially) is still useful, because geqdsk files may not define the poloidal flux function far enough past the walls for a grid generator to create a grid with boundary points, or the grid may run into coils/singularities, etc.A proposal for discussion:
Coordinates::Coordinates()
so that it does no extrapolation or interpolation. Grid files are required to include boundary points, and if staggeredCoordinates
are created, the grid file must include metrics, etc. at the staggered grid locations. It should check for positivity, etc. as appropriate, and also require sane values in corner guard/boundary points.Desired features of the conversion tool:
hypnotoad-gui
would be ideal, that likehypnotoad
can also create an input file for reproducible command-line conversions.coordinates.cxx
, but would be easier to add new ones (e.g. to ensure positivity in problematic cases).Coordinates::Coordinates()
, e.g. negativeB
, etc. Ideally provide some options to fix the problems.dx
,hthe
/hy
).hypnotoad
? The stuff above would certainly always be useful for validation ofhypnotoad
outputs.See also #1472 - I think addressing that would be much nicer in a Python conversion tool than as more complication in
coordinates.cxx
.