Closed gdmcbain closed 3 years ago
I think this shouldn't be a huge task to implement. I'll keep it in mind for future coding sessions.
Not a huge task to implement, no, but maybe there is a preliminary question to think about: ‘tagged subsets of elements’ nschloe/meshio#290.
The current `skfem.Mesh.load` looks for a key `'gmsh:physical'` in the `meshio_type` and `bnd_type` entries of the `.cell_data` attribute of the `meshio.Mesh`. These are interpreted as one-dimensional integer arrays of length equal to the number of `meshio_type` and `bnd_type` cells specified in `meshio.Mesh.cells`.
However, the `.subdomains` and `.boundaries` attributes of a `skfem.Mesh` are different; they're `Dict[str, ndarray]`, with the terms of the `ndarray` values being indices of the `meshio_type` or `bnd_type` cells belonging to the subdomain or boundary.
These two kinds of representation are discussed in nschloe/meshio#175, nschloe/meshio#290, nschloe/meshio#347, …
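For concreteness, here is a minimal numpy sketch of the two representations and the conversion between them (the tag values and names are invented for the example; MSH 2.2 itself stores only the integers):

```python
import numpy as np

# Hypothetical data: one integer tag per cell, as read from something
# like cell_data["gmsh:physical"].
tags = np.array([10, 10, 20, 20, 20, 10])
names = {10: "fluid", 20: "solid"}

# {cell} -> {tag} converted to the {name} -> {subset of cells}
# representation used by skfem.Mesh.subdomains:
subdomains = {name: np.nonzero(tags == tag)[0] for tag, name in names.items()}

# The reverse direction is lossy as soon as subsets overlap: a single
# integer per cell cannot record membership in two sets at once.
def sets_to_tags(subdomains, n_cells, sentinel=-1):
    out = np.full(n_cells, sentinel)
    for tag, cells in enumerate(subdomains.values()):
        out[cells] = tag  # later sets silently overwrite earlier ones
    return out
```

The overwriting in the last loop is exactly the loss discussed below for overlapping subdomains and boundaries.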
The current behaviour of `meshio.read` is mostly the way it is because, at the time it was developed, the only format in which meshes were generated with subdomains and boundaries was Gmsh's MSH 2, which used the {cell} → {subdomain} representation. Gmsh has since released a new format, MSH 4.1 #41 #129, which I think is actually supposed to work the other way, {subdomain} → {subset of cells}, although I haven't fully understood its concept of 'entity'. When Gmsh suddenly started generating MSH 4.1 by default, which nothing but itself could read, I hastily added a facility to meshio to read it, but more or less coerced it into the same data structure as previously used for MSH 2. This involves converting from the second representation to the first, lossily, which leads to problems like nschloe/meshio#524 when subdomains or boundaries aren't mutually exclusive (something which makes sense in 'multiphysics' problems).
So I'm wondering whether it might be better to avoid that conversion and see whether (already or in the future) meshio can hold and write something closer to `.subdomains` and `.boundaries`. But then the problem might be that meshio either won't reread those files as having subdomains or boundaries, or won't put them into the (unfortunately named) `'gmsh:physical'` entries, which will mean that they aren't recognized by `skfem.importers.meshio.from_meshio`.
That `Mesh.external` isn't populated on loading #268 means that it can't be used to reread the subdomains and boundaries (assuming they're not unravelled back into the `'gmsh:physical'` representation).
In recent work reading and writing XDMF nschloe/meshio#599 nschloe/meshio#600, I noticed a couple of interesting features of the format that might be useful here:
- `<Grid GridType="SubSet">`: These might be able to store `skfem.Mesh.subdomains`.
- `<Set>`:
  - `Type="Cell"`: Another way to store subdomains?
  - `Type="Face"`: For `skfem.Mesh.boundaries`?

I don't think either is supported in meshio yet, but that can be fixed.
Other advanced mesh formats (MED, Exodus II, Silo, …?) might have something similar but I'm less familiar with them.
I think the XDMF Face corresponds to a scikit-fem facet and is indexed as the facet of a cell rather than in a global array of facets, but that can be worked with via the `.f2t` and `.t2f` attributes of a `skfem.Mesh`. This might mean (except perhaps on initial import from Gmsh MSH 2) dispensing with separate lower-dimensional elements for boundaries.
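A rough illustration of working between the two indexings via a `t2f`-style table; the array layout here is an assumption for the sketch, not skfem's exact convention:

```python
import numpy as np

# Toy element-to-facet connectivity in the spirit of skfem.Mesh.t2f:
# t2f[i, e] is the global index of local facet i of element e.
t2f = np.array([[0, 1],
                [1, 2],
                [3, 4]])      # 3 local facets per element, 2 elements

boundary = np.array([0, 3])   # global facet indices of a named boundary

# Recover (element, local facet) pairs for the boundary facets, which is
# the kind of per-cell facet indexing an XDMF Face set would need:
local, elem = np.nonzero(np.isin(t2f, boundary))
pairs = list(zip(elem.tolist(), local.tolist()))
```

Going the other way (global facet index from a (cell, local facet) pair) is just a lookup in the same table.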
(I've been pondering a comment in nschloe/meshio#571: ‘The devil invented mixed topologies.’)
Currently there is the possibility to use `skfem.io.json`, which saves boundaries and named subdomains. Does it cater to your needs, or do you consider it necessary to save these also to external formats?
I was thinking of an external format, yes:
I think this is a desirable long-term goal.
I made efforts on the subdomains part in XDMF but met intransigence in nschloe/meshio#622. I haven't completely abandoned that proposal, but, wearying of the repeated rejection, I did experiment with pyexodus; that was just preliminary but did seem promising.
I did have a different idea for this a while ago, if existing external formats proved too difficult. Leaning on another part of the Python standard library than JSON, I was looking at xdrlib, the idea being something as isomorphic as possible to `skfem.io.json` but faster and with smaller files.
I got this idea from (libMesh/External packages/Utilities).
The XDR: External Data Representation Standard is a standard for the description and encoding of data. It is useful for transferring data between different computer architectures, and as such provides a very simple, portable approach for writing platform independent binary files. libMesh uses the Xdr class to provide a uniform interface to input/output operations for files in either XDR binary or ASCII text formats.
A minor hindrance with XDR and JSON is that they don't support `numpy.ndarray`; it's not part of the standard library.
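Since `xdrlib` has been deprecated (and removed in Python 3.13), here is a sketch of the same idea using `struct` with XDR's big-endian convention; the on-disk layout is invented for illustration:

```python
import struct

import numpy as np

def pack_ndarray(a):
    """Pack a 2-D float64 array as XDR-style big-endian bytes.

    Layout (invented for this sketch): two unsigned 32-bit ints for the
    shape, then the entries as IEEE 754 doubles, all big-endian as in
    the XDR standard.
    """
    m, n = a.shape
    return struct.pack(">II", m, n) + a.astype(">f8").tobytes()

def unpack_ndarray(buf):
    m, n = struct.unpack_from(">II", buf)
    data = np.frombuffer(buf, dtype=">f8", offset=8, count=m * n)
    return data.reshape(m, n).astype(np.float64)

p = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
assert (unpack_ndarray(pack_ndarray(p)) == p).all()
```

Being binary and fixed-width, this already beats JSON on file size and parse time, at the cost of readability.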
The thinking with pyexodus was probably not to wrap it for scikit-fem but to use it to learn how to save subdomains and boundaries (if indeed the Exodus format permits that) and then use the resulting Exodus files as examples for meshio, to see how much its Exodus writer and reader need extending.
The comment in #556 yesterday was:
So as well as meshio, does that imply an existing file format? meshio has no native serialization format. It does have sort of an internal data structure, but it's evolving and not completely uniform with respect to the supported input and output formats.
The `.subdomains` attribute isn't too bad as it can be encoded as a dict of names with values either (i) bool-valued cell data or (ii) a list of the cells belonging. (Or, more rudimentarily, as int-valued cell data as in MEDIT, Gmsh's MSH 2.2, &c.)
The `.boundaries` attribute is trickier. Gmsh stores boundaries as separate lower-dimensional cells, and FEniCS actually wants two input mesh files (one for the domain cells, one for the boundary cells). `skfem.Mesh` records `.boundaries` as lists of facets of domain cells; this is more natural (in my opinion) but not all that well supported by external mesh file formats. Not MEDIT or Gmsh's MSH; maybe Exodus, MED, XDMF, ...? My impressions so far were that XDMF doesn't look all that well maintained (an incomplete, inconsistent specification; the main reference implementation is ParaView, which is a viewer and so has different priorities to a finite element library) and MED overly bureaucratic (despite comprehensive documentation and maintenance), leaving perhaps Exodus as the most promising.
The main hurdle with Exodus is that I don't think meshio currently supports the features of it that are needed here (see, e.g., nschloe/meshio#1031). I think it might be possible to get the required features into meshio, despite that having failed for XDMF; the Exodus specification is less ambiguous, and ambiguity is what went wrong in nschloe/meshio#622. There is also pyexodus, and the possibility of reading Exodus directly as HDF.
Of course, whether meshio is used or not, relying on Exodus would introduce a dependency on `h5py` or similar.
Are there other options that I've overlooked?
Oh yes, I had forgotten this discrepancy.
skfem.Mesh records .boundaries as lists of facets of domain cells; this is more natural (in my opinion) but not all that well supported by external mesh file formats.
The missing support is probably because there is not a unique way of mapping internal elements to facets. We have one implementation in scikit-fem but other codes might do it differently and some additional details must be fixed to have a well defined format.
So again it comes back to inventing our own format.
Now that the new BaseMesh class in #556 will be "immutable", there should be no issue in simply dumping the contents into HDF5 directly using `h5py` (thus inventing our own format). That doesn't help with interoperability but should solve the issue of the large size of JSON files.
The missing support is probably because there is not a unique way of mapping internal elements to facets.
True, but also true of the ordering of nodes in some elements, which is handled as in
The XDMF spec says:
Edge and Face values are defined for each Cell in turn, using VTK ordering.
That's for an `Attribute` with `Center="Face"`, but I would assume it also applies to a `Set` of `SetType="Face"`.
(I didn't know VTK had such an ordering. That might be worth looking into too.)
(I didn't know VTK had such an ordering. That might be worth looking into too.)
I had a bit of a look for whether VTK had a way of serializing data on facets but didn't find anything in:
Some thoughts provoked by the difficulties encountered in using meshio in #680:
I wonder whether we might do better by avoiding meshio's `cell_sets` entirely and encoding our subsets, `.boundaries` and `.subdomains`, with indicator functions, i.e. a `bool` or 0/1 for inclusion, passed as `cell_data` and named like `"boundary:left"`. This isn't optimally efficient in terms of storage but might be manageable. (Do we need to prefix this further, like `"skfem:boundary:left"`?)
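A sketch of this indicator-function encoding; the `"boundary:"` prefix is the naming convention proposed above, not an existing meshio convention:

```python
import numpy as np

# Named subsets as 0/1 cell data, one field per set.
n_cells = 5
boundaries = {"left": np.array([0, 4]), "vertical": np.array([0, 2, 4])}

cell_data = {}
for name, cells in boundaries.items():
    ind = np.zeros(n_cells, dtype=np.uint8)
    ind[cells] = 1
    cell_data["boundary:" + name] = ind

# Decoding is a np.nonzero away, and overlapping sets survive the
# round trip, unlike with a single integer tag per cell:
decoded = {k.split(":", 1)[1]: np.nonzero(v)[0] for k, v in cell_data.items()}
```

Storage is one byte (or bit) per cell per set, so it scales with the number of sets, but every set is independent and overlap is free.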
Extension of that idea: some mesh formats (notably XDMF; see xdmf#15, nschloe/meshio#568, nschloe/meshio#622) don't like storing more than one kind of element. Could we get away from Gmsh's idea of representing boundary facets as lower-order elements and instead, in the `cell_data` for each boundary, store an `int` array with an `int` for each domain cell, being the index of the facet that is on the boundary if there is one and otherwise a sentinel value (e.g. the number of facets, since that is not a valid index into the element–facet connectivity matrix, `.t2f`)?
Would such a scheme be readable back into `skfem.Mesh`, using `np.nonzero`, &c.? How does this compare with #680 and its objectives? I'm not exactly sure what those are. Shall I develop this idea?
Actually I'm not even sure that it needs support from the skfem library; I think it can be done at the user level (although that's true of much functionality in scikit-fem, which is one of its great advantages).
Could we get away from Gmsh's idea of representing boundary facets as lower-order elements and instead in the cell_data for each boundary store an int array with an int for each domain cell, being the index of the facet that is on the boundary if there is one and otherwise a sentinel value (e.g. the number of facets, since that is not a valid index into the element–facet connectivity matrix, .t2f)?
No good: a cell can have more than one facet on a boundary.
Let me add a remark that in #680 the call

```python
m = MeshTet().with_boundaries({'left': lambda x: x[0] == 0.0})
m.save('test.msh', write_boundaries=True, sets_to_int_data=True, binary=False, file_format='gmsh22')
```

seems to produce completely reasonable results; it's just that reading the resulting test.msh back causes the loss of the tags and adds spurious entries.
Here is the resulting .msh file:

```
$MeshFormat
2.2 0 8
$EndMeshFormat
$Nodes
8
1 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00
2 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00
3 0.0000000000000000e+00 1.0000000000000000e+00 0.0000000000000000e+00
4 1.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00
5 0.0000000000000000e+00 1.0000000000000000e+00 1.0000000000000000e+00
6 1.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00
7 1.0000000000000000e+00 1.0000000000000000e+00 0.0000000000000000e+00
8 1.0000000000000000e+00 1.0000000000000000e+00 1.0000000000000000e+00
$EndNodes
$Elements
17
1 4 2 0 0 1 2 3 4
2 4 2 0 0 4 6 2 8
3 4 2 0 0 3 4 7 8
4 4 2 0 0 3 4 2 8
5 4 2 0 0 2 3 5 8
6 2 2 0 0 1 2 3
7 2 2 0 0 1 2 4
8 2 2 0 0 1 3 4
9 2 2 0 0 2 3 5
10 2 2 0 0 2 4 6
11 2 2 0 0 2 5 8
12 2 2 0 0 2 6 8
13 2 2 0 0 3 4 7
14 2 2 0 0 3 5 8
15 2 2 0 0 3 7 8
16 2 2 0 0 4 6 8
17 2 2 0 0 4 7 8
$EndElements
$ElementData
1
"left"
1
0.0
3
0
1
17
1 -1
2 -1
3 -1
4 -1
5 -1
6 0
7 -1
8 -1
9 0
10 -1
11 -1
12 -1
13 -1
14 -1
15 -1
16 -1
17 -1
$EndElementData
```
MSH 4.1 is really confusing because the nodes must be assigned to geometric entities with some dimension (`gmsh:dim_tags`). I suppose the idea would be to assign them to edges, points, and such parts of the original geometry. Obviously the original geometry is missing when you are dealing with meshes only.
Yes... One plan that I did have, but never tried, was to take a simple MSH 2.2 like that and ask Gmsh to update it to MSH 4.1. I hoped that would shed light on what on Earth one is supposed to posit for entities.
Another shortcoming with MSH 2.2 (and the simple integer tagging of cells in general, e.g. also in MEDIT) is the inability to handle overlapping subdomains or boundaries; e.g. consider a slight extension of your example:
```python
m = MeshTet().with_boundaries(
    {
        "left": lambda x: x[0] == 0.0,
        "vertical": lambda x: np.logical_or(x[0] == 0.0, x[0] == 1.0),
    }
)
```
(e.g. for a free convection problem, this might be a plane vertical channel with zero-velocity on both walls and a Dirichlet temperature or applied heat flux on the left).
When I look at `m.boundaries`, it contains (correctly)

```python
{'left': array([0, 4]), 'vertical': array([ 0,  4, 12, 13])}
```

but in the resulting MSH 2.2, the two subsets are combined:
```
$ElementData
1
"left-vertical"
1
0.0
3
0
1
17
1 -1
2 -1
3 -1
4 -1
5 -1
6 1
7 -1
8 -1
9 1
10 -1
11 -1
12 -1
13 -1
14 -1
15 -1
16 1
17 1
$EndElementData
```
This can be avoided by the user never specifying overlapping subdomains or boundaries, e.g. here splitting `"right"` from `"vertical"` and applying the no-slip condition to both `"left"` and `"right"`, but ideally the user wouldn't have to do that; it is a shortcoming of MSH 2.2, recognized by the Gmsh developers. If overlapping physical entities are specified in Gmsh (which is its native way of handling subdomains and boundaries, rather than this `ElementData` trick), it does something really unexpected when writing MSH 2.2: it duplicates the elements, giving each coincident element one of the physical tags!
Extending the example in nschloe/meshio#1165 with a third element of `cell_sets`, a subset of the second,

```python
cell_sets={
    'test': [array([], dtype=int64), array([1])],
    'sets': [array([0, 1]), array([0, 2, 3])],
    'subs': [array([1]), array([2, 3])],
}
```

leaves only the set-difference in `'sets'` after the round-trip:

```python
'sets': [array([0]), array([0])],
```

while extending that third subset beyond the second,

```python
'subs': [array([1]), array([1, 2, 3])],
```

yields something entirely different:

```python
{'set1': [array([0]), array([0])], 'set2': [array([1]), array([1, 2, 3])]}
```
This is expected I think. That's how you'd also specify overlapping boundary conditions in ElmerFEM. Boundary is split into three: 1 for BC1, 2 for BC2 and 3 for BC1&BC2. Then in the simulation spec you set BC1 to tags 1&3 and BC2 to tags 2&3.
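The ElmerFEM-style splitting described above can be sketched as computing, for each cell, the set of names it belongs to, and grouping cells by that combination (cell indices and names taken from the earlier example in the thread):

```python
# Each distinct combination of memberships becomes its own disjoint
# piece; each original set is then the union of the pieces whose key
# mentions its name.
boundaries = {"left": {0, 4}, "vertical": {0, 4, 12, 13}}

membership = {}
for name, cells in boundaries.items():
    for c in cells:
        membership.setdefault(c, set()).add(name)

pieces = {}
for c, names in membership.items():
    pieces.setdefault(frozenset(names), set()).add(c)
```

Here "left" ends up entirely inside the overlap, so only two pieces result: {left, vertical} and {vertical}.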
However, in this PR I plan to add yet another approach of using multiple cell_data keys. I don't know if MSH 2.1 supports that but it's worth a try.
Another thing I want to try is writing this `gmsh:dim_tags` so that MSH 4.1 export doesn't crash.
Maybe an improvement to `int_data_to_sets` / `sets_to_int_data` would be to detect or deal properly with the overlap. Right now the user must be aware that there cannot be such overlapping `cell_sets`. But I won't push that because it's not needed for what I have in mind.
There is also the question of where these overlapping `cell_sets` originate. In particular, if there are no importers in meshio that write overlapping `cell_sets`, then that's probably not very high priority. If you can construct, e.g., an inp-file which leads to overlapping `cell_sets`, then it should be fixed.
I now see that, e.g., the MEDIT format is similar to the Elmer format because it carries no tag names. It is possible to tag each edge with a number, but there is no name associated with the number, so `int_data_to_sets` must invent some names.
Now I'm looking into creating our own variant of `sets_to_int_data` which will support overlapping domains. Everything is encoded into a single `cell_data` entry for maximum support across different formats.
There is also the question of where these overlapping `cell_sets` originate. In particular, if there are no importers in meshio that write overlapping `cell_sets`, then that's probably not very high priority.
No, meshio reads MSH 4.1 fine, and MSH 4.1 does describe subdomains and boundaries as subsets of cells. The meshio reader does populate `.cell_sets` directly in this case.
I use overlapping subdomains and boundaries a lot. I usually generate the mesh in Gmsh, export as MSH 4.1, read with meshio, take it into skfem, and export the results as XDMF. That's quite a good one-way pipeline, but it fails if I have to go backwards: the XDMF has forgotten the subdomains and boundaries.
However, in this PR I plan to add yet another approach of using multiple cell_data keys. I don't know if MSH 2.1 supports that but it's worth a try.
2.2 or 4.1? But anyway, yes: both support an arbitrary number of `ElementData` fields, corresponding to `cell_data` in meshio.
Everything is encoded into a single cell_data entry for maximum support over different formats.
Intrigued!
So that was kind of achieved in #681 with the binary-encoding, but I didn't really think it through or test it very thoroughly. Intrigued to see what you've come up with. The state of mesh file formats is really disappointing!
I'm planning to use products of prime numbers to distinguish the overlaps.
Ha. I did think of that. I have done it before. Is it better than the binary? I thought the binary was more compact.
Oh, well I was using it for the multiple facets of a cell on a boundary rather than a cell belonging to multiple subdomains, but yes.
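For reference, a sketch of the prime-product encoding under discussion (and why the bitmask is more compact); all names and values here are invented, not what #680 actually implements:

```python
import numpy as np

# Each set is assigned a prime, each cell stores the product of the
# primes of the sets containing it, and membership is recovered by
# divisibility.
primes = {"left": 2, "vertical": 3}
n_cells = 6
boundaries = {"left": [0, 4], "vertical": [0, 4, 5]}

code = np.ones(n_cells, dtype=np.int64)
for name, cells in boundaries.items():
    code[cells] *= primes[name]

decoded = {name: np.nonzero(code % p == 0)[0] for name, p in primes.items()}

# As noted above, a bitmask (set i -> bit i) is more compact: the
# product of the first k primes grows much faster than 2**k.
```

Both encodings fit in a single integer-valued `cell_data` entry, which is the point: almost every format can carry that.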
Now it's kind of working locally, and it seems I can encode subdomains/boundaries to almost any format; even VTK is working, even though ParaView is not visualizing the boundary mesh! How crazy is that?
I'll clean it up a bit and push to #680
it seems I can encode subdomains/boundaries to almost any format, even VTK is working
Yes this is a really neat idea. I suppose it's only possible because of meshio, approximating as it does a neutral format.
Being able to visualise subdomains and boundaries is a bonus. I wonder whether we can get that to work. Is it not working because of that new cellblock list for cells in meshio 4? Does that force each cell-data field to be defined on both domain and boundary cells?
I think right now one should be able to open the `.msh` file in Gmsh to visualize it. I need to figure out how to do it in ParaView, maybe by writing `point_data`?
Yes, that should work.
In #681, which used only domain cells, no boundary cells, the boundaries were sort of depicted by colouring those cells having a facet on the boundary. If now in #680 you have both boundary cells and domain cells, do you have to provide cell data for the domain cells even for boundaries? At the moment is that just getting zeros or similar? If so, and you have to put something there anyway, could you put something like what went into the domain-cell data for boundaries in #681? That worked particularly well in three dimensions because only the boundary facets are visible anyway.
Domain cells are used to encode `Mesh.subdomains` and boundary cells are used to encode `Mesh.boundaries`.
Of course, but isn't it a quirk of meshio 4 that all cell-data fields have to be defined over all cells? So that even if you only want to encode a boundary over the boundary cells, you also have to provide values for that field over the domain cells too? (And vice versa for subdomains.)
Yeah. Would you want to write `cell_data` for domain cells only if `Mesh.subdomains` is empty?
Or maybe you are suggesting that we write multiple `cell_data` entries, one for each `subdomains` and each `boundaries` entry?
Is this just for visualizing the boundaries? I mean, using `point_data` should do the trick, right? So that we don't mix it up with tagging when using formats that support only one `cell_data` entry (like Elmer, MEDIT, AVS, ...)?
Right, yes, I'm probably still thinking too much in terms of #681, sorry. I'll leave you to it and check back on Monday morning. It sounds like you're very close to fixing this, which is great.
Should `Mesh.save` save the `.subdomains` and `.boundaries` attributes? In a way that they might be reread by `Mesh.load`.

If I/O is to continue to rely on `meshio.write` and `meshio.read` (as it should), I think that saving boundaries requires saving cells of type `bnd_type` (e.g. `'line'` for `MeshQuad`); currently `skfem.Mesh.save` only saves cells of type `Mesh.meshio_type`.

(I'm currently parsing a `polyMesh` from OpenFOAM and constructing an `skfem.Mesh`. My parser is very slow so I want to be able to save the mesh while experimenting with the solver.)

Perhaps instead for the moment I should just construct a `meshio.Mesh` and `meshio.write` that, leaving scikit-fem out of it, but this might be something to think about for other cases, e.g. when the mesh has been refined or adapted inside scikit-fem.