softwareunderground / subsurface

Core data exchange library for subsurface science and engineering
Apache License 2.0

Proposed additional fundamental data type: Composite of elements? #28

Open fwkoch opened 3 years ago

fwkoch commented 3 years ago

Context

Within the current proposed data levels, there are:

This distinction is great, since it means if one software exports a fault, another software that knows nothing about "faults" can still import and use it as a surface element.

Problem

What about a geological_object that requires multiple elements to represent, say, a fault zone with multiple faults or a geological model with a surface for each formation? In these cases, each component of the object is still an element, but really these elements must be taken together to have meaning.

Currently, when importing a geological model into software that knows nothing about "geological models", the best we can do is import it as a bunch of separate, unrelated surface elements. And even that is dicey, because there is no standard way to locate the elements within the unknown geological_object.

Solution 1: Be strict!

One option is that we just require geological_objects to be elements; faults or geological_formations are ok, but fault_zones or geological_models are not. Then we leave it up to the client to sort and associate these elements with whatever arbitrary metadata makes sense on a case-by-case basis.
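A rough sketch of what the strict option could look like on the client side. The dict layout and the `group` metadata key are assumptions made up for illustration; nothing here is part of subsurface or any agreed spec.

```python
from collections import defaultdict

# Hypothetical flat export: every geological_object is exactly one element.
# The "group" key is an ad hoc client convention, not a standard field.
elements = [
    {"name": "fault_a", "type": "surface", "metadata": {"group": "fault_zone_1"}},
    {"name": "fault_b", "type": "surface", "metadata": {"group": "fault_zone_1"}},
    {"name": "formation_top", "type": "surface", "metadata": {}},
]


def group_elements(elements, key="group"):
    """Bucket element names by an arbitrary metadata key chosen by the client."""
    grouped = defaultdict(list)
    for element in elements:
        # Elements without the key end up under None, i.e. ungrouped.
        grouped[element["metadata"].get(key)].append(element["name"])
    return dict(grouped)


groups = group_elements(elements)
```

The grouping only works if exporter and importer happen to agree on the metadata key, which is exactly the case-by-case burden this option puts on the client.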

Solution 2: Composite data type

Another option is to introduce an additional data type that is a composite of elements. The composite would have no geological meaning; it is just a way to bucket multiple elements together. Just like elements, though, composites are fundamental and must be supported on software import.

Looking back at the geological model problem - the software that doesn't know about "geological models" now is able to interpret it as a composite. It knows where to find the child elements (say, under an elements key...?), and after import the different surface elements are still grouped together.
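As a minimal sketch of the composite idea, assuming the children live under an `elements` attribute as floated above. The `Element` and `Composite` classes, their fields, and the `import_generic` helper are hypothetical; they are not the subsurface or OMF API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical fundamental element, e.g. a surface."""
    name: str
    kind: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Composite:
    """Hypothetical bucket of elements with no geological meaning of its own."""
    name: str
    elements: List[Element] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def import_generic(obj):
    """Importer that knows nothing about geological models.

    It only relies on the fundamental types: a composite is unpacked via its
    `elements` attribute, and a bare element is wrapped in a one-item list.
    """
    if isinstance(obj, Composite):
        return obj.name, list(obj.elements)
    return obj.name, [obj]


model = Composite(
    name="my_model",
    elements=[
        Element("formation_1_top", "surface"),
        Element("formation_2_top", "surface"),
    ],
    # The geological meaning is carried as metadata the importer may ignore.
    metadata={"geological_object": "geological_model"},
)

group_name, surfaces = import_generic(model)
```

The importer never inspects the `geological_object` metadata, yet the two surfaces stay grouped under one name after import.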

(This idea comes from OMF - it made the format much more flexible to support complex data types without introducing additional primitives: https://github.com/gmggroup/omf/blob/dev/omf/composite.py )

rowanc1 commented 3 years ago

A tiny bit more discussion here: https://github.com/gmggroup/omf/pull/88

In the OMF case, we were also attaching data at the composite level - which a few folks in that community got excited about. :)

Leguark commented 3 years ago

Thanks for opening this discussion! For solution 2, my question is whether we need to create a composite level between elements and geological_objects, or whether geological_objects can simply contain either a single element or a set of them. I think this depends on what extra information is stored in a geological_object compared with the element object. In other words, what is the difference between a fault and a geological_formation: only the name and metadata?

My gut feeling - and following @rowanc1's advice - would be to define a geological_object as anything composed of at least one element plus anything else, i.e. many elements or any combination of them, metadata, extra arrays, etc. The downside is that we could eventually end up with many different geological_objects defining, for example, faults in just a slightly different way. I am in favor of solving that problem when it appears: if in one year there are 10 definitions of "faults", we can just look for the commonalities and create a fault_composite or something like that.

In any case, no strong opinion on this one, let's see what others think :)