equinor / fmu-dataio

FMU data standard and data export with rich metadata in the FMU context
https://fmu-dataio.readthedocs.io/en/latest/
Apache License 2.0

Support for faultdata on JSON format, related to DynaGeo #362

Closed jcrivenaes closed 7 months ago

jcrivenaes commented 1 year ago

This issue details, discusses, and matures the implementation of fault data initially made by DynaGeo (internal link: https://dynageo.app.radix.equinor.com/). However, the fault format itself is not strictly tied to DynaGeo. As this is a JSON (~dictionary) format, the issue is related to #361.

Fault format example

{
  "type": "FeatureCollection",
  "crs": {
    "type": "name",
    "properties": {
      "name": "EPSG::23032"
    }
  },
  "features": [
    {
      "type": "Feature",
      "properties": {
        "FaultID": "CF_C2",
        "HorizonID": "TopTilje322",
        "SegmentID": 0,
        "TriangleID": 0,
        "Juxtaposition": 0.0,
        "permeability_max": 0.54999997317791,
        .....
        "shalegougeratio_avg": 0.20000000298023224
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              422238.404,
              7247491.896,
              2277.951
            ],
            [
              422241.016,
              7247492.562,
              2275.44
            ],
            [
              422231.777,
              7247516.006,
              2275.822
            ],
            [
              422238.404,
              7247491.896,
              2277.951
            ]
          ].
...

Each triangle (note that the first and last coordinates are identical) has several properties. The files can become quite big.
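As a rough illustration of the structure described above, the following Python sketch loads a one-triangle extract of the example and checks that each polygon ring is a closed triangle (four coordinates, first equal to last). The function name `is_closed_triangle` and the trimmed `SAMPLE` string are assumptions made for this illustration; they are not part of fmu-dataio or DynaGeo.

```python
import json

# A trimmed, single-triangle extract of the example above (properties shortened).
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"FaultID": "CF_C2", "HorizonID": "TopTilje322", "TriangleID": 0},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [422238.404, 7247491.896, 2277.951],
          [422241.016, 7247492.562, 2275.44],
          [422231.777, 7247516.006, 2275.822],
          [422238.404, 7247491.896, 2277.951]
        ]]
      }
    }
  ]
}
"""


def is_closed_triangle(feature: dict) -> bool:
    """Return True if the feature is a Polygon whose outer ring is a closed triangle."""
    geometry = feature["geometry"]
    if geometry["type"] != "Polygon":
        return False
    ring = geometry["coordinates"][0]  # outer ring of the polygon
    # Three corners plus a repeated first corner that closes the ring.
    return len(ring) == 4 and ring[0] == ring[-1]


collection = json.loads(SAMPLE)
triangles = [f for f in collection["features"] if is_closed_triangle(f)]
print(len(triangles), sorted(triangles[0]["properties"]))
```

For large files, reading everything with `json.loads` may be memory-hungry; a streaming parser could be considered, but that is outside this sketch.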

jcrivenaes commented 1 year ago

The metadata shall contain at least the following, related to the data object:

Juxtaposition:
FW_Tilje321_Tilje322_Tilje323_HW_Tilje321_Tilje322_Tilje323

Parameters exported:
Juxtaposition
displacement_min
multiplier_avg
multiplier_min
permeability_avg
permeability_max
shalegougeratio_avg
shalegougeratio_min
shalesmearfactor_avg
shalesmearfactor_max
thickness_min
transmissibility_avg
transmissibility_min

Horizons:
TopTilje322

jcrivenaes commented 1 year ago

Proposal for metadata (draft!!!):

# Example metadata for fault data (here in JSON format)

$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: "0.8.0"
source: fmu

tracklog:
  - datetime: 2020-10-28T14:28:02
    user:
      id: peesv
    event: created
  - datetime: 2020-10-28T14:46:14
    user: 
      id: peesv
    event: updated

class: dictionary  # class is the main identifier of the data type.

# fmu:
# The fmu block in data objects has more sub-elements than in ensemble objects.

fmu: # the fmu-block contains information directly related to the FMU context
  model:
    name: ff
    revision: 21.0.0.dev
    description:
      - detailed description
      - optional

  workflow:
    reference: rms/structural_model

  case:
    name: MyCaseName
    uuid: 8bb56d60-8758-481a-89a4-6bac8561d38e
    user:
      id: jriv # $USER from ERT
    description:
      - yet other detailed description
      - optional

  iteration:
    id: 0 # always an int, will be 0 for e.g. "pred"
    uuid: 4b939310-34b1-4179-802c-49460bc0f799
    name: "iter-0" # /"pred"
    restart_from: 15ce3b84-766f-4c93-9050-b154861f9100   # fmu.iteration.uuid for another iteration

  realization:
    id: 33
    uuid: 29a15b21-ce13-471b-9a4a-0f791552aa51 # hash of case.uuid + iteration.uuid + realization.id
    name: "realization-33"
    parameters: # directly pass parameters.txt. This is potentially a lot of content, only a stub is included here.
      SENSNAME: faultseal
      SENSCASE: low
      RMS_SEED: 1006
      INIT_FILES:
        PERM_FLUVCHAN_E1_NORM: 0.748433
        PERM_FLUVCHAN_E21_NORM: 0.782068
      KVKH_CHANNEL: 0.6
      KVKH_US: 0.6
      FAULT_SEAL_SCALING: 0.1
      FWL_CENTRAL: 1677

  context:
    stage: realization

file:
  relative_path: realization-33/iter-0/share/results/faults/faultx.json # case-relative
  absolute_path: /some/absolute/path/realization-33/iter-0/share/results/faults/faultx.json
  checksum_md5: kjhsdfvsdlfk23knerknvk23  # checksum of the file, not the data.
  size_bytes: 132321

data: # The data block describes the actual data (e.g. surface). Only present in data objects

  content: faultplane  # white-listed and standardized

  # if stratigraphic, name must match the strat column. This is the official name of this surface.
  name: faultx
  stratigraphic: false  # if true, this is a stratigraphic surface found in the strat column

  # content-specific tag: When content == "field_outline", expect the 'field_outline' tag
  faultdata:
    name: FaultXX

  properties: # what the values actually show.
    - name: PropertyName
      attribute: goc
      is_discrete: false # to be used for discrete values in surfaces.
      calculation: null # max/min/rms/var/maxpos/sum/etc

  format: json
  unit: m
  vertical_domain: depth # / time / null
  depth_reference: msl # / seabed / etc # mandatory when vertical_domain is depth?
  spec:
    ntriangles: 9396
  bbox:
    xmin: 456012.5003497944
    xmax: 467540.52762886323
    ymin: 5926499.999511719
    ymax: 5939492.128326312
    zmin: 1244.039
    zmax: 2302.683
  is_prediction: true # A mechanism for separating pure QC output from actual predictions
  is_observation: false # Used for 4D data currently but also valid for other data?
  description:
    - Field outline calculated as intersection between top reservoir and the GOC
    - Made in a FMU workflow

display:
  name: FaultData
  subtitle: FaultX
  line:
    show: true
    color: red
    style: solid
  fill:
    show: false

access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal

masterdata:
  smda:
    country:
      - identifier: Norway
        uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
      - short_identifier: DROGON
        uuid: 00000000-0000-0000-0000-000000000000 # mock uuid for Drogon
    field:
      - identifier: DROGON
        uuid: 00000000-0000-0000-0000-000000000000 # mock uuid for Drogon
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: 00000000-0000-0000-0000-000000000000 # mock uuid for Drogon
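The `data.spec.ntriangles` and `data.bbox` entries in the draft above could plausibly be derived from the file itself. The sketch below shows one way to do that in Python; the function name `spec_and_bbox` is hypothetical, and the second triangle in the sample is made up for illustration (only the first comes from the example in this issue).

```python
def spec_and_bbox(collection: dict) -> dict:
    """Count triangles and compute the 3D bounding box of a FeatureCollection."""
    xs, ys, zs = [], [], []
    ntriangles = 0
    for feature in collection["features"]:
        ring = feature["geometry"]["coordinates"][0]  # outer ring of the polygon
        ntriangles += 1
        for x, y, z in ring:
            xs.append(x)
            ys.append(y)
            zs.append(z)
    if ntriangles == 0:
        raise ValueError("no triangles found in the collection")
    return {
        "spec": {"ntriangles": ntriangles},
        "bbox": {
            "xmin": min(xs), "xmax": max(xs),
            "ymin": min(ys), "ymax": max(ys),
            "zmin": min(zs), "zmax": max(zs),
        },
    }


def _triangle(a, b, c):
    """Build a minimal GeoJSON feature holding one closed triangle."""
    return {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [[a, b, c, a]]},
    }


sample = {
    "type": "FeatureCollection",
    "features": [
        # First triangle: coordinates from the example in this issue.
        _triangle(
            [422238.404, 7247491.896, 2277.951],
            [422241.016, 7247492.562, 2275.44],
            [422231.777, 7247516.006, 2275.822],
        ),
        # Second triangle: invented values, for illustration only.
        _triangle(
            [422241.016, 7247492.562, 2275.44],
            [422245.0, 7247520.0, 2270.0],
            [422231.777, 7247516.006, 2275.822],
        ),
    ],
}

result = spec_and_bbox(sample)
print(result)
```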

perolavsvendsen commented 1 year ago

For other datatypes that have a content-specific subtag under data, e.g. field_outline, fluid_contact and others, we have strictly used the same phrasing. So in the proposal above, suggest using data.faultplane (instead of data.faultdata).

perolavsvendsen commented 1 year ago

Also, this looks like GeoJSON? Would it make more sense to be explicit with class: geojson? (Or would that be handled by data.format: geojson?)

jcrivenaes commented 1 year ago

Also, this looks like GeoJSON? Would it make more sense to be explicit with class: geojson? (Or would that be handled by data.format: geojson?)

Certainly; we need to consider this. Having only JSON is probably too general.

perolavsvendsen commented 1 year ago

Should also probably consider lifting some key components out of the GeoJSON object and making it available in metadata. Example could be the geometry type. GeoJSON currently supports a specific list of geometry types, and I imagine that consumers would like to know before getting the data (?)
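A minimal sketch of what lifting the geometry type could look like, assuming the GeoJSON has already been parsed into a Python dictionary. The function name `geometry_types` is hypothetical; the set of allowed names is the seven geometry types defined by the GeoJSON specification (RFC 7946).

```python
from __future__ import annotations

# The seven geometry types defined by the GeoJSON specification (RFC 7946).
GEOJSON_GEOMETRY_TYPES = frozenset(
    {
        "Point",
        "MultiPoint",
        "LineString",
        "MultiLineString",
        "Polygon",
        "MultiPolygon",
        "GeometryCollection",
    }
)


def geometry_types(collection: dict) -> list[str]:
    """Return the sorted, distinct geometry types used in a FeatureCollection."""
    found = {
        feature["geometry"]["type"]
        for feature in collection["features"]
        if feature.get("geometry")  # a Feature may carry a null geometry
    }
    unknown = found - GEOJSON_GEOMETRY_TYPES
    if unknown:
        raise ValueError(f"not GeoJSON geometry types: {sorted(unknown)}")
    return sorted(found)


sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {}, "geometry": {"type": "Polygon", "coordinates": []}},
        {"type": "Feature", "properties": {}, "geometry": {"type": "Polygon", "coordinates": []}},
        {"type": "Feature", "properties": {}, "geometry": None},
    ],
}

types_in_file = geometry_types(sample)
print(types_in_file)
```

Where exactly such a list would live in the metadata is an open question in this thread; the sketch only shows how it could be extracted.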

perolavsvendsen commented 1 year ago

Notes based on offline discussions with the data producer 08.09.2023:

Sounds something like this:

class: polygons
data.content: faultroom # <-- too specific? Worried about other, future faultseal products...
data.format: geojson

...and then a content-specific tag that carries the needed information from the source, e.g.

data.faultroom:
   parameters: [list of parameters],
   juxtaposition:
      fw: [list of faults]
      hw: [list of faults]
   etc: etc
   etc: etc # can be expanded as needed

According to the data producer, each geojson file contains multiple triangles repeating downwards, where each contains e.g. the list of parameters - but this is consistent through the file.
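If the parameter list really is identical for every triangle, it can be read once and verified across the file before being placed in the content-specific tag. The sketch below assumes that `FaultID`, `HorizonID`, `SegmentID` and `TriangleID` are identifiers rather than parameters; that split, and the names `ID_KEYS` and `consistent_parameters`, are assumptions for illustration only.

```python
from __future__ import annotations

# Assumption: these property keys identify a triangle and are not parameters.
ID_KEYS = frozenset({"FaultID", "HorizonID", "SegmentID", "TriangleID"})


def consistent_parameters(collection: dict) -> list[str]:
    """Return the parameter names shared by all features, or raise if they differ."""
    features = collection["features"]
    if not features:
        raise ValueError("the collection has no features")
    expected = None
    for index, feature in enumerate(features):
        names = set(feature["properties"]) - ID_KEYS
        if expected is None:
            expected = names
        elif names != expected:
            raise ValueError(f"feature {index} has a different parameter set")
    return sorted(expected)


sample = {
    "type": "FeatureCollection",
    "features": [
        {"properties": {"FaultID": "CF_C2", "TriangleID": 0,
                        "Juxtaposition": 0.0, "permeability_max": 0.55}},
        {"properties": {"FaultID": "CF_C2", "TriangleID": 1,
                        "Juxtaposition": 1.0, "permeability_max": 0.61}},
    ],
}

# Draft of the content-specific tag; juxtaposition would come from the producer.
faultroom_tag = {"faultroom": {"parameters": consistent_parameters(sample)}}
print(faultroom_tag)
```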

They currently export "metadata" next to the data files, which is an extract of some of these things. However, it is in a plain-text format that is not well suited for machine reading. They also communicate the list of zones as an underscore-separated string, which will not work: zone names can (and frequently do) contain underscores. Suggested to get this in a more explicit format, for instance JSON.
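The underscore problem can be shown with a short Python sketch. The zone names `Top_Tilje` and `Base` are invented for illustration; the `fw`/`hw` keys follow the draft tag above.

```python
import json

# Hypothetical zone names; the first one contains an underscore.
zones = ["Top_Tilje", "Base"]

# Underscore-separated encoding: the original list cannot be recovered.
encoded = "_".join(zones)
recovered = encoded.split("_")
print(encoded, recovered)

# An explicit JSON encoding keeps each zone name intact.
explicit = json.dumps({"juxtaposition": {"fw": zones, "hw": zones}})
round_trip = json.loads(explicit)["juxtaposition"]["fw"]
print(round_trip)
```

Splitting the joined string yields three names where there were two, whereas the JSON list round-trips unchanged.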

perolavsvendsen commented 11 months ago

Status 18. Dec 2023, @ingunnr

Next steps probably:

perolavsvendsen commented 10 months ago

Discussions 15. Jan 2024

Next steps:

The FaultRoom use case differs from other RMS-related use cases, where the data is normally parsed directly from RMS and then exported. In this case, the data is exported out of RMS by another process, and fmu-dataio then needs to read it back from disk in order to export it.
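The read-back step could look roughly like the Python sketch below. It uses only the standard library: `read_back_and_export` is a hypothetical helper, and the `export` callable is a stand-in for whatever fmu-dataio ends up offering, not an existing fmu-dataio API.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def read_back_and_export(
    paths: Iterable[Path], export: Callable[[dict, Path], None]
) -> int:
    """Read GeoJSON files written by another process and pass each to `export`."""
    count = 0
    for path in paths:
        with open(path, encoding="utf-8") as stream:
            data = json.load(stream)
        if data.get("type") != "FeatureCollection":
            raise ValueError(f"{path} is not a GeoJSON FeatureCollection")
        export(data, path)  # placeholder for the actual export with metadata
        count += 1
    return count


# Demonstration with a temporary file standing in for the file exported from RMS.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "faultx.json"
    source.write_text(
        json.dumps({"type": "FeatureCollection", "features": []}), encoding="utf-8"
    )
    exported = []
    n_exported = read_back_and_export(
        [source], lambda data, path: exported.append(path.name)
    )

print(n_exported, exported)
```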

Current version of FaultRoom used by DynaGeo: 1.3.1rc5