equinor / ert

ERT - Ensemble based Reservoir Tool - is designed for running ensembles of dynamical models such as reservoir models, in order to do sensitivity analysis and data assimilation. ERT supports data assimilation using the Ensemble Smoother (ES), Ensemble Smoother with Multiple Data Assimilation (ES-MDA) and Iterative Ensemble Smoother (IES).
https://ert.readthedocs.io/en/latest/
GNU General Public License v3.0

Responses & observation handling could be more flexible #8034

Closed. yngve-sk closed this issue 1 month ago.

yngve-sk commented 4 months ago

Currently, different responses & observations are handled in a "hardcoded" way in storage. Now that observation parsing logic lives within response configs, it is within reach to make storage handle all responses & observations the same way. This opens the door to adding custom response & observation types to ert, which can make it easier for users to feed responses/observations into ert, potentially (very likely) also with far fewer associations between multiple files.

Furthermore, we also want ert storage to resemble a database, where responses/observations are just two tables with the same primary key and some different value columns, and where they are always joined into the S-matrix by what is equivalent to a simple database table join on that primary key. If responses are made generic at this level, we just tell ert to expect one table of responses with a certain primary key and provide some observations carrying the same primary key, and it should be able to combine/join these datasets regardless of what the primary key is, exactly how the response file is output by the forward model, or how the observations file is specified. Specifying these responses with "lifecycle hooks" gives the flexibility to accommodate different scenarios for how responses/observations come into ert storage.
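To make the "two tables joined on a primary key" idea concrete, here is a minimal sketch using pandas as a stand-in for whatever ert storage uses internally; the column names, the example key, and the resulting layout are illustrative assumptions, not ert's actual schema.

```python
import pandas as pd

# One table of responses, keyed on an arbitrary primary key
# (here "time", but it could just as well be a report step or an index).
responses = pd.DataFrame(
    {
        "response_key": ["FOPR", "FOPR", "FOPR"],
        "time": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
        "realization": [0, 0, 0],
        "value": [110.0, 120.0, 130.0],
    }
)

# One table of observations, sharing the same primary key columns.
observations = pd.DataFrame(
    {
        "response_key": ["FOPR", "FOPR"],
        "time": pd.to_datetime(["2020-01-01", "2020-03-01"]),
        "observed": [108.0, 131.5],
        "error": [5.0, 5.0],
    }
)

# Building the combined observations-and-responses table is then just an
# inner join on the primary key, regardless of what that key is.
combined = observations.merge(responses, on=["response_key", "time"], how="inner")
print(combined)
```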

dafeda commented 4 months ago

"This opens up for adding custom response & observation types to ert, which can make it easier for users to specify responses/observations into ert, potentially (very likely) also with a lot less associations between multiple files."

Mind elaborating a bit here? What do you mean by custom response and observation types?

yngve-sk commented 3 months ago

Sure, the most obvious example would be specifying some observations as a CSV, either giving the name of the response explicitly in the ert config, or giving just the file and expecting it to contain a column with the response name/key for each observation. So one kind of custom observation type would be something that reads in that CSV and matches it up to the correct responses, so that they show up in the observations_and_responses matrix.
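As a rough illustration of that CSV idea, the snippet below reads such a file with pandas; the file name and the column layout (response_key, time, observed, error) are assumptions made for the example, not an existing ert format.

```python
import pandas as pd

# Hypothetical observations.csv, where each row names the response it
# belongs to:
#
#   response_key,time,observed,error
#   FOPR,2020-01-01,108.0,5.0
#   WWCT:OP_1,2020-01-01,0.25,0.02
observations = pd.read_csv("observations.csv", parse_dates=["time"])

# Because every row carries its own response_key, these observations can be
# matched to responses with the same generic primary-key join sketched above,
# instead of needing a dedicated observation keyword per response type.
```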

Each response has a "lifecycle" that you can hook into as a "response developer":

  1. The "response developer" specifies which keywords it should add to the ert config grammar, so that they show up in the next step of the lifecycle: from_config_list
  2. It is entered into the ert config by the user, the "response developer" writes the logic of how that line is turned into a response config, the ert config line is sent to .from_config_list
  3. User runs ert forward model, ert puts stuff in runpath, for each realization we call .parse_response_from_config, which basically moves/transforms the data from the runpath to storage.

Beyond that, for each response/observation we specify a primary_key / join key / whatever it should be called, and this key is the only thing storage needs to know.
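A sketch of what such a "response developer" hook could look like, assuming a dataclass-based config and pandas dataframes; the method names (from_config_list, parse_response_from_config) and the primary_key notion come from this thread, but the CSVResponseConfig name, the exact signatures, and the keyword-registration details are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

import pandas as pd


@dataclass
class CSVResponseConfig:
    """A custom response type a "response developer" could plug into ert."""

    name: str
    input_file: str  # file the forward model writes into the runpath

    # Step 2 of the lifecycle: turn the user's ert config line into a
    # response config (the keyword itself would be registered in step 1).
    @classmethod
    def from_config_list(cls, config_list: list[str]) -> "CSVResponseConfig":
        name, input_file = config_list
        return cls(name=name, input_file=input_file)

    # Step 3: per realization, move/transform data from the runpath to storage.
    def parse_response_from_config(self, run_path: Path) -> pd.DataFrame:
        df = pd.read_csv(run_path / self.input_file, parse_dates=["time"])
        df["response_key"] = self.name
        return df

    # The only thing storage needs in order to join these responses with
    # observations into the combined matrix.
    @property
    def primary_key(self) -> list[str]:
        return ["response_key", "time"]
```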

oyvindeide commented 1 month ago

Related to: #8369