OpenEnergyPlatform / ontology

Repository for the Open Energy Ontology (OEO)
Creative Commons Zero v1.0 Universal

timestep, timehorizon, timeseries need new place and def #267

Closed akleinau closed 4 years ago

akleinau commented 4 years ago

Description of the issue

TimeStep, TimeHorizon and TimeSeries are currently variables and without a definition.

Ideas of solution

TimeStep: A TimeStep is a temporal region (?) stating the time between two calculations or measurements made.

TimeHorizon: A TimeHorizon is a temporal region (?) stating a specific point in time at which specific events will be reviewed or should end.

TimeSeries: A TimeSeries is a data set storing data indexed by time.

akleinau commented 4 years ago

delete to leave out in first release

Vera-IER commented 4 years ago

I don't like the term temporal region. I just googled it to see if that's something you can say in English and it's actually a term for a part of the brain ;-) We could maybe use model property or model characteristic instead. So the def would be: A time step is a model property that describes the time period between two calculations or measurements made. etc.

akleinau commented 4 years ago

temporal region is an already implemented class of the BFO with the definition: "A temporal region is an occurrent entity that is part of time as defined relative to some reference frame". In my opinion this fits better than model property, as it's already there and more specific about the main aspect of these concepts, the description of some time period

stap-m commented 4 years ago

I agree, classes time step and time horizon should be classified as 1-dimensional temporal regions. And they definitely need to be related to models / scenarios / time series.
Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

akleinau commented 4 years ago

on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

It is just a description of a portion of time, no fixed moment in time

I think there are two perspectives on time steps: on the one hand a description of a model or scenario (e.g. "how many time steps of what duration does this model calculate?"), and on the other hand the description of data (e.g. "to which time step does this specific datum belong?"). Since both need to be accommodated, the model description either needs to list all time steps that are calculated -- tedious for 35040 15-minute time-slices to a year -- or we need two different "time step" classes.
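For scale, the 35040 figure is simply the number of 15-minute slices in a non-leap year. A minimal Python sketch of that arithmetic (the variable names are illustrative, not OEO terms):

```python
from datetime import timedelta

# Assumption: a non-leap year of 365 days, split into homogeneous 15-minute steps.
year = timedelta(days=365)
step = timedelta(minutes=15)

number_of_steps = year // step  # timedelta // timedelta yields an int
print(number_of_steps)  # 35040
```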

For the "data perspective", a time step is defined by start and end, because it is a fixed moment in time (or rather, a region of time). The denomination of the time steps is arbitrary and differs between fields. E.g. in meteorology, observations are designated by the end-time of the time interval (so for 15-minute time steps, 08:15 would be the observations between 08:00 and 08:15, if I remember this discussion correctly @carstenhoyerklick), whereas the IPCC data sets for the Assessment Reports use the mid-point (so 2035 refers to data from the beginning of 2033 to the end of 2037). So

I don't know what the crucial characteristics for the "model perspective" are. I guess number and size of time steps. I just want to point out that these need not be homogeneous. Our model uses five-, ten- and twenty-year time steps at the same time (because it is computationally cheaper and the temporal resolution is important for the short term, but not so much when you look 100 years out).

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Related to OpenEnergyPlatform/ontology#474.

akleinau commented 4 years ago

these are two perspectives, yes, one that looks at a single time step and one that looks at all time steps in a model. The underlying concept of „timestep“ remains the same though and should be treated as one concept. That one can have the start and end time you proposed, which are used directly for your data perspective. It can also have a duration property, so to describe your model perspective we can just relate the model to the time steps, e.g. „model has x instances of the timestep class“ and those instances have y duration.

So

stap-m commented 4 years ago

on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time

No, I think time step itself is still a 1-dimensional temporal region. But we should add further relations and quantity values. For example

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago
  • one class: Timestep with start, end, duration

This makes me cringe a little, because duration would be derivative to start time and end time, allowing for inconsistent definitions. Would there be a "generic" time step, that only has duration, but no position in time? What would be the use case for it?
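One way to sidestep that inconsistency is to store only start and end and derive the duration. The Python sketch below assumes exactly that; the `TimeStep` class and its field names are hypothetical and not taken from the OEO:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeStep:
    """Hypothetical time step defined only by its start and end."""
    start: datetime
    end: datetime

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end must not be earlier than start")

    @property
    def duration(self) -> timedelta:
        # Derived on demand, so it can never contradict start and end.
        return self.end - self.start


step = TimeStep(
    start=datetime(2020, 1, 1, 8, 0, tzinfo=timezone.utc),
    end=datetime(2020, 1, 1, 8, 15, tzinfo=timezone.utc),
)
print(step.duration)  # 0:15:00
```

A "generic" time step with a duration but no position in time cannot be expressed this way, which is the trade-off under discussion.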

akleinau commented 4 years ago

use case would be that instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration (that is typically the same for all). Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration

One doesn't need a class for that. One could also just give the model number of time steps and duration of time step attributes.

(that is typically the same for all)

This domain ontology is a joint effort to represent the typical energy-system modelling context based on standard terminologies used by human experts in this field of research.

Fixed. 😎 But probably it makes sense to allow models to either just state the number and duration of time steps, or give a comprehensive list. FYI: there are power sector models using different time step setups depending on the data they run on.

Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

I'm not sure if this has to be addressed in the ontology. Users will always find a way to define nonsensical data. If a "generic time step" having only duration proves useful, then go for it and make the attributes optional. If not, personally, I would default to start time and end time, but in the end it doesn't matter.

stap-m commented 4 years ago

Ok, here comes a differentiation to not mix up time series and time step: We need time series with

We need time step with

Do you agree?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Do you agree?

I for one do in principle agree, but this is not sufficient.

tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

  • has quantity value start time and (optional?) end time
  • has quantity value number of time steps

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

A concrete example: the REMIND model uses 19 time steps that vary between five and 20 years in duration. We don't care about the precise start and end times and durations in our work, especially not down to the second, but this would be the formal definition.

| number | period | start time | end time | duration |
|---|---|---|---|---|
| 1 | 2005 | 2003-01-01 00:00:00 UTC | 2008-01-01 00:00:00 UTC | 5 years |
| 2 | 2010 | 2008-01-01 00:00:00 UTC | 2013-01-01 00:00:00 UTC | 5 years |
| 3 | 2015 | 2013-01-01 00:00:00 UTC | 2018-01-01 00:00:00 UTC | 5 years |
| 4 | 2020 | 2018-01-01 00:00:00 UTC | 2023-01-01 00:00:00 UTC | 5 years |
| 5 | 2025 | 2023-01-01 00:00:00 UTC | 2028-01-01 00:00:00 UTC | 5 years |
| 6 | 2030 | 2028-01-01 00:00:00 UTC | 2033-01-01 00:00:00 UTC | 5 years |
| 7 | 2035 | 2033-01-01 00:00:00 UTC | 2038-01-01 00:00:00 UTC | 5 years |
| 8 | 2040 | 2038-01-01 00:00:00 UTC | 2043-01-01 00:00:00 UTC | 5 years |
| 9 | 2045 | 2043-01-01 00:00:00 UTC | 2048-01-01 00:00:00 UTC | 5 years |
| 10 | 2050 | 2048-01-01 00:00:00 UTC | 2053-01-01 00:00:00 UTC | 5 years |
| 11 | 2055 | 2053-01-01 00:00:00 UTC | 2058-01-01 00:00:00 UTC | 5 years |
| 12 | 2060 | 2058-01-01 00:00:00 UTC | 2065-07-02 12:00:00 UTC | 7.5 years |
| 13 | 2070 | 2065-07-02 12:00:00 UTC | 2075-07-02 12:00:00 UTC | 10 years |
| 14 | 2080 | 2075-07-02 12:00:00 UTC | 2085-07-02 12:00:00 UTC | 10 years |
| 15 | 2090 | 2085-07-02 12:00:00 UTC | 2095-07-02 12:00:00 UTC | 10 years |
| 16 | 2100 | 2095-07-02 12:00:00 UTC | 2105-07-02 12:00:00 UTC | 10 years |
| 17 | 2110 | 2105-07-02 12:00:00 UTC | 2120-07-02 12:00:00 UTC | 15 years |
| 18 | 2130 | 2120-07-02 12:00:00 UTC | 2140-07-02 12:00:00 UTC | 20 years |
| 19 | 2150 | 2140-07-02 12:00:00 UTC | 2167-07-02 12:00:00 UTC | 27 years |

Other IAMs do this differently. E.g.

The carbon price for a given model year t is usually assumed to be constant over the length of the time step Δt (either from time t-1 to t or from t-Δt/2 to t+Δt/2, depending on the model).

(From the Model Diagnostic Exercise – Study Protocol of the ADVANCE Project)

The Assessment Report data of the IPCC on the other hand is not specific about what a time step like 2030 actually refers to. But one interpretation (held by the people at IIASA, who are hosting the data) is that it denotes that specific year, in which case the time series is not continuous, but has ten-year gaps in-between:
2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041
Practically, it makes little difference and is interpolated away. But the ontology must be able to represent it.
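Both properties can be checked mechanically once every time step has an explicit start and end. The Python sketch below is only an illustration: the helper names `is_homogeneous` and `is_continuous` are made up here, and the data are rows 11 to 13 of the REMIND table plus the single-year reading of the labels 2030 and 2040.

```python
from datetime import datetime, timezone


def utc(*args):
    """Shorthand for a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


def is_homogeneous(steps):
    """True if all (start, end) pairs have exactly the same duration."""
    durations = {end - start for start, end in steps}
    return len(durations) <= 1


def is_continuous(steps):
    """True if each step starts exactly where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(steps, steps[1:]))


# Rows 11 to 13 of the REMIND table: 5, 7.5 and 10 years long.
remind = [
    (utc(2053, 1, 1), utc(2058, 1, 1)),
    (utc(2058, 1, 1), utc(2065, 7, 2, 12)),
    (utc(2065, 7, 2, 12), utc(2075, 7, 2, 12)),
]

# "2030" and "2040" read as single calendar years, with a gap in between.
ipcc = [
    (utc(2030, 1, 1), utc(2031, 1, 1)),
    (utc(2040, 1, 1), utc(2041, 1, 1)),
]

print(is_homogeneous(remind), is_continuous(remind))  # False True
print(is_continuous(ipcc))  # False
```

The duration comparison is exact, so nominally equal steps that differ by a leap day would also count as inhomogeneous; a real implementation would probably need calendar-aware durations.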

We need time step with

  • has quantity value duration
  • optional has quantity value start time and end time

I still don't see the use for the generic time step, but sure.

--> and it lies in the responsibility of the users to not assign nonsense?

Sure.

stap-m commented 4 years ago

tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

I agree. Any ideas for further relations?

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

Indeed. If the relation is has part some time step, then several, also inhomogeneous, time steps could be assigned. Please correct me if I am getting this wrong @akleinau.

I still don't see the use for the generic time step, but sure.

What would be your choice?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

What would be your choice?

One could also just give the model number of time steps and duration of time step attributes.

stap-m commented 4 years ago

One could also just give the model number of time steps and duration of time step attributes.

So-called "attributes" (not really an ontological term, though) are also implemented as classes (often as dependent continuants) that are related via properties (e.g. has quantity value, has part, ...) to an independent continuant. See also wiki and OpenEnergyPlatform/oeo-extended#5. I don't know if it's even possible to implement one class number of time steps. How to classify? "number of" is dependent on something, e.g. time steps. Same for "duration of". And, if you have generic classes number, time step, duration, ... you can also reuse them for other purposes.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

How do you plan on representing the hub height of a wind turbine (to use a popular example from the dev meetings)? Is there going to be a class for it, with subclasses for every possible hub height?

stap-m commented 4 years ago

Is there going to be a class for it, with subclasses for every possible hub height?

Yes, there is going to be a class hub height, defining the concept of hub height. But instead of having subclasses (or rather instances) of possible hub height values, this class hub height would be related via has quantity value to a quantity value length value (or height value, or whatever it ends up being called), which is related per definition to a value (data property, e.g. 140) and a unit (e.g. metre). Whether the value instances of length value will be stored within the OEO or not has to be discussed. I can't answer that yet. But I'll put this question on the agenda for the developer meeting on Thursday.

l-emele commented 4 years ago

Whether the value instances of length value will be stored within the OEO or not has to be discussed.

My understanding is that data like that will not be stored in the OEO but in the OEP database as one major use case of the OEO is to annotate the data in the OEP.

l-emele commented 4 years ago

Ping @christian-rli : Any thoughts?

akleinau commented 4 years ago

this discussion strayed away a bit from the original topic. Hub height got discussed and implemented last dev meeting. This should have cleared up how classes are used in an ontology. So from that ontology perspective the explanation of @stap-m with generic classes number, time step, duration... is the common and easy one.

So again the proposition of @stap-m:

We need time series with

has quantity value start time and (optional?) end time
has quantity value number of time steps
has part some time step

We need time step with

has quantity value duration
optional has quantity value start time and end time --> and it lies in the responsibility of the users to not assign nonsense?

and the question for further relations needed for time step to make it completely identifiable?

stap-m commented 4 years ago

I put some more thought into this:

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

So nay to process boundary, yay to everything else.

akleinau commented 4 years ago

s-depends on means specifically-depends-on, which means the process can't exist without the material entity. But I agree, then they are zero-dim temporal regions.

akleinau commented 4 years ago

so relations to the defs above:

@stap-m if you agree I would implement? Should I wait for the 1.1 release before starting?

stap-m commented 4 years ago

I agree. Further:

I am ok with implementing. But please leave this issue open. We didn't discuss time horizon and I guess time series is not yet finished.

carstenhoyerklick commented 4 years ago

Hi everybody, sorry for entering the discussion late, but we may also need to look at another term which may be important in this discussion, which is time stamp.

We discussed start time and end time, which actually is useful to make sure a time step is interpreted the right way. But many time series only have one piece of time information. Meteorologists use in most cases the end of the interval (as mentioned above). So meteorological values with a time stamp of 12:00h and a time step of 1h usually describe a value which is an integral or average from measurements between 11:01h and 12:00h (if it is minute data). Some data sets would use 11:30 for the 11-12h value, some use 11h. To be able to interpret a time series value, it is very important to know how the time interval is actually defined in the data set. To have complete information you either need:

The latter would apply if it is really an instantaneous measurement (e.g. if you would only measure wind speed or radiation for a fraction of a second and don't care what happened in between)

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

To have a complete information you [need] time stamp, duration and a definition of time stamp (begin, middle,end, instantaneous)

Doesn't that all boil down to start time and end time?

Some data sets would use 11:30 for the 11-12h value, some use 11h.

To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by what ever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".

Concerning https://github.com/OpenEnergyPlatform/ontology/issues/362#issuecomment-693338836

Irradiation definitely needs a time step over which it has been recorded.

I would argue that an irradiation measurement must mention the time interval it has been integrated over, in order to be comparable (or most likely be found incomparable) to other measurements. But a "time interval" (generic duration of time that can happen any‑when in time; not a term in the ontology) is different from a time step, which is a specific region in time, defined by beginning and end.

carstenhoyerklick commented 4 years ago

I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Doesn't that all boil down to start time and end time?

Some data sets would use 11:30 for the 11-12h value, some use 11h.

To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by what ever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".

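In an ideal world yes, if we could force everyone to use start time and end time, that would be perfect. The second best is to let them define the duration and the time stamp, then it would be equivalent. So these are alternatives.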

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.

🤔

In an ideal world yes, if we could force everyone to <start time> and <end time> 
that would be perfect. The second best ist let them define the <duration> and 
the <time stamp>, then it would be equivalent. So it is alternatives. 
  1. Don't make up your own HTML tags ;) (Or maybe tell your e-mail client not to.)

  2. But we can force them, by structuring the ontology in that way! Data providers will have to annotate their data set in any case, and hopefully will do so programmatically. Extracting start time and end time is only marginally more elaborate than extracting time stamp and duration,

    switch (timestamp_meaning) {
    case TS_BEGIN:
        start_time = timestamp;
        end_time   = timestamp + duration;
        break;
    
    case TS_END:
        start_time = timestamp - duration;
        end_time   = timestamp;
        break;
    
    case TS_MIDDLE:
        start_time = timestamp - duration / 2;
        end_time   = timestamp + duration / 2;
        break;
    
    case TS_INSTANTANEOUS:
        start_time = timestamp;
        end_time   = timestamp;
        break;
    }

and anybody who can't manage that will fail several times over in other areas with the ontology.

  3. (Arguing from a LOD-GEOSS perspective):
    If we were to have multiple definitions of time steps (start time and end time [SE] or time stamp, duration, meaning of time stamp [TsDM]), there would have to be a conversion between them on the Databus in any case, in order for users to extract data in their preferred format. So the [TsDM] → [SE] conversion could also be used up front, allowing easy uploading to the Databus, and not burdening the ontology with two equivalent yet different definitions.
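For readers who prefer Python, the [TsDM] → [SE] conversion could be sketched as follows. The `Alignment` enum and the function name `to_start_end` are hypothetical and merely mirror the cases of the switch statement above; they are not part of the OEO or the Databus, and the date in the example is arbitrary.

```python
from datetime import datetime, timedelta
from enum import Enum


class Alignment(Enum):
    """Hypothetical labels for what a time stamp refers to."""
    BEGIN = "begin"
    MIDDLE = "middle"
    END = "end"
    INSTANTANEOUS = "instantaneous"


def to_start_end(timestamp, duration, alignment):
    """Convert (time stamp, duration, meaning) to (start time, end time)."""
    if alignment is Alignment.BEGIN:
        return timestamp, timestamp + duration
    if alignment is Alignment.END:
        return timestamp - duration, timestamp
    if alignment is Alignment.MIDDLE:
        return timestamp - duration / 2, timestamp + duration / 2
    if alignment is Alignment.INSTANTANEOUS:
        return timestamp, timestamp
    raise ValueError(f"unknown alignment: {alignment!r}")


# The "11:30" label from the discussion, read as the middle of a one-hour step.
start, end = to_start_end(
    datetime(2020, 6, 1, 11, 30), timedelta(hours=1), Alignment.MIDDLE
)
print(start.time(), end.time())  # 11:00:00 12:00:00
```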

I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Good point. I yield to the expert ;)

carstenhoyerklick commented 4 years ago

Sorry I edited directly on GitHub, .. I misinterpreted the coding style.

I am mostly convinced, with one exception: we can force this for new data sets, but what do we do if we want to annotate old existing data sets? Convert them to start time and end time and republish them?

carstenhoyerklick commented 4 years ago

p.s. also holds true for platforms that already regularly publish data as transparency platforms. We would need to force them to different publishing formats.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Convert them to start time and end time and republish them?

You seem to assume some automatic connection between the data set and the ontology. As I understand it, all these connections have to be explicitly stated, so "old" and "new" makes no difference.

but what do we do if we want to annotate old existing data sets

What do these data sets look like? I don't know, but I imagine something like

| geographic reference | time stamp | Irradiation |
|---|---|---|
| some grid square | 10:30 | some number |
| some grid square | 11:30 | some number |
| some grid square | 12:30 | some number |
| ... | ... | ... |

with meta data attached specifying that "time stamp" means "the middle of the time step" and "time step duration is 1 hour". Then there has to be a link detailing the connection between the data item "11:30" and the ontology element "a time step that has some part start time (11:01) and has some part end time (12:00)."

I'm not sure how this connection is to be made (see LOD-GEOSS Redmine, maybe @Ludee can clarify), but I don't see any difference between "old" and "new" data sets. If the data set had the format

| geographic reference | start time | end time | Irradiation |
|---|---|---|---|
| some grid square | 10:01 | 11:00 | some number |
| some grid square | 11:01 | 12:00 | some number |
| some grid square | 12:01 | 13:00 | some number |
| ... | ... | ... | ... |

there would still have to be a link saying that this specific line concerns "a time step that has some part start time (11:01) and has some part end time (12:00)".

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Also holds true for platforms that already regularly publish data as transparency platforms. We would need to force them to different publishing formats.

Maybe this needs a wider discussion or explanation on an OEO dev call. My understanding (that might be incorrect) is that the ontology is agnostic to the format data is in, but used to annotate the meaning of the data.

But if different formats are needed, isn't (automatic) republishing in "better" formats a core feature of the Databus? ;)

carstenhoyerklick commented 4 years ago

To my understanding, we may need all the definitions to be able to annotate all the data, also before republishing it. To be able to interpret the time information, we may also need the time stamp and duration concepts for those who are not willing or able to include the start time and end time information in their data sets.

stap-m commented 4 years ago

time stamp definitely is a common term among energy system modelers, thus it should be part of the OEO. We had a discussion about time stamps when we implemented time series metadata for the OEP. For the metadata we solved it like this (here's the full example file):

"temporal": {
        "referenceDate": "2016-01-01",
        "timeseries": {
            "start": "2017-01-01T00:00+01",
            "end": "2017-12-31T23:00+01",
            "resolution": "1 h",
            "alignment": "left",
            "aggregationType": "sum"
        }
}

alignment means 11:01 (left) or 11:30 (centre) or 12:00 (right), referring to the above-mentioned example. Aggregation type could be sum/integrated, mean or instantaneous. Maybe this could help with a solution.
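As a rough illustration of how such metadata could be expanded into explicit time steps, here is a Python sketch. It assumes that `start` and `end` are the first and last time stamps (inclusive), that "1 h" means a fixed one-hour resolution, and that left alignment means each stamp marks the start of its step. The function name `expand_left_aligned` is made up for this example, and the values are typed in by hand rather than parsed from the JSON.

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=1))  # the "+01" offset from the metadata
start = datetime(2017, 1, 1, 0, 0, tzinfo=tz)
end = datetime(2017, 12, 31, 23, 0, tzinfo=tz)
resolution = timedelta(hours=1)


def expand_left_aligned(start, end, resolution):
    """Return (start time, end time) pairs for left-aligned time stamps."""
    count = (end - start) // resolution + 1  # both end points are stamps
    stamps = (start + i * resolution for i in range(count))
    return [(stamp, stamp + resolution) for stamp in stamps]


steps = expand_left_aligned(start, end, resolution)
print(len(steps))  # 8760
```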

What's also missing is a concept for time standards like UTC, CET, ...

carstenhoyerklick commented 4 years ago

I think this is a good way with alignment and aggregation type.

sfluegel05 commented 4 years ago

Currently we have (among others) the following axioms for time series:

In order to make time stamp usable we should replace them with (has part some start time and has part some end time) or (has part some time stamp and has part some alignment)

We also need definitions for the classes:

carstenhoyerklick commented 4 years ago

In general, it sounds very reasonable, only that the second option also needs a duration, so it would be something like ... or (has part some time stamp and has part some duration). Otherwise ´center alignment´ and ´right alignment´ are undefined.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)
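Read as a rule for annotations, this axiom says a time step is fully described by either of two sets of parts. A small Python sketch of that check, with plain strings standing in for the ontology classes (the function name `is_fully_described` is hypothetical):

```python
# Either of these sets of parts is enough to pin down a time step.
START_END = {"start time", "ending time"}
STAMP_BASED = {"time stamp", "duration", "alignment"}


def is_fully_described(parts):
    """True if the given parts satisfy one of the two alternatives."""
    parts = set(parts)
    return START_END <= parts or STAMP_BASED <= parts


print(is_fully_described({"start time", "ending time"}))  # True
print(is_fully_described({"time stamp", "alignment"}))    # False
```

The second call fails because the duration is missing, which is the gap pointed out in the previous comment.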

sfluegel05 commented 4 years ago

~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)

Yes, but also time series. We should add this relation for time step and time series.

akleinau commented 4 years ago

this issue has 42 comments. Maybe it is a good idea to define an upper limit like 30 comments, after which an issue should be discussed in a dev meeting as it got too complex?

carstenhoyerklick commented 4 years ago

I think, we are more or less done in this discussion. I think we may just call it to a close in the next dev meeting.

Ludee commented 4 years ago

OEO-TimeSeries

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Yes, but also time series. We should add this relation for time step and time series.

For start time and end time I agree. But does anybody use time stamps for entire time series?

sfluegel05 commented 4 years ago

For start time and end time I agree. But does anybody use time stamps for entire time series?

Maybe we can leave that part out at the moment. We can still implement it when we find someone who does use it.

After we discussed this issue in dev-meeting 10, I will implement the part concerning time stamp. Also, I suggest to open two new issues, one about aggregation which is needed to describe a time step and another about start time and end time which don't work the way they should at the moment.