timestep, timehorizon, timeseries need new place and def

akleinau commented 4 years ago

Description of the issue

TimeStep, TimeHorizon and TimeSeries are currently variables without a definition.

Ideas of solution

TimeStep: A TimeStep is a temporal region (?) stating the time between two calculations or measurements made.

or: turn into properties has_temporal_resolution, has_number_of_time_steps

TimeHorizon: A TimeHorizon is a temporal region (?) stating a specific point in time at which specific events will be reviewed or should end.

TimeSeries: A TimeSeries is a data set storing data indexed by time.

Workflow checklist

[ ] I discussed the issue with someone else than me before working on a solution
[ ] I already read the latest version of the workflow for this repository
[x] I added this issue to the Project 'Issues'. If suitable, I add it to further Projects.
[ ] The goal of this ontology is clear to me

I am aware that

[ ] every entry in the ontology should have an annotation
[ ] classes should arise from concepts rather than from words
[ ] class or property names should follow the UpperCamelCase

akleinau commented 4 years ago

delete to leave out in first release

Vera-IER commented 4 years ago

I don't like the term temporal region. I just googled it to see whether that's something you can say in English, and it's actually a term for a part of the brain ;-) We could maybe use model property or model characteristic instead. So the def would be: A time step is a model property that describes the time period between two calculations or measurements made. etc.

akleinau commented 4 years ago

temporal region is an already implemented class of the BFO with the definition: "A temporal region is an occurrent entity that is part of time as defined relative to some reference frame". In my opinion this fits better than model property, as it's already there and is more specific about the main aspect of these concepts: the description of some time period.

stap-m commented 4 years ago

I agree, classes time step and time horizon should be classified as 1-dimensional temporal regions. And they definitely need to be related to models / scenarios / time series.
Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

akleinau commented 4 years ago

on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Also, time step needs to be related to a quantity value (e.g. time step = 15 min).

It is just a description of a portion of time, no fixed moment in time

I think there are two perspectives on time steps: on the one hand a description of a model or scenario (e.g. "how many time steps of what duration does this model calculate?"), and on the other hand the description of data (e.g. "to which time step does this specific datum belong?"). Since both need to be accommodated, the model description either needs to list all time steps that are calculated -- tedious for 35040 15-minute time-slices to a year -- or we need two different "time step" classes.

For the "data perspective", a time step is defined by start and end, because it is a fixed moment in time (or rather, a region of time). The denomination of the time steps is arbitrary and differs between fields. E.g. in meteorology, observations are designated by the end-time of the time interval (so for 15-minute time steps, 08:15 would be the observations between 08:00 and 08:15, if I remember this discussion correctly @carstenhoyerklick), whereas the IPCC data sets for the Assessment Reports use the mid-point (so 2035 refers to data from the beginning of 2033 to the end of 2037). So

time step [data perspective] would have two attributes
- start time
- end time

I don't know what the crucial characteristics for the "model perspective" are. I guess number and size of time steps. I just want to point out that these need not be homogeneous. Our model uses five-, ten- and twenty-year time steps at the same time (because it is computationally cheaper and the temporal resolution is important for the short term, but not so much when you look 100 years out).

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Related to OpenEnergyPlatform/ontology#474.

akleinau commented 4 years ago

these are two perspectives, yes, one that looks at a single time step and one that looks at all time steps in a model. The underlying concept of „timestep“ remains the same though and should be treated as one concept. That one can have the start and end time you proposed which are used directly for your data perspective. It can also have a duration property, so to describe your model perspective we can just relate the model to the time steps, eg „model has x instances of the timestep class“ and those instances have y duration.

So

one class: Timestep with start, end, duration
model with: has exactly x timesteps, which have y duration

stap-m commented 4 years ago

on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time

No, I think time step itself is still a 1-dimensional temporal region. But we should add further relations and quantity values. For example

time step has quantity value duration
and / or time step has quantity value start time and end time

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

one class: Timestep with start, end, duration

This makes me cringe a little, because duration would be derivative to start time and end time, allowing for inconsistent definitions. Would there be a "generic" time step, that only has duration, but no position in time? What would be the use case for it?
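The inconsistency concern can be illustrated with a minimal Python sketch in which a time step stores only start and end, and duration is computed from them. The class name `TimeStep` and the example date are assumptions made for illustration; this is not OEO code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeStep:
    """A time step fixed in time by its start and end (hypothetical class)."""
    start: datetime
    end: datetime

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end must not precede start")

    @property
    def duration(self) -> timedelta:
        # Derived, never stored, so it cannot disagree with start and end.
        return self.end - self.start


# The 08:00 to 08:15 interval from the meteorology example; the date is made up.
step = TimeStep(
    start=datetime(2020, 1, 1, 8, 0, tzinfo=timezone.utc),
    end=datetime(2020, 1, 1, 8, 15, tzinfo=timezone.utc),
)
```

Because `duration` is a derived property here, a "generic" time step with a duration but no position in time cannot be expressed with this class; that would need a separate representation.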

akleinau commented 4 years ago

use case would be that instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration (that is typically the same for all). Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration

One doesn't need a class for that. One could also just give the model number of time steps and duration of time step attributes.

(that is typically the same for all)

This domain ontology is a joint effort to represent the typical energy-system modelling context based on standard terminologies used by human experts in this field of research.

Fixed. 😎 But probably it makes sense to allow models to either just state the number and duration of time steps, or give a comprehensive list. FYI: there are power sector models using different time step setups depending on the data they run on.
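As a rough sketch of the two options, a compact description (start, number of time steps, duration) can always be expanded into the comprehensive list, while the reverse only works for uniform steps. The helper name `uniform_steps` is hypothetical, and the year 2019 is an arbitrary non-leap year chosen only for this example.

```python
from datetime import datetime, timedelta, timezone


def uniform_steps(start, count, duration):
    """Expand a compact description (start, count, duration) into an
    explicit list of (start, end) pairs. Hypothetical helper."""
    return [(start + i * duration, start + (i + 1) * duration)
            for i in range(count)]


# Compact description: 15-minute steps covering one non-leap year.
# The year 2019 is an assumption made only for this example.
year_start = datetime(2019, 1, 1, tzinfo=timezone.utc)
quarter_hour = timedelta(minutes=15)
count = 365 * 24 * 4  # 35040, the figure quoted earlier in the thread

steps = uniform_steps(year_start, count, quarter_hour)
```

The last generated step ends exactly at the start of the following year, which is a quick sanity check on the count.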

Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?

I'm not sure if this has to be addressed in the ontology. Users will always find a way to define nonsensical data. If a "generic time step" having only duration proves useful, then go for it and make the attributes optional. If not, personally, I would default to start time and end time, but in the end it doesn't matter.

stap-m commented 4 years ago

Ok, here comes a differentiation to not mix up time series and time step: We need time series with

has quantity value start time and (optional?) end time
has quantity value number of time steps
has part some time step

We need time step with

has quantity value duration
optional has quantity value start time and end time --> and it lies in the responsibility of the users to not assign nonsense?

Do you agree?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Do you agree?

I for one do in principle agree, but this is not sufficient.

tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

has quantity value start time and (optional?) end time

has quantity value number of time steps

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

A concrete example: the REMIND model uses 19 time steps that vary between five and 20 years in duration. We don't care about the precise start and end times and durations in our work, especially not down to the second, but this would be the formal definition.

number	period	start time	end time	duration
1	2005	2003-01-01 00:00:00 UTC	2008-01-01 00:00:00 UTC	5 years
2	2010	2008-01-01 00:00:00 UTC	2013-01-01 00:00:00 UTC	5 years
3	2015	2013-01-01 00:00:00 UTC	2018-01-01 00:00:00 UTC	5 years
4	2020	2018-01-01 00:00:00 UTC	2023-01-01 00:00:00 UTC	5 years
5	2025	2023-01-01 00:00:00 UTC	2028-01-01 00:00:00 UTC	5 years
6	2030	2028-01-01 00:00:00 UTC	2033-01-01 00:00:00 UTC	5 years
7	2035	2033-01-01 00:00:00 UTC	2038-01-01 00:00:00 UTC	5 years
8	2040	2038-01-01 00:00:00 UTC	2043-01-01 00:00:00 UTC	5 years
9	2045	2043-01-01 00:00:00 UTC	2048-01-01 00:00:00 UTC	5 years
10	2050	2048-01-01 00:00:00 UTC	2053-01-01 00:00:00 UTC	5 years
11	2055	2053-01-01 00:00:00 UTC	2058-01-01 00:00:00 UTC	5 years
12	2060	2058-01-01 00:00:00 UTC	2065-07-02 12:00:00 UTC	7.5 years
13	2070	2065-07-02 12:00:00 UTC	2075-07-02 12:00:00 UTC	10 years
14	2080	2075-07-02 12:00:00 UTC	2085-07-02 12:00:00 UTC	10 years
15	2090	2085-07-02 12:00:00 UTC	2095-07-02 12:00:00 UTC	10 years
16	2100	2095-07-02 12:00:00 UTC	2105-07-02 12:00:00 UTC	10 years
17	2110	2105-07-02 12:00:00 UTC	2120-07-02 12:00:00 UTC	15 years
18	2130	2120-07-02 12:00:00 UTC	2140-07-02 12:00:00 UTC	20 years
19	2150	2140-07-02 12:00:00 UTC	2167-07-02 12:00:00 UTC	27 years
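To make the two properties concrete, the following Python sketch checks contiguity and homogeneity for rows 10 to 13 of the table. The helper names `utc`, `is_contiguous` and `is_homogeneous` are made up for this illustration.

```python
from datetime import datetime, timezone


def utc(year, month=1, day=1, hour=0):
    """Shorthand for a timezone-aware UTC datetime."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


# (period label, start, end) for rows 10 to 13 of the table above.
steps = [
    (2050, utc(2048), utc(2053)),
    (2055, utc(2053), utc(2058)),
    (2060, utc(2058), utc(2065, 7, 2, 12)),
    (2070, utc(2065, 7, 2, 12), utc(2075, 7, 2, 12)),
]


def is_contiguous(steps):
    """True if every step starts exactly where the previous one ends."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(steps, steps[1:]))


def is_homogeneous(steps):
    """True if all steps have exactly the same duration."""
    durations = {end - start for _, start, end in steps}
    return len(durations) == 1


print(is_contiguous(steps))   # True
print(is_homogeneous(steps))  # False
```

Note that the check compares exact durations: with calendar dates, even the nominally identical five-year steps differ by a day, because they contain different numbers of leap years.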

Other IAMs do this differently. E.g.

The carbon price for a given model year t is usually assumed to be constant over the length of the time step Δt (either from time t-1 to t or from t-Δt/2 to t+Δt/2, depending on the model).

(From the Model Diagnostic Exercise – Study Protocol of the ADVANCE Project)

The Assessment Report data of the IPCC on the other hand is not specific about what a time step like 2030 actually refers to. But one interpretation (held by the people at IIASA, who are hosting the data) is that it denotes that specific year, in which case the time series is not continuous, but has ten-year gaps in-between:
… 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 …
Practically, it makes little difference and is interpolated away. But the ontology must be able to represent it.

We need time step with

has quantity value duration

optional has quantity value start time and end time

I still don't see the use for the generic time step, but sure.

--> and it lies in the responsibility of the users to not assign nonsense?

Sure.

stap-m commented 4 years ago

tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.

I agree. Any ideas for further relations?

If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.

Indeed. If the relation is has part some time step, then several, also inhomogeneous, time steps could be assigned. Please correct me if I am getting this wrong @akleinau.

I still don't see the use for the generic time step, but sure.

What would be your choice?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

What would be your choice?

One could also just give the model number of time steps and duration of time step attributes.

stap-m commented 4 years ago

One could also just give the model number of time steps and duration of time step attributes.

So-called "attributes" (not really an ontological term, though) are also implemented as classes (often as dependent continuants) that are related via properties (e.g. has quantity value, has part, ...) to an independent continuant. See also wiki and OpenEnergyPlatform/oeo-extended#5. I don't know if it's even possible to implement a single class number of time steps. How would it be classified? "number of" is dependent on something, e.g. time steps. Same for "duration of". And if you have generic classes number, time step, duration, ... you can also reuse them for other purposes.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

How do you plan on representing the hub height of a wind turbine (to use a popular example from the dev meetings)? Is there going to be a class for it, with subclasses for every possible hub height?

stap-m commented 4 years ago

Is there going to be a class for it, with subclasses for every possible hub height?

Yes, there is going to be a class hub height, defining the concept of hub height. But instead of having subclasses (or rather instances) of possible hub height values, this class hub height would be related via has quantity value to a quantity value length value (or height value, or whatever it is called), which is related by definition to a value (data property, e.g. 140) and a unit (e.g. metre). Whether the value instances of length value will be stored within the OEO or not has to be discussed. I can't answer that yet. But I'll put this question on the agenda for the developer meeting on Thursday.
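As a loose analogy only, the pattern described here (one concept, related to a quantity value that carries a number and a unit) might look like this in Python. The class names `QuantityValue` and `HubHeight` are illustrative stand-ins, not OEO identifiers, and Python classes are only an approximation of ontology classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityValue:
    """A numerical value together with its unit."""
    value: float
    unit: str


@dataclass(frozen=True)
class HubHeight:
    """The concept 'hub height', linked to one quantity value.

    The attribute name mirrors the 'has quantity value' relation; the
    Python classes themselves are illustrative, not OEO code.
    """
    has_quantity_value: QuantityValue


# One concrete hub height of 140 metres, the example value from the comment.
hub_height = HubHeight(QuantityValue(value=140, unit="metre"))
```

There is one `HubHeight` class regardless of how many different heights exist; the individual values live in the instances, not in subclasses.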

l-emele commented 4 years ago

Whether the value instances of length value will be stored within the OEO or not has to be discussed.

My understanding is that data like that will not be stored in the OEO but in the OEP database as one major use case of the OEO is to annotate the data in the OEP.

l-emele commented 4 years ago

Ping @christian-rli : Any thoughts?

akleinau commented 4 years ago

this discussion strayed a bit from the original topic. Hub height got discussed and implemented at the last dev meeting. This should have cleared up how classes are used in an ontology. So from that ontology perspective the explanation of @stap-m with generic classes number, time step, duration... is the common and easy one.

So again the proposition of @stap-m:

We need time series with

has quantity value start time and (optional?) end time
has quantity value number of time steps
has part some time step

We need time step with

has quantity value duration
optional has quantity value start time and end time --> and it lies in the responsibility of the users to not assign nonsense?

and the question for further relations needed for time step to make it completely identifiable?

stap-m commented 4 years ago

I put some more thought into this:

start time and end time sound either like process boundaries or zero-dim temporal regions. start time is a hmhmhm that indicates the beginning of a 1-dim temporal region ? And end time indicating the end, accordingly.
duration is a quantity value that can be related to both time step and time series. duration is a quantity value indicating the time span of a 1-dim temporal region, measured in a time unit. ?
time step: a time step is a 1-dim temporal region that has a start time and an end time and thus a finite duration. ?
time series: a time series is a data item that refers to a set of time steps or zero-dim temporal regions. ?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

If start time/end time were process boundaries, then there would have to be a process that is bounded, and that would "s-depends on" (whatever that means) "some material entity". Time series are clearly independent of material entities.
"Process" and "process boundaries" also have specific meanings within engineering (e.g. the boundary of a combustion process might be defined before or after condensation of exhaust). This collision of terms doesn't really affect the ontology, but might confuse potential users.

So nay to process boundary, yay to everything else.

akleinau commented 4 years ago

s-depends on means specifically-depends-on, which means the process can't exist without the material entity. But I agree, then they are zero-dim temporal regions.

akleinau commented 4 years ago

so relations to the defs above:

duration: is about some 1-dim temporal region and has unit some time unit
time step: has part some start time and has part some end time and has part some duration
time series: is about some (time step or zero-dim temporal region)

@stap-m if you agree I would implement? Should I wait for the 1.1 release before starting?

stap-m commented 4 years ago

I agree. Further:

Maybe better classify time series as data set (subclass of data item)?!
I'd add the same relations as for time step - they don't have to be used for annotation.

I am ok with implementing. But please leave this issue open. We didn't discuss time horizon and I guess time series is not yet finished.

carstenhoyerklick commented 4 years ago

Hi everybody, sorry for entering the discussion late, but we may also need to look at another term which may be important in this discussion, which is time stamp.

We discussed start time and end time, which is actually useful to make sure a time step is interpreted the right way. But many time series only have one piece of time information. Meteorologists in most cases use the end of the interval (as mentioned above). So meteorological values with a time stamp of 12:00h and a time step of 1h usually describe a value which is an integral or average of measurements between 11:01h and 12:00h (if it is minute data). Some data sets would use 11:30 for the 11-12h value, some use 11h. To be able to interpret a time series value, it is very important to know how the time interval is actually defined in the data set. To have complete information you need either:

start time and end time
time stamp, duration and a definition of time stamp (begin, middle, end, instantaneous)

The latter would apply if it is really an instantaneous measurement (e.g. if you only measure wind speed or radiation for a fraction of a second and don't care what happened in between).

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

To have a complete information you [need] time stamp, duration and a definition of time stamp (begin, middle,end, instantaneous)

Doesn't that all boil down to start time and end time?

Some data sets would use 11:30 for the 11-12h value, some use 11h.

To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by what ever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".

Concerning https://github.com/OpenEnergyPlatform/ontology/issues/362#issuecomment-693338836

Irradiation definitely needs a time step over which it has been recorded.

I would argue that an irradiation measurement must mention the time interval it has been integrated over, in order to be comparable (or most likely be found incomparable) to other measurements. But a "time interval" (generic duration of time that can happen any‑when in time; not a term in the ontology) is different from a time step, which is a specific region in time, defined by beginning and end.

carstenhoyerklick commented 4 years ago

I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Doesn't that all boil down to and ?

Some data sets would use 11:30 for the 11-12h value, some use 11h.

To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by what ever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".

In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.

🤔

In an ideal world yes, if we could force everyone to <start time> and <end time> 
that would be perfect. The second best ist let them define the <duration> and 
the <time stamp>, then it would be equivalent. So it is alternatives.

Don't make up your own HTML tags ;) (Or maybe tell your e-mail client not to.)

But we can force them, by structuring the ontology in that way! Data providers will have to annotate their data set in any case, and hopefully will do so programmatically. Extracting start time and end time is only marginally more elaborate than extracting time stamp and duration,

switch (timestamp_meaning) {
case TS_BEGIN:
    start_time = timestamp;
    end_time   = timestamp + duration;
    break;

case TS_END:
    start_time = timestamp - duration;
    end_time   = timestamp;
    break;

case TS_MIDDLE:
    start_time = timestamp - duration / 2;
    end_time   = timestamp + duration / 2;
    break;

case TS_INSTANTANEOUS:
    start_time = timestamp;
    end_time   = timestamp;
    break;
}

and anybody who can't manage that will fail several times over in other areas with the ontology.

(Arguing from a LOD-GEOSS perspective):
If we were to have multiple definitions of time steps (start time and end time [SE] or time stamp, duration, meaning of time stamp [TsDM]), there would have to be a conversion between them on the Databus in any case, in order for users to extract data in their preferred format. So the [TsDM] → [SE] conversion could also be used up front, allowing easy uploading to the Databus, and not burdening the ontology with two equivalent yet different definitions.
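For anyone who wants to try the [TsDM] → [SE] conversion, here is a runnable Python sketch of the same switch logic. The names `Alignment` and `to_start_end` are hypothetical, and the example date is made up.

```python
from datetime import datetime, timedelta
from enum import Enum


class Alignment(Enum):
    """Meaning of a time stamp relative to its time step."""
    BEGIN = "begin"
    MIDDLE = "middle"
    END = "end"
    INSTANTANEOUS = "instantaneous"


def to_start_end(timestamp, duration, alignment):
    """Convert (time stamp, duration, meaning) to (start time, end time)."""
    if alignment is Alignment.BEGIN:
        return timestamp, timestamp + duration
    if alignment is Alignment.END:
        return timestamp - duration, timestamp
    if alignment is Alignment.MIDDLE:
        half = duration / 2
        return timestamp - half, timestamp + half
    if alignment is Alignment.INSTANTANEOUS:
        return timestamp, timestamp
    raise ValueError(f"unknown alignment: {alignment!r}")


# The 11:30 label for the 11-12h value; the date is made up.
stamp = datetime(2020, 6, 1, 11, 30)
start, end = to_start_end(stamp, timedelta(hours=1), Alignment.MIDDLE)
print(start.time(), end.time())  # 11:00:00 12:00:00
```

The conversion is lossless in this direction only if the alignment is known; going back from [SE] to [TsDM] requires choosing an alignment.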

I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.

Good point. I yield to the expert ;)

carstenhoyerklick commented 4 years ago

Sorry I edited directly on GitHub, .. I misinterpreted the coding style.

I am mostly convinced, with one exception: we can enforce this for new data sets, but what do we do if we want to annotate old existing data sets? Convert them to start time and end time and republish them?

carstenhoyerklick commented 4 years ago

p.s. This also holds true for platforms that already regularly publish data, such as transparency platforms. We would need to force them to use different publishing formats.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Convert them to start time and end time and republish them?

You seem to assume some automatic connection between the data set and the ontology. As I understand it, all these connections have to be explicitly stated, so "old" and "new" makes no difference.

but what do we do if we want to annotate old existing data sets

What do these data sets look like? I don't know, but I imagine something like

geographic reference	time stamp	Irradiation
some grid square	10:30	some number
some grid square	11:30	some number
some grid square	12:30	some number
...	...	...

with meta data attached specifying that "time stamp" means "the middle of the time step" and "time step duration is 1 hour". Then there has to be a link detailing the connection between the data item "11:30" and the ontology element "a time step that has some part start time (11:01) and has some part end time (12:00)."

I'm not sure how this connection is to be made (see LOD-GEOSS Redmine, maybe @Ludee can clarify), but I don't see any difference between "old" and "new" data sets. If the data set had the format

geographic reference	start time	end time	Irradiation
some grid square	10:01	11:00	some number
some grid square	11:01	12:00	some number
some grid square	12:01	13:00	some number
...	...	...	...

there would still have to be a link saying "this specific line concerns a time step that has some part start time (11:01) and has some part end time (12:00)".

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Also holds true for platforms that already regularly publish data as transparency platforms. We would need to force them to different publishing formats.

Maybe this needs a wider discussion or explanation on an OEO dev call. My understanding (that might be incorrect) is that the ontology is agnostic to the format data is in, but used to annotate the meaning of the data.

But if different formats are needed, isn't (automatic) republishing in "better" formats a core feature of the Databus? ;)

carstenhoyerklick commented 4 years ago

To my understanding, we may need all the definitions to be able to annotate all the data, also before republishing it. To be able to interpret the time information, we may also need the time stamp and duration concepts for those who are not willing or able to include the start time and end time information in their data sets.

stap-m commented 4 years ago

time stamp definitely is a common term among energy system modelers, thus it should be part of the OEO. We had a discussion about time stamps when we implemented time series metadata for the OEP. For the metadata we solved it like this (here's the full example file):

"temporal": {
        "referenceDate": "2016-01-01",
        "timeseries": {
            "start": "2017-01-01T00:00+01",
            "end": "2017-12-31T23:00+01",
            "resolution": "1 h",
            "alignment": "left",
            "aggregationType": "sum"
        }
}

alignment means 11:01 (left) or 11:30 (centre) or 12:00 (right), referring to the above-mentioned example. aggregation type could be sum/integrated, mean, or instantaneous. Maybe this could help towards a solution.
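To show what such a metadata block implies for the individual time steps, here is a small Python sketch that expands the "timeseries" values above into explicit time stamps. It rests on two assumptions that the metadata snippet itself does not state: that "end" is the last time stamp (inclusive), and that "left" means each stamp marks the start of its time step. The function name `time_stamps` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

utc_plus_1 = timezone(timedelta(hours=1))  # the "+01" offset in the metadata

# Values copied from the "timeseries" block above, already parsed by hand.
start = datetime(2017, 1, 1, 0, 0, tzinfo=utc_plus_1)
end = datetime(2017, 12, 31, 23, 0, tzinfo=utc_plus_1)
resolution = timedelta(hours=1)  # "1 h"


def time_stamps(start, end, resolution):
    """Yield every time stamp from start to end, both inclusive."""
    current = start
    while current <= end:
        yield current
        current += resolution


stamps = list(time_stamps(start, end, resolution))

# Assuming "alignment": "left", each stamp marks the start of its time step.
first_step = (stamps[0], stamps[0] + resolution)
last_step = (stamps[-1], stamps[-1] + resolution)

print(len(stamps))  # 8760
```

Under these assumptions the last time step ends at the start of 2018, so the series covers the whole of 2017 in 8760 hourly steps.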

What's also missing is a concept for time standards like UTC, CET, ...

carstenhoyerklick commented 4 years ago

I think this is a good way, with alignment and aggregation type.

sfluegel05 commented 4 years ago

Currently we have (among others) the following axioms for time series:

has part some start time
has part some ending time

In order to make time stamp usable we should replace them with (has part some start time and has part some end time) or (has part some time stamp and has part some alignment)

We also need definitions for the classes:

time stamp: A time stamp is a zero-dimensional temporal region that is used to describe a time series.
alignment (maybe time stamp alignment would be more clear): An alignment is a data descriptor that indicates the position of a time stamp in a time series. We could add left alignment, centre alignment and right alignment as Individuals and make them Instances of alignment (this would be analogous to data format and its instances)

carstenhoyerklick commented 4 years ago

In general, it sounds very reasonable, only that the second option also needs a duration, so it would be something like ... or (has part some time stamp and has part some duration). Otherwise ´centre alignment´ and ´right alignment´ are undefined.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)
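Read as a completeness rule, the two alternatives could be checked like this. This is a plain Python sketch, not OWL reasoning: an OWL reasoner works under the open-world assumption and would not treat a missing part as a violation. The names `START_END`, `STAMP_BASED` and `is_fully_described` are made up for the example.

```python
# The two alternatives from the axiom above, as sets of required parts.
START_END = {"start time", "ending time"}
STAMP_BASED = {"time stamp", "duration", "alignment"}


def is_fully_described(parts):
    """True if the given parts satisfy either alternative of the axiom."""
    parts = set(parts)
    return START_END <= parts or STAMP_BASED <= parts


print(is_fully_described({"start time", "ending time"}))            # True
print(is_fully_described({"time stamp", "alignment"}))              # False
print(is_fully_described({"time stamp", "duration", "alignment"}))  # True
```

The second call fails because a time stamp with an alignment but no duration does not fix the other end of the time step.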

sfluegel05 commented 4 years ago

~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)

Yes, but also time series. We should add this relation for time step and time series.

akleinau commented 4 years ago

this issue has 42 comments. Maybe it is a good idea to define an upper limit like 30 comments, after which an issue should be discussed in a dev meeting as it got too complex?

carstenhoyerklick commented 4 years ago

I think, we are more or less done in this discussion. I think we may just call it to a close in the next dev meeting.

Ludee commented 4 years ago

[Image: OEO-TimeSeries]

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 4 years ago

Yes, but also time series. We should add this relation for time step and time series.

For start time and end time I agree. But does anybody use time stamps for entire time series?

sfluegel05 commented 4 years ago

For start time and end time I agree. But does anybody use time stamps for entire time series?

Maybe we can leave that part out at the moment. We can still implement it when we find someone who does use it.

After we discussed this issue in dev-meeting 10, I will implement the part concerning time stamp. Also, I suggest opening two new issues: one about aggregation, which is needed to describe a time step, and another about start time and end time, which don't work the way they should at the moment.
