Closed akleinau closed 4 years ago
delete to leave out in first release
I don't like the term temporal region. I just googled it to see if that's something you can say in English, and it's actually a term for a part of the brain ;-)
We could maybe use model property or model characteristic instead.
So the def would be:
A time step is a model property that describes the time period between two calculations or measurements made.
etc.
temporal region is an already implemented class of the BFO with the definition: "A temporal region is an occurrent entity that is part of time as defined relative to some reference frame". In my opinion this fits better than model property, as it's already there and more specific about the main aspect of this concept, the description of some time period.
I agree, classes time step and time horizon should be classified as 1-dimensional temporal regions. And they definitely need to be related to models / scenarios / time series.
Also, time step needs to be related to a quantity value (e.g. time step = 15 min).
on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time
Also, time step needs to be related to a quantity value (e.g. time step = 15 min).
It is just a description of a portion of time, no fixed moment in time
I think there are two perspectives on time steps: on the one hand a description of a model or scenario (e.g. "how many time steps of what duration does this model calculate?"), and on the other hand the description of data (e.g. "to which time step does this specific datum belong?"). Since both need to be accommodated, the model description either needs to list all time steps that are calculated -- tedious for 35040 15-minute time-slices to a year -- or we need two different "time step" classes.
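To make the size difference between the two perspectives concrete, here is a minimal sketch. The year 2019 is an arbitrary non-leap year chosen for illustration, and the variable names are hypothetical, not OEO terms.

```python
from datetime import datetime, timedelta, timezone

# Compact "model perspective": a step count plus one shared duration.
step = timedelta(minutes=15)
year_start = datetime(2019, 1, 1, tzinfo=timezone.utc)  # arbitrary non-leap year
year_end = datetime(2020, 1, 1, tzinfo=timezone.utc)
n_steps = (year_end - year_start) // step

# Explicit "data perspective": one (start, end) pair per time step.
explicit = [
    (year_start + i * step, year_start + (i + 1) * step)
    for i in range(n_steps)
]
```

The compact form needs two values, the explicit one needs one entry per step, which is where the 35040 comes from.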
For the "data perspective", a time step is defined by start and end, because it is a fixed moment in time (or rather, a region of time). The denomination of the time steps is arbitrary and differs between fields. E.g. in meteorology, observations are designated by the end-time of the time interval (so for 15-minute time steps, 08:15 would be the observations between 08:00 and 08:15, if I remember this discussion correctly @carstenhoyerklick), while the IPCC data sets for the Assessment Reports use the mid-point (so 2035 refers to data from the beginning of 2033 to the end of 2037). So time step [data perspective] would have two attributes:
- start time
- end time
I don't know what the crucial characteristics for the "model perspective" are. I guess number and size of time steps. I just want to point out that these need not be homogeneous. Our model uses five-, ten- and twenty-year time steps at the same time (because it is computationally cheaper and the temporal resolution is important for the short term, but not so much when you look 100 years out).
Related to OpenEnergyPlatform/ontology#474.
These are two perspectives, yes, one that looks at a single time step and one that looks at all time steps in a model. The underlying concept of „timestep“ remains the same though and should be treated as one concept. That one can have the start and end time you proposed, which are used directly for your data perspective. It can also have a duration property, so to describe your model perspective we can just relate the model to the time steps, e.g. „model has x instances of the timestep class“, and those instances have y duration.
So
on further thinking I think time step should actually be a quantity value? It is just a description of a portion of time, no fixed moment in time
No, I think time step itself is still a 1-dimensional temporal region. But we should add further relations and quantity values. For example:
- time step has quantity value duration
- time step has quantity value start time and end time
- one class: Timestep with start, end, duration
This makes me cringe a little, because duration would be derivative to start time and end time, allowing for inconsistent definitions. Would there be a "generic" time step, that only has duration, but no position in time? What would be the use case for it?
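The inconsistency concern can be illustrated with a small sketch. The class and function names are hypothetical and only serve the illustration: when duration is derived from start and end it cannot disagree with them, whereas three independently stored values can.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeStep:
    """Hypothetical time step that stores only start and end."""
    start_time: datetime
    end_time: datetime

    @property
    def duration(self) -> timedelta:
        # Derived on demand, so it always matches start and end.
        return self.end_time - self.start_time


def is_consistent(start: datetime, end: datetime, duration: timedelta) -> bool:
    """Check three independently stored values against each other."""
    return end - start == duration


start = datetime(2020, 1, 1, 8, 0, tzinfo=timezone.utc)
end = datetime(2020, 1, 1, 8, 15, tzinfo=timezone.utc)
step = TimeStep(start, end)
```

Storing all three values is still possible, but then something like `is_consistent` has to be checked by whoever annotates the data.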
The use case would be that, instead of giving start and end time of every timestep used in a model when looking at the model view, we can just state the duration (that is typically the same for all). Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?
instead of giving start and end time of every timestep used in a model when looking at the model view we can just state the duration
One doesn't need a class for that. One could also just give the model number of time steps and duration of time step attributes.
(that is typically the same for all)
This domain ontology is a joint effort to represent the typical energy-system modelling context based on standard terminologies used by human experts in this field of research.
Fixed. 😎 But probably it makes sense to allow models to either just state the number and duration of time steps, or give a comprehensive list. FYI: there are power sector models using different time step setups depending on the data they run on.
Yes, I thought about the inconsistency problem too. Maybe just include start and duration? So leave end out?
I'm not sure if this has to be addressed in the ontology. Users will always find a way to define nonsensical data. If a "generic time step" having only duration proves useful, then go for it and make the attributes optional. If not, personally, I would default to start time and end time, but in the end it doesn't matter.
Ok, here comes a differentiation to not mix up time series and time step:
We need time series with
- has quantity value start time and (optional?) end time
- has quantity value number of time steps
- has part some time step
We need time step with
- has quantity value duration
- optional has quantity value start time and end time
--> and it lies in the responsibility of the users to not assign nonsense? Do you agree?
Do you agree?
I for one do in principle agree, but this is not sufficient.
tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.
- has quantity value start time and (optional?) end time
- has quantity value number of time steps
If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.
A concrete example: the REMIND model uses 19 time steps that vary between five and 20 years in duration. We don't care about the precise start and end times and durations in our work, especially not down to the second, but this would be the formal definition.
number | period | start time | end time | duration |
---|---|---|---|---|
1 | 2005 | 2003-01-01 00:00:00 UTC | 2008-01-01 00:00:00 UTC | 5 years |
2 | 2010 | 2008-01-01 00:00:00 UTC | 2013-01-01 00:00:00 UTC | 5 years |
3 | 2015 | 2013-01-01 00:00:00 UTC | 2018-01-01 00:00:00 UTC | 5 years |
4 | 2020 | 2018-01-01 00:00:00 UTC | 2023-01-01 00:00:00 UTC | 5 years |
5 | 2025 | 2023-01-01 00:00:00 UTC | 2028-01-01 00:00:00 UTC | 5 years |
6 | 2030 | 2028-01-01 00:00:00 UTC | 2033-01-01 00:00:00 UTC | 5 years |
7 | 2035 | 2033-01-01 00:00:00 UTC | 2038-01-01 00:00:00 UTC | 5 years |
8 | 2040 | 2038-01-01 00:00:00 UTC | 2043-01-01 00:00:00 UTC | 5 years |
9 | 2045 | 2043-01-01 00:00:00 UTC | 2048-01-01 00:00:00 UTC | 5 years |
10 | 2050 | 2048-01-01 00:00:00 UTC | 2053-01-01 00:00:00 UTC | 5 years |
11 | 2055 | 2053-01-01 00:00:00 UTC | 2058-01-01 00:00:00 UTC | 5 years |
12 | 2060 | 2058-01-01 00:00:00 UTC | 2065-07-02 12:00:00 UTC | 7.5 years |
13 | 2070 | 2065-07-02 12:00:00 UTC | 2075-07-02 12:00:00 UTC | 10 years |
14 | 2080 | 2075-07-02 12:00:00 UTC | 2085-07-02 12:00:00 UTC | 10 years |
15 | 2090 | 2085-07-02 12:00:00 UTC | 2095-07-02 12:00:00 UTC | 10 years |
16 | 2100 | 2095-07-02 12:00:00 UTC | 2105-07-02 12:00:00 UTC | 10 years |
17 | 2110 | 2105-07-02 12:00:00 UTC | 2120-07-02 12:00:00 UTC | 15 years |
18 | 2130 | 2120-07-02 12:00:00 UTC | 2140-07-02 12:00:00 UTC | 20 years |
19 | 2150 | 2140-07-02 12:00:00 UTC | 2167-07-02 12:00:00 UTC | 27 years |
Other IAMs do this differently. E.g.
The carbon price for a given model year t is usually assumed to be constant over the length of the time step Δt (either from time t-1 to t or from t-Δt/2 to t+Δt/2, depending on the model).
(From the Model Diagnostic Exercise – Study Protocol of the ADVANCE Project)
The Assessment Report data of the IPCC on the other hand is not specific about what a time step like 2030 actually refers to. But one interpretation (held by the people at IIASA, who are hosting the data) is that it denotes that specific year, in which case the time series is not continuous, but has ten-year gaps in-between:
… 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 …
Practically, it makes little difference and is interpolated away. But the ontology must be able to represent it.
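The two properties discussed here can be checked mechanically once every time step carries its own start and end. The following is only a sketch with hypothetical function names; the first example list is an excerpt of the REMIND table above (rows 11 to 13), the second follows the "single year" reading of the IPCC data.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def is_continuous(steps):
    """True if every step starts exactly where the previous one ends."""
    return all(
        prev_end == next_start
        for (_, prev_end), (next_start, _) in zip(steps, steps[1:])
    )


def is_homogeneous(steps):
    """True if all steps have the same duration."""
    return len({end - start for start, end in steps}) <= 1


# Rows 11 to 13 of the REMIND table: no gaps, but differing durations.
remind_excerpt = [
    (datetime(2053, 1, 1, tzinfo=UTC), datetime(2058, 1, 1, tzinfo=UTC)),
    (datetime(2058, 1, 1, tzinfo=UTC), datetime(2065, 7, 2, 12, tzinfo=UTC)),
    (datetime(2065, 7, 2, 12, tzinfo=UTC), datetime(2075, 7, 2, 12, tzinfo=UTC)),
]

# "2030" and "2040" read as single calendar years: gaps in between.
single_years = [
    (datetime(2030, 1, 1, tzinfo=UTC), datetime(2031, 1, 1, tzinfo=UTC)),
    (datetime(2040, 1, 1, tzinfo=UTC), datetime(2041, 1, 1, tzinfo=UTC)),
]
```

Neither list could be described by a single start time, a step count and one duration.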
We need time step with
- has quantity value duration
- optional has quantity value start time and end time
I still don't see the use for the generic time step, but sure.
--> and it lies in the responsibility of the users to not assign nonsense?
Sure.
tl;dr: Time series are not necessarily homogeneous (composed of time steps of identical duration), nor continuous. Therefore a time series is not always completely identified by start time, number of time steps, and time step duration.
I agree. Any ideas for further relations?
If "has part some time step" links to only one "generic" time step (with duration, but without start time or end time), this presupposes homogeneous time steps, which is not always the case.
Indeed. If the relation is has part some time step, then several, also inhomogeneous, time steps could be assigned. Please correct me if I am getting this wrong @akleinau.
I still don't see the use for the generic time step, but sure.
What would be your choice?
What would be your choice?
One could also just give the model number of time steps and duration of time step attributes.
One could also just give the model number of time steps and duration of time step attributes.
So-called "attributes" (not really an ontological term, though) are also implemented as classes (often as dependent continuants) that are related via properties (e.g. has quantity value, has part, ...) to an independent continuant. See also wiki and OpenEnergyPlatform/oeo-extended#5.
I don't know if it's even possible to implement one class number of time steps. How to classify? "number of" is dependent on something, e.g. time steps. Same for "duration of".
And, if you have generic classes number, time step, duration, ... you can also reuse them for other purposes.
How do you plan on representing the hub height of a wind turbine (to use a popular example from the dev meetings)? Is there going to be a class for it, with subclasses for every possible hub height?
Is there going to be a class for it, with subclasses for every possible hub height?
Yes, there is going to be a class hub height, defining the concept of hub height.
But instead of having subclasses (or rather instances) of possible hub height values, this class hub height would be related via has quantity value to a quantity value length value (or height value or however called) which is related per definition to a value (data property, e.g. 140) and a unit (e.g. metre).
Whether the value instances of length value will be stored within the OEO or not has to be discussed. I can't answer that yet. But I'll put this question on the agenda for the developer meeting on Thursday.
Whether the value instances of length value will be stored within the OEO or not has to be discussed.
My understanding is that data like that will not be stored in the OEO but in the OEP database as one major use case of the OEO is to annotate the data in the OEP.
Ping @christian-rli : Any thoughts?
This discussion strayed away a bit from the original topic. Hub height got discussed and implemented at the last dev meeting. This should have cleared up how classes are used in an ontology. So from that ontology perspective the explanation of @stap-m with generic classes number, time step, duration... is the common and easy one.
So again the proposition of @stap-m:
We need time series with
- has quantity value start time and (optional?) end time
- has quantity value number of time steps
- has part some time step
We need time step with
- has quantity value duration
- optional has quantity value start time and end time
--> and it lies in the responsibility of the users to not assign nonsense?
and the question for further relations needed for time step to make it completely identifiable?
I put some more thought into this:
- start time and end time sound either like process boundaries or zero-dim temporal regions. start time is a hmhmhm that indicates the beginning of a 1-dim temporal region? And end time indicating the end, accordingly.
- duration is a quantity value that can be related to both time step and time series. duration is a quantity value indicating the time span of a 1-dim temporal region, measured in a time unit?
- time step: a time step is a 1-dim temporal region that has a start time and an end time and thus a finite duration?
- time series: a time series is a data item that refers to a set of time steps or zero-dim temporal regions?
If start time/end time were process boundaries, then there would have to be a process that is bound, and that would "s-depends on" (whatever that means) "some material entity". Time series are clearly independent of material entities. So nay to process boundary, yay to everything else.
s-depends on means specifically-depends-on, which means the process can't exist without the material entity. But I agree, then they are zero-dim temporal regions.
so relations to the defs above:
is about some 1-dim temporal region
and has unit some time unit
has part some start time
and has part some end time
and has part some duration
is about some (time step or zero-dim temporal region)
@stap-m if you agree I would implement? Should I wait for the 1.1 release before starting?
I agree. Further:
time series
as data set
(subclass of data item
)?! time step
- they don't have to be used for annotation. I am ok with implementing. But please leave this issue open. We didn't discuss time horizon
and I guess time series
is not yet finished.
Hi everybody, sorry for entering the discussion late, but we may also need to look at another term which may be important in this discussion, which is time stamp.
We discussed start time and end time, which is actually useful to make sure a time step is interpreted the right way. But many time series only have one time information. Meteorologists use in most cases the end of the interval (as mentioned above). So meteorological values with a time stamp of 12:00h and a time step of 1h usually describe a value which is an integral or average from measurements between 11:01h and 12:00h (if it is minute data). Some data sets would use 11:30 for the 11-12h value, some use 11h. To be able to interpret a time series value, it is very important to know how the time interval is actually defined in the data set. To have complete information you either need:
The latter would apply if it is really an instantaneous measurement (e.g. if you would only measure wind speed or radiation for a fraction of a second and don't care what happened in between).
To have a complete information you [need] time stamp, duration and a definition of time stamp (begin, middle, end, instantaneous)
Doesn't that all boil down to start time and end time?
Some data sets would use 11:30 for the 11-12h value, some use 11h.
To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by whatever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".
Concerning https://github.com/OpenEnergyPlatform/ontology/issues/362#issuecomment-693338836
Irradiation definitely needs a time step over which it has been recorded.
I would argue that an irradiation measurement must mention the time interval it has been integrated over, in order to be comparable (or most likely be found incomparable) to other measurements. But a "time interval" (generic duration of time that can happen any‑when in time; not a term in the ontology) is different from a time step, which is a specific region in time, defined by beginning and end.
I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.
Doesn't that all boil down to start time and end time? Some data sets would use 11:30 for the 11-12h value, some use 11h.
To me that's just a label that is attached to the time step in the data set. It is up to the data provider to (by whatever means that should be done at some point) describe the "11:30" item in the data set to mean "a time step that has some part start time (11:01) and has some part end time (12:00)".
In an ideal world yes, if we could force everyone to and that would be perfect. The second best ist let them define the and the , then it would be equivalent. So it is alternatives.
🤔
In an ideal world yes, if we could force everyone to <start time> and <end time>
that would be perfect. The second best is to let them define the <duration> and
the <time stamp>, then it would be equivalent. So it is alternatives.
Don't make up your own HTML tags ;) (Or maybe tell your e-mail client not to.)
But we can force them, by structuring the ontology in that way! Data providers will have to annotate their data set in any case, and hopefully will do so programmatically. Extracting start time and end time is only marginally more elaborate than extracting time stamp and duration,

    /* timestamp_meaning says where the time stamp sits within its time step */
    switch (timestamp_meaning) {
    case TS_BEGIN:
        start_time = timestamp;
        end_time = timestamp + duration;
        break;
    case TS_END:
        start_time = timestamp - duration;
        end_time = timestamp;
        break;
    case TS_MIDDLE:
        start_time = timestamp - duration / 2;
        end_time = timestamp + duration / 2;
        break;
    case TS_INSTANTANEOUS:
        /* a zero-length time step */
        start_time = timestamp;
        end_time = timestamp;
        break;
    }

and anybody who can't manage that will fail several times over in other areas with the ontology.
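For data providers who annotate programmatically, the same conversion could look roughly like this in Python. This is a sketch only: the enum and function names are made up for illustration and are not part of the OEO or the Databus.

```python
from datetime import datetime, timedelta
from enum import Enum


class TimestampMeaning(Enum):
    """Hypothetical labels for where a time stamp sits in its time step."""
    BEGIN = "begin"
    END = "end"
    MIDDLE = "middle"
    INSTANTANEOUS = "instantaneous"


def to_start_end(timestamp: datetime, duration: timedelta,
                 meaning: TimestampMeaning):
    """Convert (time stamp, duration, meaning) into (start time, end time)."""
    if meaning is TimestampMeaning.BEGIN:
        return timestamp, timestamp + duration
    if meaning is TimestampMeaning.END:
        return timestamp - duration, timestamp
    if meaning is TimestampMeaning.MIDDLE:
        return timestamp - duration / 2, timestamp + duration / 2
    if meaning is TimestampMeaning.INSTANTANEOUS:
        # A zero-length time step.
        return timestamp, timestamp
    raise ValueError(f"unknown time stamp meaning: {meaning!r}")


# The meteorological example: a 12:00 stamp marking the end of a 1 h step.
noon = datetime(2020, 6, 1, 12, 0)
hourly_end = to_start_end(noon, timedelta(hours=1), TimestampMeaning.END)
```

Note that this yields 11:00 as the start, not 11:01; whether the first minute belongs to the step is a convention of the data set and would have to be stated separately.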
If there were two definitions of time steps (start time and end time [SE] or time stamp, duration, meaning of time stamp [TsDM]), there would have to be a conversion between them on the Databus in any case, in order for users to extract data in their preferred format. So the [TsDM] → [SE] conversion could also be used up front, allowing easy uploading to the Databus, and not burdening the ontology with two equivalent yet different definitions.
I would go with the time step, as especially for irradiation you need to know when this time interval was or is, e.g. if you need to relate it to solar geometry.
Good point. I yield to the expert ;)
Sorry I edited directly on GitHub, .. I misinterpreted the coding style.
I am mostly convinced, with the only exception: we can force it for new data sets, but what do we do if we want to annotate old existing data sets? Convert them to start time and end time and republish them?
p.s. This also holds true for platforms that already regularly publish data, such as transparency platforms. We would need to force them to use different publishing formats.
Convert them to start time and end time and republish them?
You seem to assume some automatic connection between the data set and the ontology. As I understand it, all these connections have to be explicitly stated, so "old" and "new" makes no difference.
but what do we do if want to annotate old existing data sets
What do these data sets look like? I don't know, but I imagine something like
geographic reference | time stamp | Irradiation |
---|---|---|
some grid square | 10:30 | some number |
some grid square | 11:30 | some number |
some grid square | 12:30 | some number |
... | ... | ... |
with meta data attached specifying that "time stamp" means "the middle of the time step" and "time step duration is 1 hour". Then there has to be a link detailing the connection between the data item "11:30" and the ontology element "a time step that has some part start time (11:01) and has some part end time (12:00)."
I'm not sure how this connection is to be made (see LOD-GEOSS Redmine, maybe @Ludee can clarify), but I don't see any difference between "old" and "new" data sets. If the data set had the format
geographic reference | start time | end time | Irradiation |
---|---|---|---|
some grid square | 10:01 | 11:00 | some number |
some grid square | 11:01 | 12:00 | some number |
some grid square | 12:01 | 13:00 | some number |
... | ... | ... | ... |
there would still have to be a link saying "this specific line concerns a time step that has some part start time (11:01) and has some part end time (12:00)."
Also holds true for platforms that already regularly publish data, such as transparency platforms. We would need to force them to use different publishing formats.
Maybe this might need a wider discussion or explanation on a OEO dev call. My understanding (that might be incorrect) is that the ontology is agnostic to the format data is in, but used to annotate the meaning of the data.
But if different formats are needed, isn't (automatic) republishing in "better" formats a core feature of the Databus? ;)
To my understanding, we may need all the definitions to be able to annotate all the data, also before republishing it. To be able to interpret the time information, we may also need the time stamp and duration concepts for those who are not willing or able to include the start time and end time information in their data sets.
time stamp definitely is a common term among energy system modelers, thus it should be part of the OEO.
We had a discussion about time stamps when we implemented time series metadata for the OEP. For the metadata we solved it like this (here's the full example file):

    "temporal": {
        "referenceDate": "2016-01-01",
        "timeseries": {
            "start": "2017-01-01T00:00+01",
            "end": "2017-12-31T23:00+01",
            "resolution": "1 h",
            "alignment": "left",
            "aggregationType": "sum"
        }
    }

alignment means 11:01 (left) or 11:30 (centre) or 12:00 (right), referring to the above mentioned example.
aggregation type could be sum/integrated, mean, instantaneous
Maybe this could help for a solution.
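As a rough illustration of how such a metadata block could be expanded into explicit time steps, here is a sketch. It rests on several assumptions that are not stated in the metadata itself: that "end" is the last time stamp of the series, that offsets are whole hours written as "+HH", that the resolution is always given in hours, and that alignment gives the position of the stamp within its step. All helper names are hypothetical.

```python
from datetime import datetime, timedelta

timeseries_meta = {
    "start": "2017-01-01T00:00+01",
    "end": "2017-12-31T23:00+01",
    "resolution": "1 h",
    "alignment": "left",
    "aggregationType": "sum",
}


def parse_time(text: str) -> datetime:
    # Assumption: the offset is written as "+HH"; pad it to "+HH:00",
    # which datetime.fromisoformat accepts.
    return datetime.fromisoformat(text + ":00")


def parse_resolution(text: str) -> timedelta:
    # Assumption: only "<n> h" is handled in this sketch.
    value, unit = text.split()
    if unit != "h":
        raise ValueError(f"unsupported resolution unit: {unit!r}")
    return timedelta(hours=int(value))


# Fraction of the step that lies before the time stamp (assumed reading).
STAMP_POSITION = {"left": 0.0, "centre": 0.5, "right": 1.0}


def expand(meta: dict) -> list:
    """Return one (start time, end time) pair per time stamp."""
    first = parse_time(meta["start"])
    last = parse_time(meta["end"])
    step = parse_resolution(meta["resolution"])
    shift = step * STAMP_POSITION[meta["alignment"]]
    count = (last - first) // step + 1
    return [
        (first + i * step - shift, first + (i + 1) * step - shift)
        for i in range(count)
    ]


steps = expand(timeseries_meta)
```

With "left" alignment the first step starts at the first time stamp; with "right" it would end there.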
What's also missing is a concept for time standards like UTC, CET, ...
I think this is a good way with alignment and aggregation type.
Currently we have (among others) the following axioms for time series:
- has part some start time
- has part some ending time
In order to make time stamp usable we should replace them with (has part some start time and has part some end time) or (has part some time stamp and has part some alignment)
We also need definitions for the classes:
- time stamp: A time stamp is a zero-dimensional temporal region that is used to describe a time series.
- alignment (maybe time stamp alignment would be more clear): An alignment is a data descriptor that indicates the position of a time stamp in a time series.
We could add left alignment, centre alignment and right alignment as Individuals and make them Instances of alignment (this would be analogous to data format and its instances).
In general, it sounds very reasonable, only that the second option also needs a duration, so it would be something like ... or (has part some time stamp and has part some duration). Otherwise centre alignment and right alignment are undefined.
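Read as a data check, the proposed "either/or" axiom with duration added could be sketched like this. The dictionary keys are hypothetical field names chosen for the example, not OEO identifiers.

```python
def is_fully_specified(step: dict) -> bool:
    """True if a time step record satisfies one of the two proposed options.

    Option 1: start time and end time.
    Option 2: time stamp, duration and alignment.
    """
    def has(*keys):
        return all(step.get(key) is not None for key in keys)

    return has("start_time", "end_time") or has(
        "time_stamp", "duration", "alignment"
    )


by_bounds = {"start_time": "11:00", "end_time": "12:00"}
by_stamp = {"time_stamp": "11:30", "duration": "1 h", "alignment": "centre"}
stamp_only = {"time_stamp": "11:30", "alignment": "centre"}
```

The third record shows the case raised above: without a duration, a centre-aligned time stamp does not determine where the step begins or ends.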
~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)
~time series~ time step (has part some start time and has part some ending time) or (has part some time stamp and has part some duration and has part some alignment)
Yes, but also time series. We should add this relation for time step and time series.
this issue has 42 comments. Maybe it is a good idea to define an upper limit like 30 comments, after which an issue should be discussed in a dev meeting as it got too complex?
I think, we are more or less done in this discussion. I think we may just call it to a close in the next dev meeting.
Yes, but also time series. We should add this relation for time step and time series.
For start time and end time I agree. But does anybody use time stamps for entire time series?
For start time and end time I agree. But does anybody use time stamps for entire time series?
Maybe we can leave that part out at the moment. We can still implement it when we find someone who does use it.
After we discussed this issue in dev-meeting 10, I will implement the part concerning time stamp.
Also, I suggest to open two new issues, one about aggregation which is needed to describe a time step and another about start time and end time which don't work the way they should at the moment.
Description of the issue
TimeStep, TimeHorizon and TimeSeries are currently variables and without a definition.
Ideas of solution
TimeStep: A TimeStep is a temporal region (?) stating the time between two calculations or measurements made.
TimeHorizon: A TimeHorizon is a temporal region (?) stating a specific point in time at which specific events will be reviewed or should end.
TimeSeries: A TimeSeries is a data set storing data indexed by time.