Closed greenTara closed 8 years ago
It appears to me that a "Time Series" profile, where each RDF stream has exactly one timestamp predicate and the range of that predicate is xsd:dateTime, is a commonly occurring special case.
Can we consider the totally ordered time series as another fragment, or is it just a subclass of time series?
What is the property you have in mind for a totally ordered time series? Possibilities are:
Considering the algebraic properties of merge and union, these would be defined for Time Series that have the same timestamp predicate, so it would be a kind of sorted algebra. The first kind of "totally ordered" above would not affect this behavior. The second kind would rule out some merge/unions, in the case when two time series have a timestamp in common.
Another note regarding merge/union. It may be useful to distinguish
I was thinking of the case where the timestamp predicate is a one-to-one relation between stream items and time (is bijective) and the range of the relation is totally ordered. If I am correct, that implies both of the conditions you described. I think that condition 2 is too strong for the general time series class (while 1 makes sense to me), but it still denotes a relevant (sub-)class of streams.
I agree on the first profile of a single time series (totally ordered timestamps), but for the stream that results from applying stream operations (merge/union) to multiple streams, I think we should either relax the bijective condition to a surjective function or relax the totally ordered relation to a partially ordered one. (Maybe this is a second profile/subclass.)
(Edited)
OK, so here is an updated proposal. An "RDF time series" is an RDF Stream that:
An "RDF distinct time series" is an RDF time series such that no two elements in the time series have the same timestamp.
Note: when I talk about the "range of the timestamp predicate", I am referring to the definition of the timestamp predicate, as in the Turtle statement :p a rsp:TimestampPredicate ; rdfs:range xsd:dateTime . rather than the set of timestamps that occur in a particular stream. It is important to consider all possible values, because we want to characterize the merge/union operations over all possible streams of a particular subclass. In the case of RDF time series, we should be able to say that it is possible to merge any two RDF time series that use the same timestamp predicate and have disjoint graph names to get another RDF time series. For this to be possible, the entire range of the timestamp predicate must be totally ordered, not just the timestamps that occur in a particular stream.
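The merge requirement above can be sketched in Python. This is only an illustration, not part of the proposal: the representation of a stream element as a (graph_name, timestamp) pair and the function name merge_time_series are assumptions made for the example. Timezone-aware datetime values stand in for a totally ordered range, and the last lines show that Python refuses to compare a value without a timezone against one with a timezone, which is the kind of gap in the ordering that would make a merge ill-defined.

```python
from datetime import datetime, timedelta, timezone
from heapq import merge as heap_merge

# Hypothetical representation: a stream element is (graph_name, timestamp),
# and each input series is already sorted by timestamp.
def merge_time_series(a, b):
    """Merge two timestamp-sorted series into one timestamp-sorted series."""
    names_a = {name for name, _ in a}
    names_b = {name for name, _ in b}
    if names_a & names_b:
        raise ValueError("graph names must be disjoint")
    # heapq.merge needs every pair of timestamps to be comparable.
    return list(heap_merge(a, b, key=lambda item: item[1]))

utc = timezone.utc
cet = timezone(timedelta(hours=1))
s1 = [("g1", datetime(2016, 2, 23, 16, 0, tzinfo=utc)),
      ("g2", datetime(2016, 2, 23, 18, 0, tzinfo=utc))]
s2 = [("h1", datetime(2016, 2, 23, 18, 0, tzinfo=cet))]  # 17:00 UTC

merged_names = [name for name, _ in merge_time_series(s1, s2)]

# A timestamp without a timezone cannot be ordered against one with a timezone.
try:
    datetime(2016, 2, 23, 16, 0) < datetime(2016, 2, 23, 16, 0, tzinfo=utc)
    mixed_comparable = True
except TypeError:
    mixed_comparable = False
```

Timestamps in different timezones still compare correctly because aware datetime values are ordered by their UTC instant.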
Regarding surjective/bijective mappings, the item 3. of the time series definition makes the relation from graph names to timestamps functional (i.e. a mapping). The extra condition in the definition of distinct time series makes that mapping injective. Surjective I don't know - over what set of values might one want the time series to be surjective (onto)? It is trivially surjective over its own set of timestamps. A further subclass might be distinct time series with a particular duration between timestamps (regular time series).
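As a rough illustration of the two conditions discussed above, the following Python sketch checks whether a set of (graph_name, timestamp) pairs is functional (the time series condition) and whether it is also injective (the extra distinct time series condition). The pair representation and both function names are assumptions made for this example only.

```python
def is_functional(pairs):
    """True if every graph name is associated with exactly one timestamp."""
    seen = {}
    for name, ts in pairs:
        # setdefault stores ts on first sight and returns the stored value after.
        if seen.setdefault(name, ts) != ts:
            return False
    return True

def is_injective(pairs):
    """True if no two different graph names share a timestamp."""
    owner = {}
    for name, ts in pairs:
        if owner.setdefault(ts, name) != name:
            return False
    return True

# Plain strings stand in for timestamps here; only equality matters.
series = [("g1", "t1"), ("g2", "t1"), ("g3", "t2")]
distinct_series = [("g1", "t1"), ("g2", "t2")]
not_functional = [("g1", "t1"), ("g1", "t2")]
```

Under these assumptions, series would be a time series but not a distinct one, because g1 and g2 share the timestamp t1.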
the range of the timestamp predicate is timezoned value objects of xsd:dateTime
this reads as if it could just require xsd:dateTimeStamp (https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp)
Thanks, I didn't know about that derived datatype. Perfect.
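For readers unfamiliar with the difference, here is a small Python sketch of the one property that separates xsd:dateTimeStamp from xsd:dateTime, namely that the timezone is required. The function name has_explicit_timezone is hypothetical, and datetime.fromisoformat accepts only part of the XSD lexical space, so this is an approximation and not a validator.

```python
from datetime import datetime

def has_explicit_timezone(lexical):
    """Approximate check that a dateTime lexical form carries a timezone."""
    # Older Python versions do not parse a trailing "Z", so normalize it.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    return datetime.fromisoformat(lexical).tzinfo is not None

with_z = has_explicit_timezone("2016-02-23T16:03:00Z")
with_offset = has_explicit_timezone("2016-02-23T16:03:00+01:00")
without_zone = has_explicit_timezone("2016-02-23T16:03:00")
```

Only the first two strings would qualify as xsd:dateTimeStamp values under this check.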
I edited the original.
See pull request https://github.com/streamreasoning/RSP-QL/pull/59
In the pull request, there are some subprofiles defined which introduce additional properties where there are no duplicate timestamps in the stream and where the timestamps are equally spaced. We should agree on the terminology of these subprofiles.
Regarding names of the profiles, this is relevant. https://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.tms.doc/ids_tms_018.htm
Here is a usage of "synchronous data streams" which is about the relationship between the streams, not a property of a single stream. But still related to the modifier "synchronous". http://www.seas.upenn.edu/~sudipto/mypapers/kdd.pdf
Googling gives me no relevant links for distinct time series, which gives the advantage that there is no existing meaning for this phrase.
was the second link intended to be to something other than the same ibm knowledge center page as the previous one?
i also find only insubstantial use of the "synchronous" term. one significant characteristic is that the effective timestamps are synchronized with some external time source. this means that in this context, the notion of the synchronous relation between two otherwise autonomous streams cannot apply as that, taken to the extreme, requires no absolute time location and no quantitatively defined interval.
Indeed, the second link was meant to be something else. I have edited.
that second paper is not useful as it uses "synchronous" to characterise streams which are both correlated and synchronized with an external clock, but takes no care to set the two features apart even though its early text is clear that their principal concern is the correlation.
Agreed, "synchronous" would seem to be available for us to use and set a meaning that suits us, and we could use that instead of "distinct" to indicate that each element of the stream must strictly follow the previous one, as in a synchronous computation, if that is preferred.
"Regular" seems to me to be best suited for the case when the timestamp can be serialized as an integer based on the assumption of a reference time and a time offset. It would still be okay to have an arbitrary number of stream elements associated with each integer, which allows the case of repetition of time stamp and also the important case of missing elements.
"Regular synchronous" would almost be the final subprofile that you describe, except that it would still allow missing values.
How about "perfect time series" for the case when there is exactly one stream element for each time stamp on a regular grid?
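To make the candidate terms concrete, here is a Python sketch that classifies a list of timestamps against a grid defined by a reference time and a step. It is an illustration under stated assumptions: the function names and the label strings are invented for this example, "regular distinct" stands for what is called "regular synchronous" above, and "perfect" is read as having no gaps between the first and last observed grid points.

```python
from datetime import datetime, timedelta, timezone

def grid_indices(timestamps, reference, step):
    """Return integer grid positions, or None if any timestamp is off the grid."""
    indices = []
    for ts in timestamps:
        quotient, remainder = divmod(ts - reference, step)
        if remainder != timedelta(0):
            return None
        indices.append(quotient)
    return indices

def classify(timestamps, reference, step):
    indices = grid_indices(timestamps, reference, step)
    if indices is None:
        return "not regular"
    if len(set(indices)) != len(indices):
        return "regular"            # on the grid, repeated timestamps allowed
    if not indices or max(indices) - min(indices) + 1 == len(indices):
        return "perfect"            # one element per grid point, no gaps
    return "regular distinct"       # no repeats, but missing elements allowed

ref = datetime(2016, 4, 4, tzinfo=timezone.utc)
step = timedelta(minutes=1)
```

For example, classify([ref, ref + 2 * step], ref, step) falls in the "regular distinct" case because the grid point at ref + step is missing.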
these provide useful insight into customary meanings for these words
it indicates that "synchronous" is not entirely freely available and that it implies correlation.
it is difficult to compare this usage to that in earlier comments, because the original concerns "streams" in which there can be just one event at any given location at both the abstract and the concrete levels, while rdf streams permit distinct events at the concrete level to correspond to the same abstract location. notwithstanding which, one possible interpretation is that, on the abstract level,
On further reflection, I think "synchronous" is not the best term because of the potential for confusion with synchronization over multiple streams. Is the original proposal of "distinct" (analogous to the SQL DISTINCT operation that eliminates duplicates) an acceptable alternative?
I agree that 'synchronous' can be a bit misleading
In order to have a concrete proposal for the call, I am going to now make a commit where the terms "distinct" and "regular" are used.
Certain use cases or application domains do not need the full generality of the RDF stream definition, and so may be able to implement more efficient reasoning methods when the input is confined to some subclass of RDF streams. It is common to call such subclasses "profiles" (e.g. the OWL profiles RL, EL, QL). A new section of the Abstract Syntax and Semantics document should be devoted to defining and naming some important profiles.