Open durple opened 10 years ago
I think this has different answers depending on the dimensionality of the series.
In 1D (like views on a page) you could make a case that volume is a good indicator of usefulness, or maybe variance? In >1D I bet covariance would be a good starting point.
A useless time series then is one that is always zero, or more generally always the same.
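A rough sketch of the 1D case, assuming a series is just a list of per-bucket counts (the function names and sample data here are made up for illustration, not anything Tick does):

```python
from statistics import pvariance

def volume(series):
    """Total count across all time buckets."""
    return sum(series)

def is_constant(series):
    """True when every bucket holds the same value, i.e. zero variance."""
    # pvariance raises on empty input, so treat an empty series as constant.
    return len(series) == 0 or pvariance(series) == 0

# Hypothetical series for illustration only.
page_views = [3, 0, 7, 2, 9]
always_zero = [0, 0, 0, 0, 0]
always_five = [5, 5, 5, 5, 5]

for name, s in [("page_views", page_views),
                ("always_zero", always_zero),
                ("always_five", always_five)]:
    print(name, "volume:", volume(s), "constant:", is_constant(s))
```

Note that `always_five` has plenty of volume but is still flagged as constant, which is why volume alone might not be enough.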
Hey, also, I bet there is a proper answer to this in terms of information content. Like, a useful time series is one that is hard to compress, i.e. has high entropy, etc. There's a lovely green book on my desk by MacKay that has opinions...
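To make the information-content idea a bit more concrete, here is a minimal sketch using two crude proxies: the empirical Shannon entropy of the values (which ignores their order) and the zlib-compressed size of a plain text encoding. Both the proxies and the sample series are assumptions for illustration.

```python
import math
import random
import zlib
from collections import Counter

def shannon_entropy(series):
    """Empirical entropy, in bits per sample, of the values in the series."""
    counts = Counter(series)
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def compressed_size(series):
    """Bytes zlib needs for a comma-separated text encoding of the series."""
    raw = ",".join(str(v) for v in series).encode("ascii")
    return len(zlib.compress(raw))

# Hypothetical series: one constant, one noisy (seeded so it is repeatable).
constant = [0] * 64
rng = random.Random(0)
noisy = [rng.randrange(10) for _ in range(64)]

print("constant:", shannon_entropy(constant), compressed_size(constant))
print("noisy:   ", shannon_entropy(noisy), compressed_size(noisy))
```

The constant series has zero entropy and compresses to almost nothing; the noisy one scores higher on both measures.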
I am more curious as to how this makes sense for a user
Whatever determination you use, it means that there will be a result for some queries and no result for others. And neither of those results would necessarily mean "because there was no data"
which is confusing to me.
That is a very fat book!
> Whatever determination you use, it means that there will be a result for some queries and no result for others.
This is quite implementation specific, I think. We could implement Tick such that a user knows what time series are being made available once they start listening to the stream.
But you are right: if I have A, B & C, and Tick determined that A and A & B are the only useful time series, but the user was interested in B & C, I don't know how to handle that. It almost becomes a back-to-the-drawing-board problem to solve.
What about thinking of it more as a compression problem? If there is no information in the series given the other time series then you should be able to recreate the series using other series at query time...
Alternatively, a "no information" response to a query is an interesting thing for a db to respond with...
M

On Nov 4, 2014 1:21 PM, "Deep Kapadia" notifications@github.com wrote the comment above: https://github.com/nytlabs/tick/issues/2#issuecomment-61688264
Still wrapping my head around thinking of it as a compression problem...just grabbed the green book
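One possible reading of the compression framing, as a sketch only: if a series is fully determined by other series, it does not need to be stored and can be rebuilt at query time. The example below assumes per-location series stored as `{ts: count}` dicts and rebuilds an all-locations total from them; the data and the `derive_total` helper are hypothetical.

```python
def derive_total(per_location):
    """Rebuild the all-locations series by summing per-location counts per bucket."""
    total = {}
    for series in per_location.values():
        for ts, count in series.items():
            total[ts] = total.get(ts, 0) + count
    return total

# Hypothetical stored series: location -> {ts: count}.
per_location = {
    "NYC": {1: 4, 2: 1},
    "NJ": {1: 1},
    "SFO": {1: 1, 2: 2},
}

# The total carries no information beyond the per-location series,
# so it can be recreated on demand instead of being stored.
print(derive_total(per_location))
```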
> Alternatively, a "no information" response to a query is an interesting thing for a db to respond with...
But is it useful if I am looking for something very specific?
> Alternatively, a "no information" response to a query is an interesting thing for a db to respond with...
only if it can be explained simply
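One way this might be kept simple to explain is to make "not tracked" a distinct status from "no data" and attach a plain-language reason. This is a sketch under assumptions: the `Status`, `Response`, and `query` names and the in-memory `store` are invented here and are not Tick's API.

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    OK = "ok"                    # series is stored and has points
    NO_DATA = "no data"          # series is stored but nothing was observed
    NOT_TRACKED = "not tracked"  # series was judged uninformative and dropped

@dataclass
class Response:
    status: Status
    points: list = field(default_factory=list)
    reason: str = ""

def query(store, keys):
    """Look up the series for a key combination and say why it may be missing."""
    key = tuple(sorted(keys))
    if key not in store:
        return Response(Status.NOT_TRACKED,
                        reason=f"series {key} is not stored")
    points = store[key]
    if not points:
        return Response(Status.NO_DATA,
                        reason="series is stored but has no points")
    return Response(Status.OK, points)

# Hypothetical store: only A and A & B were kept.
store = {("A",): [(1, 3)], ("A", "B"): []}
print(query(store, ["B", "C"]).status)  # not tracked, rather than "no data"
```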
If you have timeseries for each key, couldn't you create what A&B would be? why do you need a time series for groups?
Oh right, intersection vs exclusive. oh well
Can I have table "key" with row "co occurrence" by time?
> If you have timeseries for each key, couldn't you create what A&B would be? why do you need a time series for groups?
No. Consider for example the following stream:
{user: Deep, location: NYC, ts: 1}
{user: Deep, location: NJ, ts: 1}
{user: Nik, location: NYC, ts: 1}
{user: Nik, location: SFO, ts: 1}
{user: Mike, location: NYC, ts: 1}
{user: Deep, location: NYC, ts: 1}
user
Deep -> (ts: 1, count: 3)
Nik -> (ts: 1, count: 2)
Mike -> (ts: 1, count: 1)
location
NYC -> (ts: 1, count: 4)
NJ -> (ts: 1, count: 1)
SFO -> (ts: 1, count: 1)
And if my question is "give me all the times Deep was in NYC", I can't decipher it from the above time series. I can, however, decipher it from:
user,location
Deep,NYC->(ts:1,count:2)
Deep,NJ->(ts:1,count:1)
Nik,NYC->(ts:1,count:1)
Nik,SFO->(ts:1,count:1)
Mike,NYC->(ts:1,count:1)
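The same counts can be rebuilt in a few lines, which also shows why the single-key series are not enough. This is only a sketch: `count_by` is a made-up helper, and the second stream (`other`) is invented to show that two different streams can share the same per-user and per-location counts while disagreeing on the (user, location) counts.

```python
from collections import Counter

# The stream from the example above.
stream = [
    {"user": "Deep", "location": "NYC", "ts": 1},
    {"user": "Deep", "location": "NJ", "ts": 1},
    {"user": "Nik", "location": "NYC", "ts": 1},
    {"user": "Nik", "location": "SFO", "ts": 1},
    {"user": "Mike", "location": "NYC", "ts": 1},
    {"user": "Deep", "location": "NYC", "ts": 1},
]

def count_by(events, *keys):
    """Count events per (key values..., ts) bucket."""
    return Counter(tuple(e[k] for k in keys) + (e["ts"],) for e in events)

by_user = count_by(stream, "user")
by_location = count_by(stream, "location")
by_user_location = count_by(stream, "user", "location")

# A different, invented stream with the same per-user and per-location counts.
other = [
    {"user": "Deep", "location": "NYC", "ts": 1},
    {"user": "Deep", "location": "NYC", "ts": 1},
    {"user": "Deep", "location": "NYC", "ts": 1},
    {"user": "Nik", "location": "NYC", "ts": 1},
    {"user": "Nik", "location": "SFO", "ts": 1},
    {"user": "Mike", "location": "NJ", "ts": 1},
]

same_singles = (count_by(other, "user") == by_user
                and count_by(other, "location") == by_location)
print("single-key series identical:", same_singles)
print("Deep in NYC:", by_user_location[("Deep", "NYC", 1)],
      "vs", count_by(other, "user", "location")[("Deep", "NYC", 1)])
```

Since both streams produce identical `user` and `location` series, no query over those alone can tell how often Deep was in NYC; only the `user,location` series can.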
> Oh right, intersection vs exclusive. oh well
Great! I spent 5 minutes building time series by hand from a stream of imaginary JSON.
sorry :anguished:
> Can I have table "key" with row "co occurrence" by time?
Not sure if I understand. Isn't that the same as having more than one column as a primary key? If so, it becomes the same as what I mentioned in the example
what is wrong with that?
amen re: explaining no data
So we could go one of three ways here: