orchestracities / ngsi-timeseries-api

QuantumLeap: a FIWARE Generic Enabler to support the usage of NGSIv2 (and NGSI-LD experimentally) data in time-series databases
https://quantumleap.rtfd.io/
MIT License

Quantumleap is joining all time-series for all attributes for entity type in one table. #738

Closed Greenstreet123 closed 10 months ago

Greenstreet123 commented 1 year ago

Hello. We are using QuantumLeap and we have entities in the context broker with multiple attributes that may have different timestamps.

The main question is why QuantumLeap joins the time series of all attributes of an entity type into one table. Attributes have different timestamps, but with this table structure we lose that information. On data fetch we get "fake" measurement points in the time series.

As a result, we have a lot of useless duplicate data in the database. This affects speed too, mainly at the REST API level, because a lot of useless data is fetched.

First I wanted to ask if this is to be expected or if we are using it in an incorrect way?

Secondly, if the above is the case, is there any current work in progress to improve this? If not, would you think that something needs to be done and we should look into it making some development for this.

c0c0n3 commented 11 months ago

Hello @Greenstreet123 :-)

So sorry for the late reply, but work has been hectic and this issue fell through the cracks. My bad.

First I wanted to ask if this is to be expected or if we are using it in an incorrect way?

No, you're not doing anything wrong. Your problem is an unfortunate side effect of our DB design: each NGSI entity instance gets a single timestamp index in the DB. The reason for that is lost in time, but I think the rationale was that we were catering for the most common case, where a device measures a whole bunch of params at the same time and sends this batch of measurements on to an IoT agent, which packs them into an NGSI entity, one attr per measurement. To sum up:

device sends

   (m1, t0), (m2, t0), (m3, t0), ...

agent converts to

   entity
      m1_attr
         val: m1
         time: t0
      m2_attr
         val: m2
         time: t0
      ...

QuantumLeap stores as

   (t0, m1, m2, ...)

As you have rightly noticed, this only makes sense if all the measurements were taken at the same time t0, as shown above.
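To make the effect concrete, here is a minimal Python sketch. It is not QuantumLeap's actual code: it assumes that every notification carries the entity's full current state and that one row is stored per notification under a single time index. The attribute names, the sample values and the `to_rows` helper are all hypothetical.

```python
from typing import Any


def to_rows(notifications: list[dict[str, Any]]) -> list[tuple]:
    """Flatten each notified snapshot into one row: (time_index, temperature, humidity)."""
    return [(n["time_index"], n["temperature"], n["humidity"]) for n in notifications]


# Hypothetical device: temperature is measured every 10 minutes,
# humidity only every 30 minutes.
notifications = [
    {"time_index": "10:00", "temperature": 20.0, "humidity": 50},
    {"time_index": "10:10", "temperature": 20.5, "humidity": 50},  # humidity not re-measured
    {"time_index": "10:20", "temperature": 21.0, "humidity": 50},  # humidity not re-measured
    {"time_index": "10:30", "temperature": 21.5, "humidity": 48},
]

rows = to_rows(notifications)

# Reading the humidity column back gives one point per row,
# although only two of them are real measurements.
humidity_series = [(t, h) for t, _, h in rows]
print(humidity_series)
```

Under these assumptions the humidity series comes back with four points even though only the 10:00 and 10:30 values were actually measured.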

is there any current work in progress to improve this?

Unfortunately not. It's quite a big change and we don't have the budget for that at the moment.

would you think that something needs to be done

I agree with you that a more flexible model where we don't assume a single time index for all the attrs would be better, but like I said it's quite a departure from the current design.

we should look into it making some development for this

Contribs are most welcome!

Anyway, one easy workaround would be to group measurements into entities in such a way that measurements taken at different times end up in different entities. For example, if a device measures temperature every 10 mins and humidity every 30 mins, you could have two separate entities, one for temperature and the other for humidity.
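A rough Python sketch of that workaround, assuming NGSIv2-style entities represented as plain dicts. The `split_by_cadence` helper, the entity ids and the type names are hypothetical; giving each part its own entity type is a choice made here so that, with one table per entity type, each part would land in its own table.

```python
from typing import Any


def split_by_cadence(entity: dict[str, Any], groups: dict[str, list[str]]) -> list[dict[str, Any]]:
    """Return one entity per group of attributes that are measured at the same time.

    `groups` maps a (hypothetical) new entity type to the attribute names it should hold.
    """
    parts = []
    for type_name, attr_names in groups.items():
        # Derive a distinct id per part so the parts never share a row.
        part = {"id": f"{entity['id']}:{type_name}", "type": type_name}
        for name in attr_names:
            part[name] = entity[name]
        parts.append(part)
    return parts


device = {
    "id": "Device:001",
    "type": "Device",
    "temperature": {"type": "Number", "value": 21.5},
    "humidity": {"type": "Number", "value": 48},
}

parts = split_by_cadence(
    device,
    {"TemperatureReading": ["temperature"], "HumidityReading": ["humidity"]},
)
for part in parts:
    print(part)
```

Each resulting entity can then be updated on its own schedule, so a temperature update never produces a row that also carries a stale humidity value.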

SBlechmann commented 11 months ago

Hey there,

if I understood your problem correctly, you see "fake measurements" because of the way the context broker notifies QuantumLeap. In NGSIv2, you have the option of setting a flag called "onlyChangedAttrs" to true, which should prevent this behaviour. You will then get NULL values as placeholders for attributes that haven't changed.

Afaik, NGSI-LD does not support this feature.
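For illustration, a subscription with that flag might be built like this in Python. This is a sketch under assumptions: the entity type, the attribute names, the `build_subscription` helper and the QuantumLeap URL are placeholders for your own deployment, and the payload would be POSTed to the broker's `/v2/subscriptions` endpoint. Please check the field names against the Orion documentation for your version.

```python
import json
from typing import Any


def build_subscription(entity_type: str, attrs: list[str], notify_url: str) -> dict[str, Any]:
    """Build an NGSIv2 subscription payload that notifies only the attributes that changed."""
    return {
        "description": f"Notify QuantumLeap of {entity_type} changes",
        "subject": {
            "entities": [{"idPattern": ".*", "type": entity_type}],
            "condition": {"attrs": attrs},
        },
        "notification": {
            "http": {"url": notify_url},
            "attrs": attrs,
            # Only attributes changed by the triggering update are included.
            "onlyChangedAttrs": True,
        },
    }


# Hypothetical values: adjust type, attributes and host to your setup.
subscription = build_subscription(
    "Device",
    ["temperature", "humidity"],
    "http://quantumleap:8668/v2/notify",
)
print(json.dumps(subscription, indent=2))
```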

c0c0n3 commented 11 months ago

Ha! Thanks for clarifying @SBlechmann, I wasn't sure where those "fake" measurements came from :-) @Greenstreet123 I think @SBlechmann's suggestion should get rid of the junk. Another thing that might help is to explicitly configure QuantumLeap to use a certain attribute or header as a timestamp:

Greenstreet123 commented 10 months ago

Thanks for coming back. For us it was not possible to use QL as is. Too many issues, unfortunately.

Regarding "onlyChangedAttrs": when we used it, there were a lot of NULL values in the series, and when filtering (min/max/avg, by time) the API started to be buggy. Also, I think we received information that the onlyChangedAttrs option should not be used with QuantumLeap.

c0c0n3 commented 10 months ago

@Greenstreet123 sorry to hear about your trouble with QL, but thank you so much for your honest feedback. We definitely agree with you that we should improve our data model :-)