noi-techpark / bdp-core

Open Data Hub / Timeseries Core
https://opendatahub.com
Other
9 stars 4 forks source link

As an e-mobility expert I would like to extract all historical active e-mobility stations from the ODH #193

Closed rcavaliere closed 3 years ago

rcavaliere commented 3 years ago

Request from STA to get from the ODH a list of all e-mobility stations in the production environment, ordered by activation timestamp.

bertolla commented 3 years ago

This might be a first input: https://mobility.api.opendatahub.bz.it/v2/flat/EChargingStation/number-available/latest?limit=-1&where=sactive.eq.true&select=scode,mtransactiontime

rcavaliere commented 3 years ago

@Piiit @bertolla I have tried to make some tests here. That's what I would need, but calculated on top of the measurement table: select name,origin,stationcode,station_id,timestamp,pointprojection from intimev2.measurement inner join intimev2.station on station_id = intimev2.station.id where stationtype = 'EChargingStation' and active = true order by name How can we apply this to measurementhistory table in order to get the oldest record of a station?

Piiit commented 3 years ago

@rcavaliere Hi, sorry for the late reply... you could do something like this:

select 
    name
    , origin
    , stationcode
    , station_id
    , pointprojection
    , min(timestamp) 
from measurementhistory m 
join station s on s.id = m.station_id 
where 
        active = true 
        and stationtype = 'EChargingStation'
group by 1, 2, 3, 4, 5
order by 1, 2, 3, 6;

It will take some time... I guess around 10-15 minutes.

Piiit commented 3 years ago

BTW, is this a single query to answer a question or should it be a recurring task in the future? I just ask, because if yes, we could add some indexes. However, I would not do it directly now, since they need around 50GB each for the history table.

Piiit commented 3 years ago

...and after 14 minutes we got a result :-)

echargingstations-first-activation.csv.tar.gz

Piiit commented 3 years ago

Same result as json:

echargingstations-first-activation.json.tar.gz

Piiit commented 3 years ago

@rcavaliere Done, please let me know, if you need something more

rcavaliere commented 3 years ago

BTW, is this a single query to answer a question or should it be a recurring task in the future? I just ask, because if yes, we could add some indexes. However, I would not do it directly now, since they need around 50GB each for the history table.

I assume it is a single query. I have already told STA that in future they should try to automatically compute these statistics may making usage of our API...

rcavaliere commented 3 years ago

@Piiit Wonderful, it is exactly what they needed. Task closed!

rcavaliere commented 3 years ago

@Piiit I have got a feedback from STA. The list is OK, but they would like to have two additional columns:

However the situation here is a little bit complex since the relationship is 1:N, i.e. an EChargingStation can have N EChargingPlug stations. The logic to be applied should be the following:

if at least one EChargingPlug has as outletTypeCode CCS or CHADEMO -> the type of charging has to be set as "quick" 
otherwise -> "slow"

There is also another complexity which is that each station can have multiple metadata records associated, since the characteristics of a station could have changed from time to time. Assuming that we consider the most recent one, would that request be feasible? Could a table view help to provide these requests?

rcavaliere commented 3 years ago

@Piiit can you please provide me a feedback in relation to the comment above, with the requests received by STA?

Piiit commented 3 years ago

@rcavaliere I will have a look at this request next sprint, moving it back to todo

Piiit commented 3 years ago

@rcavaliere The query for the most recent stations is as follows

select 
    s.name
    , s.origin
    , s.stationcode
    , md.json->'accessType' as access_type
    , case when plugmd.json->'outlets'->0->>'outletTypeCode' in ('CSS', 'CHAdeMO') 
        then 'quick' 
        else 'slow' 
      end as charging_type
    , s.pointprojection
    , min(timestamp) 
from measurement m 
join station s on s.id = m.station_id
join metadata md on s.meta_data_id = md.id
join station plug on plug.parent_id = s.id
join metadata plugmd on plug.meta_data_id = plugmd.id
where 
        s.active = true 
        and plug.active = true
        and plug.stationtype = 'EChargingPlug'
        and s.stationtype = 'EChargingStation'
group by 1, 2, 3, 4, 5, 6
order by 1, 2, 3, 4, 5;

Just replace measurement with measurementhistory if you like to have the oldest station, but that station will then have the actual metadata record assigned, and that could be problematic. Since at that old time, the metadata was maybe different, as you said already. Unfortunately, we do not have a metadata record exactly assigned to a station at a certain point in time, but just record changes in the metadata record as history. We would need to associate such a metadata record with the oldest timestamp of the station, and then find the nearest neighbor in the metadata history table. That would be really time consuming, but feasible. Is it needed? In the meantime, I start the above query for the measrumenthistory table. Lets see how long it takes :-)

Piiit commented 3 years ago

@rcavaliere Results of the above query with "measurementhistory": echargingstation-first-activation-v2.json.tar.gz echargingstation-first-activation-v2.csv.tar.gz

However, please be careful with the data, since the assigned metadata records are from another time period as the station data itself.

rcavaliere commented 3 years ago

@Piiit super work! No, let's give this list, is more than enough... if they want the metadatahistory we have to discuss this, we can think to provide the detail of particular stations, but not the whole dataset that we have. I think it is a fantastic extraction result for their need, many thanks!

rcavaliere commented 3 years ago

@Piiit there is something that does not work as expected. Please have a look at the example here, this station is marked as slow and it should be labeled as "quick"... maybe because we have multiple connectors associated to a single plug? Let me know! Screenshot from 2021-03-04 11-03-33

Piiit commented 3 years ago

@rcavaliere The stations can have multiple outlets also mixed with slow/fast charging capabilities. I had a look only on the first outlet and thought that all outlet speeds are then the same. This new query gives all outlets-speeds as an array, together with the other information:

with 
subq as
(
    select 
        s.name
        , s.origin
        , s.stationcode
        , md.json->'accessType' as access_type
        , s.pointprojection
        , plugmd.json->'outlets'->>'outletTypeCode' as outlet_type_code
        , min(timestamp) as first_measurement
    from measurementhistory m
    join station s on s.id = m.station_id
    join metadata md on s.meta_data_id = md.id
    join station plug on plug.parent_id = s.id
    join metadata plugmd on plug.meta_data_id = plugmd.id
    where 
        s.active = true 
        and plug.active = true
        and plug.stationtype = 'EChargingPlug'
        and s.stationtype = 'EChargingStation'
    group by 1, 2, 3, 4, 5, 6
)
select 
    name
    , origin
    , stationcode
    , access_type
    , pointprojection
    , first_measurement
    , jsonb_agg( 
        case when otc in ('CSS', 'CHAdeMO') 
            then 'quick' 
            else 'slow' 
        end 
    ) as charging_type
from 
    subq,
    jsonb_array_elements(outlet_type_code) otc
group by 1, 2, 3, 4, 5, 6
order by 1, 2, 3, 4, 5, 6;
Piiit commented 3 years ago

@rcavaliere This query crashed. It used to many resources on the server. Now the last query, which finally worked

with 
subq as
(
    select 
        s.id as sid
        , min(timestamp) as first_measurement
    from measurementhistory m
    join station s on s.id = m.station_id
    where 
        s.active = true 
        and s.stationtype = 'EChargingStation'
    group by 1
)
select 
    s.name
    , s.origin
    , s.stationcode
    , first_measurement
    , md.json->'accessType' as access_type
    , jsonb_agg( 
        case when otc->>'outletTypeCode' in ('CSS', 'CHAdeMO') 
            then 'quick' 
            else 'slow' 
        end 
    ) as charging_type
    , s.pointprojection
from 
    station s 
    , metadata md 
    , station plug 
    , metadata plugmd 
    , subq
    , jsonb_array_elements(plugmd.json->'outlets') otc
where 
    plug.active = true
    and plug.stationtype = 'EChargingPlug'
    and s.id = subq.sid
    and s.meta_data_id = md.id 
    and plug.parent_id = s.id 
    and plug.meta_data_id = plugmd.id
group by 1, 2, 3, 4, 5, 7
order by 1, 2, 3, 4, 5, 6, 7

Results: echargingstations-first-activation-v3.csv.tar.gz echargingstations-first-activation-v3.json.tar.gz

rcavaliere commented 3 years ago

Perfect, before close the user story let's wait for the feedback of STA. But this is exactly what they wanted...

rcavaliere commented 3 years ago

Positive feedback by STA received, issue closed.