Closed santilland closed 8 months ago
Agreed to keep all of the currently used indicators. The only ones still being updated are the Covid data (daily vaccinations) and the OILX data (to be updated for at least 1 more year); the other two are no longer updated.
The idea is to convert the static indicators to CSVs and convert to GEODB tables. @lubojr to provide more info here in the issue on the expected format (columns) in GEODB for all 4 indicators.
@AlessandroScremin @dmoglioni These are the output files and formats that we are generating for these special datasets and would need to be migrated to GeoDB if possible.
OILX data - actually a standard eodash data format:
https://github.com/eurodatacube/eodash/blob/staging/app/public/eodash-data/internal/100011-OX.json
Information about POI itself is in https://raw.githubusercontent.com/eurodatacube/eodash/staging/app/public/data/internal/pois_eodash.json
https://github.com/eurodatacube/eodash/blob/staging/app/public/eodash-data/internal/AE-GG.json
https://github.com/eurodatacube/eodash/blob/staging/app/public/eodash-data/internal/AE-CV.json
https://github.com/eurodatacube/eodash/blob/staging/app/public/eodash-data/internal/AE-OW.json
These are per-country entries that get created by the scripts listed in the original issue description.
Essentially all files with suffix -OW, -CV or -GG are of interest to ingest data from.
The information about the POIs themselves is generated by https://github.com/eurodatacube/eodash/blob/staging/app/src/scripts/create_capitals.py but does not change over time (it does not get updated).
It is saved to https://raw.githubusercontent.com/eurodatacube/eodash/staging/app/public/data/internal/pois_trilateral.json - search for "indicator": "GG", "indicator": "CV", "indicator": "OW".
@dmoglioni
We have evaluated the current structure of the indicators GG, OW and CV, and for the migration we would actually suggest changing the data structure to the standard eodash GeoDB table format that all other indicators use, where a single row is a single measurement (plus whatever other values we put in the "referenceValue" array) for a certain time.
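To illustrate the "one row per measurement" layout described above, here is a minimal sketch that flattens one timestamped mobility (GG) record into a single row. The function name `to_row`, the `date` and `time` keys, and the sample numbers are assumptions made for illustration; this is not the actual ingestion script.

```python
from typing import Any

# Secondary values that go into the reference array, in an assumed fixed order.
REFERENCE_KEYS = ["parks", "residential", "retail_recreation",
                  "transit_stations", "workplaces"]


def to_row(country: str, record: dict[str, Any]) -> dict[str, Any]:
    """Turn one timestamped GG record into one table row (hypothetical layout)."""
    return {
        "aoi_id": country,
        "country": country,
        "indicator": "GG",
        "time": record["date"],
        "measurement_value": record["grocery"],
        "reference_value": [record[key] for key in REFERENCE_KEYS],
    }


# Made-up sample values, only to show the shape of the result.
sample = {"date": "2021-03-01", "grocery": -4.0, "parks": 12.0,
          "residential": 3.0, "retail_recreation": -20.0,
          "transit_stations": -25.0, "workplaces": -18.0}
row = to_row("AT", sample)
print(row["measurement_value"], row["reference_value"])
```

Keeping the reference values in one fixed order matters because the client reads them by position rather than by name.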
So, coming back to what this means for the planned migration of these 4 datasets to GEODB:
OX
data - measurement_value, indicator_value will be holding the value; regularly updated indicator
OW
data - we only use total_vaccinations, people_fully_vaccinated and daily_vaccinations in the client, so let's migrate only these. I would suggest putting daily_vaccinations as measurement_value and the other two as part of the referenceValue array; not updated anymore
CV
data - single value: confirmed - can be used as measurement_value; not updated anymore
GG
data - multiple values, let's put grocery as measurement_value and the rest as referenceValue in order: "parks", "residential", "retail_recreation", "transit_stations", "workplaces"; not updated anymore
@lubojr
I went through the material provided and would like to align with you on some aspects. Let's have a call to iterate on this faster. Thank you.
@lubojr
As agreed during our call, I'll proceed with the integration on our CI/CD workflow as in the following:
@dmoglioni Regarding the European aggregated OILX data EU1-OX, let's move the data into a new table, for example OX-EU (and let's change the indicator code from OX to OX-EU for this one), so there will be one table for all POIs for OX and another table for just the European-level OX-EU.
@lubojr what geometry information should be attached to OX-EU? Here is an example of the .json obtained for OX-EU: EU1-OX.json
@dmoglioni For eodash the geometry does not matter, we do not use the geometry field. You can use a single point geometry at the location of the AOI. For the AOI let's use, for example, the coordinates of Munich 48.13,11.57 (an arbitrary location I chose now). The subAoi can also be kept empty.
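As a sketch of the suggestion above, the AOI string can be turned into a single point geometry. This assumes the AOI is stored as "lat,lon" (as in the pois files) and that the geometry is supplied as WKT, which uses "lon lat" order; the helper name `aoi_to_wkt_point` is hypothetical.

```python
def aoi_to_wkt_point(aoi: str) -> str:
    """Convert an eodash-style "lat,lon" string into a WKT point ("lon lat")."""
    lat_text, lon_text = aoi.split(",")
    lat, lon = float(lat_text), float(lon_text)
    return f"POINT ({lon} {lat})"


# Arbitrary Munich location proposed in the comment above.
ox_eu_aoi = "48.13,11.57"
print(aoi_to_wkt_point(ox_eu_aoi))  # POINT (11.57 48.13)
```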
@lubojr I also wanted to do something like this, assigning an arbitrary AoI for the geometry field but I wanted to be sure you were not using geometry information for visualization purposes.
@lubojr I'm proceeding with OX-EU integration and this is the information available for it:
Should we add something about (for example):
@dmoglioni yes these three sound fine. Nothing else needed to be added. Thank you.
@lubojr
OX/OX-EU indicators are now operational and their data can be fetched for the dashboard from the collections Crude_Oil_Storage_Index and Crude_Oil_Storage_Index-Europe respectively.
Hi @dmoglioni Please update the aoi_id value in table Crude_Oil_Storage_Index-Europe to something other than "/" - even though this table has only a single AOI, it must not be blank. Please set it to, for example, Europe. Thank you
Hi @lubojr I added 'EU' as AOI_ID for Crude_Oil_Storage_Index-Europe as requested.
Regarding the Google mobility data, I found that the geometry information (extracted from pois_eodash.json) for this timeseries is available for only 35 countries out of the 135 required.
In particular, these are the countries listed in the timeseries (135): ['AE', 'AF', 'AG', 'AO', 'AR', 'AT', 'AU', 'AW', 'BA', 'BB', 'BD', 'BE', 'BF', 'BG', 'BH', 'BJ', 'BO', 'BR', 'BS', 'BW', 'BY', 'BZ', 'CA', 'CH', 'CI', 'CL', 'CM', 'CO', 'CR', 'CV', 'CZ', 'DE', 'DK', 'DO', 'EC', 'EE', 'EG', 'ES', 'FI', 'FJ', 'FR', 'GA', 'GB', 'GE', 'GH', 'GR', 'GT', 'GW', 'HK', 'HN', 'HR', 'HT', 'HU', 'ID', 'IE', 'IL', 'IN', 'IQ', 'IT', 'JM', 'JO', 'JP', 'KE', 'KG', 'KH', 'KR', 'KW', 'KZ', 'LA', 'LB', 'LI', 'LK', 'LT', 'LU', 'LV', 'LY', 'MA', 'MD', 'MK', 'ML', 'MM', 'MN', 'MT', 'MU', 'MX', 'MY', 'MZ', 'NA', 'NE', 'NG', 'NI', 'NL', 'NO', 'NP', 'NZ', 'OM', 'PA', 'PE', 'PG', 'PH', 'PK', 'PL', 'PR', 'PT', 'PY', 'QA', 'RE', 'RO', 'RS', 'RU', 'RW', 'SA', 'SE', 'SG', 'SI', 'SK', 'SN', 'SV', 'TG', 'TH', 'TJ', 'TR', 'TT', 'TW', 'TZ', 'UA', 'UG', 'US', 'UY', 'VE', 'VN', 'YE', 'ZA', 'ZM', 'ZW']
whereas these are the ones available (35) in the pois file: ('AT', '48.2,16.366667'), ('BA', '43.87,18.42'), ('BE', '50.83333333,4.3333330000000005'), ('BG', '42.68333333,23.316667000000002'), ('CH', '47.451542,8.564572'), ('CZ', '50.08333333,14.466667000000001'), ('DE', '52.51666667,13.4'), ('DK', '55.66666667,12.583333'), ('EE', '59.43333333,24.716667'), ('EG', '30.939554,32.314923'), ('ES', '40.416775,-3.70379'), ('FI', '60.16666667,24.933332999999998'), ('FR', '48.864715999999994,2.349014'), ('GB', '52.48,1.89'), ('GR', '37.98333333,23.733333'), ('HR', '45.8,16.0'), ('HU', '47.5,19.083333'), ('IE', '53.31666667,-6.233333'), ('IT', '41.902782,12.496366'), ('LT', '54.68333333,25.316667000000002'), ('LU', '49.6,6.116667'), ('LV', '56.95,24.1'), ('MK', '42,21.43'), ('MT', '35.88333333,14.5'), ('NL', '52.35,4.9166669999999995'), ('NO', '60.197552,11.100415'), ('PL', '52.25,21.0'), ('PT', '38.71666667,-9.133333'), ('RO', '44.43333333,26.1'), ('RS', '44.83,20.5'), ('RU', '54.729095, 19.823546'), ('SE', '59.33333333,18.05'), ('SI', '46.05,14.516667000000002'), ('SK', '48.15,17.116667'), ('TR', '40.982555,28.820829').
Attached for completeness the pois file I'm using. Is it the right file or is there any other source you were getting this geometry information from? Thank you.
Hi @dmoglioni this is a bit tricky and I will leave it up to you to decide what you prefer. This indicator is used for both the race and trilateral dashboards. For race, the list of countries that you listed is correct, but for trilateral, the countries are present in pois_trilateral.json, where a much larger set of countries is used.
For the race dashboard it does not matter whether you duplicate the data in GeoDB (split the collections) or leave it as a single collection containing all of them (from pois_trilateral.json); we already have the code to subset part of the collection for race, while having another subset for trilateral.
Hi @lubojr following up on today's call on mobility data (GG), I identified 9 countries that are not present in pois_trilateral.json:
['AG', 'AW', 'BB', 'BH', 'CV', 'HK', 'LI', 'MU', 'RE']
Could you please check it? Thank you
@dmoglioni thank you for checking the data. We are not including these 9 in any dashboard (due to a missing cross reference in pois_trilateral.json, because we did not have a subaoi).
Let's skip them completely during the migration.
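A minimal sketch of how the skip could look during migration, assuming each row carries a `country` code as in the tables discussed here. The helper name `drop_skipped` and the sample rows are made up for illustration.

```python
# The nine country codes without a cross reference in pois_trilateral.json.
SKIPPED = {"AG", "AW", "BB", "BH", "CV", "HK", "LI", "MU", "RE"}


def drop_skipped(rows: list[dict]) -> list[dict]:
    """Keep only rows whose country code is not in the skip list."""
    return [row for row in rows if row["country"] not in SKIPPED]


# Made-up rows, only to show the filtering.
rows = [{"country": "AT"}, {"country": "HK"}, {"country": "RE"}, {"country": "ZA"}]
kept = drop_skipped(rows)
print([row["country"] for row in kept])  # ['AT', 'ZA']
```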
@lubojr thank you for the clarification.
@lubojr Mobility data collection created on geoDB (Mobility_data) and data ingested. As agreed the 'Measurement Value' column contains 'grocery' whereas 'Reference value' column contains ['retail_recreation', 'parks', 'transit_stations', 'workplaces', 'residential'].
Let me know if everything is displayed correctly when fetching the data, thank you.
@dmoglioni I have checked the data in table "Mobility_data" and the same issue as with OILX Europe is present. None of the rows have an aoi_id filled in, which we rely upon. Could you please fill them to match the corresponding country column? (TR -> TR).
The OILX data now work correctly after minor fixes in the eodash client.
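The requested fix could be sketched as follows, assuming rows are plain dictionaries with `aoi_id` and `country` keys and that "/" is the placeholder for a blank value. The function names are hypothetical and this is not the actual ingestion code.

```python
BLANK_VALUES = (None, "", "/")


def fill_aoi_id(rows: list[dict]) -> list[dict]:
    """Return copies of rows where a blank aoi_id is replaced by the country code."""
    fixed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        if new_row.get("aoi_id") in BLANK_VALUES:
            new_row["aoi_id"] = new_row["country"]
        fixed.append(new_row)
    return fixed


def count_blank(rows: list[dict]) -> int:
    """Count rows that still have a blank aoi_id."""
    return sum(1 for row in rows if row.get("aoi_id") in BLANK_VALUES)


# Made-up rows, only to show the behaviour.
rows = [{"country": "TR", "aoi_id": "/"}, {"country": "AT", "aoi_id": "AT"}]
fixed_rows = fill_aoi_id(rows)
print(count_blank(rows), count_blank(fixed_rows))  # 1 0
```

Running a count like `count_blank` after each ingestion is a cheap way to catch leftover placeholder rows before the client tries to use them.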
@lubojr About Mobility Data: of course I can add that information; it was just not clear to me from our call or your previous comment that the AOI_ID is in general a mandatory field on your side.
aoi_id added to Mobility Data.
@dmoglioni there are still some rows where aoi_id is a /.
indicator_data = geodb.get_collection("Mobility_data", database="eodash")
# count the rows whose aoi_id is still the placeholder "/"
indicator_data["aoi_id"].value_counts()["/"]
# 974
Could you please double check?
@lubojr thank you for pointing out the issue. Now the data should be correctly ingested in the geoDB.
@dmoglioni Thank you for the update, I confirm that the Mobility_data indicator can now be fetched from GeoDB and the STAC Catalog is now created. I had a look at the integration and, because we now use the column "City" on the Map Icon hover, it currently shows for example GR instead of Greece.
I think it would anyway make sense on our side (client or generator) to make the column used for the name configurable, but these data are currently missing in the table.
In the original data in pois_trilateral.json it was the following way:
"aoiID": "BE",
"city": "Belgium",
"country": "BE",
Which does not make sense, looking back at it. Now it is the following way:
"country": "AE"
"city": "/"
"aoi_id": "AE"
In order for the map POI to have a correct label, could you please change the table so that the "Country" column has the original "city" value from JSON ("Belgium")? The city can then remain / and I shall adapt the generator for this collection so that the "country" column is used for the id.
@lubojr Interestingly enough, I had noticed the same thing before you wrote to me and wanted to update the Country attribute with its name instead of the ID. I'll follow the same approach for CV and OW as well.
@lubojr Mobility data updated as agreed, hope everything is in line now
@dmoglioni We have updated the config to use the country column if city is blank or /.
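The fallback described above can be sketched like this. It is only an illustration of the rule, not the actual eodash config or generator code; `poi_label` is a hypothetical name and the row keys are assumed to be `city` and `country`.

```python
def poi_label(row: dict) -> str:
    """Use the city as label unless it is blank or "/", else fall back to country."""
    city = (row.get("city") or "").strip()
    if city in ("", "/"):
        return row["country"]
    return city


print(poi_label({"country": "Greece", "city": "/"}))    # Greece
print(poi_label({"country": "AT", "city": "Vienna"}))   # Vienna
```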
@santilland As you might be aware, the integration of the last two remaining indicators (CV and OW) is unfortunately blocked due to the lack of access to our processing environment in EDC.
On a side note, for after the infrastructure problems get fixed:
Previously, I had not realized that we (in the eodash client) cannot have a - character in the indicator code. Sorry for the complications caused by proposing the original OX-EU indicator code for the global OILX index.
@dmoglioni Could you please update your script to change the indicator code for all current rows (and future updates) inside the table Crude_Oil_Storage_Index-Europe from OX-EU to OX_EU?
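A minimal sketch of the requested rename, assuming rows are dictionaries with an `indicator` key. The helper name is hypothetical; the same function could be applied both to the existing rows and to each future update.

```python
def normalize_indicator_code(code: str) -> str:
    """Replace the "-" character, which the eodash client cannot handle, with "_"."""
    return code.replace("-", "_")


# Made-up rows, only to show the rename.
rows = [{"indicator": "OX-EU"}, {"indicator": "OX-EU"}]
renamed = [{**row, "indicator": normalize_indicator_code(row["indicator"])}
           for row in rows]
print(renamed[0]["indicator"])  # OX_EU
```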
@lubojr the script has been updated following your additional requirement; the changes on the geodb will be effective starting from next Monday (Jan 29th).
@dmoglioni Hello, I just wanted to briefly check what is the current status of the covid data (OW and CV) ingestion to geodb?
@lubojr Hi, thank you for the reminder, I'll let you know as soon as it's completed.
@lubojr CV indicator is now operational on the geoDB under the data collection 'Global_COVID_data'; could you please check if you can fetch all the information correctly?
@dmoglioni Latitude and longitude were switched during the import compared to other past collections. Could you please fix that and remove the space in between them?
"aoi": "61.210817, 35.650072"
should be "aoi": "35.650072,61.210817"
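The fix requested above amounts to swapping the two parts of the string and dropping the space. A small sketch follows; the function name `swap_aoi` is hypothetical, and the parts are kept as text so the digits are not reformatted.

```python
def swap_aoi(aoi: str) -> str:
    """Swap the two comma-separated coordinates and remove surrounding spaces."""
    first, second = (part.strip() for part in aoi.split(","))
    return f"{second},{first}"


print(swap_aoi("61.210817, 35.650072"))  # 35.650072,61.210817
```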
@lubojr should the ones for subAOI also be flipped?
SubAOI is fine as is.
perfect, reingestion completed
@dmoglioni The coordinates for individual countries are usually on/near the borders instead of in the capitals of the countries, as they were in the original dataset and as they are for mobility data. Could you please double check?
Old: https://race.esa.int/?indicator=CV&x=2177956.42178&y=6562096.62978&z=5.24697
New: https://eodash-testing.eox.at/ui-panels-cat/?catalog=cv-geodb-integrate&indicator=CV
@lubo can we have a short call about it to speed things up? Just tell me your availabilities and I'll send an outlook, thx
@lubojr By comparing the two jsons - countries.json (with only subAOI coordinates) and pois_trilateral.json (with both AOI and subAOI coordinates) - with respect to the AOIs present in the COVID data, I found out that countries.json actually contains three more countries, namely SS (South Sudan), XK (Kosovo) and AQ (Antarctica). Hence I'll extract the AOI coordinates from pois_trilateral.json and the subAOI from countries.json. For SS and XK I'll add the AOI coordinates of the corresponding capitals, and for AQ the first coordinate of the subAOI polygon.
@lubojr CV data reprocessed and reingested, could you check it please, thx
@dmoglioni Thank you for the update. Almost, but not there yet. The aoi_id must be set for all rows (we use it as a unique identifier) and it cannot be /.
Currently for Namibia, there is aoi_id == '/'.
@lubojr AOI_ID for Namibia fixed, now everything should be in line.
Perfect. It works fine. Thank you!
@lubojr OW indicator is now operational on the geoDB under the data collection 'Global_COVID_vaccination_data'; could you please check if you can fetch all the information correctly?
@dmoglioni I have checked the data, and our previous integration showed total_vaccinations, people_fully_vaccinated and daily_vaccinations - see https://github.com/eurodatacube/eodash-catalog/issues/30#issuecomment-1812232297.
The data in GeoDB have just measurement_value (daily vaccinations). Could you please add total_vaccinations and people_fully_vaccinated to the reference_value array? Thanks.
@lubojr I updated the collection with the specified reference values
Currently some datasets are fetched, parsed and transformed when updating the data. We would need to consider if this task can be integrated into the geoDB workflow or if other possibilities should be considered. The datasets in question are: