Open fairicube-data opened 3 months ago
@robknapen is there any webpage/documentation about the data?
I see it has 5 floating-point bands, but in your data request they are not described. According to the file metadata they correspond to different years?
BAND_1=2018
BAND_2=2019
BAND_3=2020
BAND_4=2021
BAND_5=2022
@misev The bands should represent different years indeed. @vittekm will know, he is pre-processing the data and filling in the data requests. From what I understood from him he is merging yearly data into a big file with bands per year, because it will be easier for the ingestion? Anyway, best to let him answer further questions :)
For us it's best to have separate files per year, but it's not a hard requirement as we can separate them before the ingest.
@misev In total we have 10 files in the series "Agrodatacube" (ADC) to be ingested. Currently 6 files have 5 bands corresponding to the years 2018 - 2022, and 4 files have 9 bands corresponding to the years 2014 - 2022. My idea was to submit each file as a different thematic dataset, with its multiple years, as a separate request. Would that work?
It sounds like this could be two datacubes: 2018-2022 with 6 bands, and 2014-2022 with 4 bands? So you could make two requests for datacubes ADC_2018_2022 and ADC_2014_2022 or so.
If possible it might be better to not hardcode the years in the datacube IDs, so they can be extended in future if needed for further years. Some other suffix for a distinguishing feature would be better in the name.
In fact there are 6 files with 5 bands each and 4 files with 9 bands each, as follows:
| File | Bands |
| --- | --- |
| ADC_arable_land_markers_autumn | 5 |
| ADC_arable_land_markers_no_ndvi | 5 |
| ADC_arable_land_markers_spring | 5 |
| ADC_crop_rotation_index | 9 |
| ADC_crop_parcels_crop_code | 9 |
| ADC_crop_parcels_field_id | 9 |
| ADC_crop_parcels_land_use | 9 |
| ADC_grassland_markers_ndvi_spring | 5 |
| ADC_grassland_markers_no_mowing | 5 |
| ADC_grassland_markers_no_ndvi | 5 |
Could it be then 10 datacubes?
It could be 10 datacubes, it will just be a bit of extra work to fill in 10 data requests in the catalog editor.
Alternatively, if the 6 files with 5 bands each have the same resolution and CRS, they could be a single datacube. This would save us from adding so many different catalog entries. So there would be one datacube, e.g. ADC_arrable_land_and_grassland_markers, with bands:
Or maybe 3 datacubes: ADC_arable_land, ADC_crop, ADC_grassland_markers. Up to you, all these options are possible.
In that case we still need distinction between years:
Could that still fit within 2 or 3 requests, by simply uploading multiple files (each with the same number of bands representing years)?
@vittekm do you create this data? I have some suggestions:
Let's start with one datacube ADC_arable_land_markers which would contain data for years 2018-2022, and bands
Then you could zip all files needed to build this datacube and make that available for download.
What do you think?
@misev I see. That means each multiband file should be split into individual files per year, with the year as a tag in the name (e.g. _2018.tif). These would then be zipped together and submitted with the ingest request, so that the cube can be built from them. Actually, I thought this separation could be done more easily (or more quickly) after ingestion, with the data as they are. Otherwise I could do this preparation locally.
> @misev I see. It means that out of all individual multiband files should be created individual files representing years with tag (e.g. _2018.tif).
Yes exactly, this is the usual way as TIFF is a 2D image format and putting multiple years in a single file is just surprising.
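For reference, the per-year split discussed here could be sketched roughly as follows. This is a minimal sketch, assuming the `rasterio` package is available and that the year of each band is stored in dataset-level tags named `BAND_1`, `BAND_2`, ... as in the file metadata quoted at the top of this thread. The function names and the `<name>_<year>.tif` output pattern are illustrative only, not an agreed convention.

```python
from pathlib import Path


def years_from_tags(tags, band_count):
    """Map 1-based band index -> year, using tags such as {"BAND_1": "2018"}."""
    years = {}
    for band in range(1, band_count + 1):
        key = f"BAND_{band}"
        if key not in tags:
            raise KeyError(f"missing tag {key}; cannot tell which year band {band} holds")
        years[band] = int(tags[key])
    return years


def yearly_path(src_path, year):
    """Hypothetical naming scheme: <stem>_<year>.tif next to the source file."""
    src = Path(src_path)
    return src.with_name(f"{src.stem}_{year}.tif")


def split_by_year(src_path):
    """Write every band of a multiband GeoTIFF to its own single-band file."""
    import rasterio  # imported here so the helpers above work without it

    written = []
    with rasterio.open(src_path) as src:
        years = years_from_tags(src.tags(), src.count)
        profile = src.profile.copy()
        profile.update(count=1)  # one band per output file
        for band, year in years.items():
            out_path = yearly_path(src_path, year)
            with rasterio.open(out_path, "w", **profile) as dst:
                dst.write(src.read(band), 1)
            written.append(out_path)
    return written
```

Calling `split_by_year("ADC_arable_land_markers_autumn.tif")` would then be expected to produce one file per year (`..._2018.tif` to `..._2022.tif`) that can be zipped for download. If the year tags are stored differently in the actual files, `years_from_tags` is the part to adapt.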
@vittekm @misev Maybe better to use slightly more descriptive names for the bands, than just 'autumn' and 'spring'? I think for arable field markers these represent categorical states (bare, green, unknown?). While for grasslands it indicates usage intensity in Spring, on a scale of [0.0 - 1.0]? But please double check.
@robknapen absolutely agreed, these details should be captured as completely as possible in the metadata entry in the catalog, describing what the pixel values represent for each band.
@misev @robknapen A more descriptive name could be e.g. field conditions. But then bare, green and unknown are the actual pixel values. These should indeed be filled in in the metadata entry. I wanted to include them in the description, but it's good to have designated fields for them in the catalog. The following should then be entered (in English), plus the value ranges of the continuous variables:
- bouwland_markers_najaar 1.tif Categories:(1='onbekend' (unknown), 2='winter groen' (winter green) and 3='winter kaal' (winter bare)), NoData=0, TimeExt=2018-2022
- bouwland_markers_no_ndvi_img 1.tif NoData=-1, TimeExt=2018-2022
- bouwland_markers_voorjaar 1.tif Categories:(1='onbekend' (unknown), 2='winter groen' (winter green) and 3='winter kaal' (winter bare)), NoData=0, TimeExt=2018-2022
- crop_rotation_index 3.tif NoData=-1, TimeExt=2014-2022
- gewaspercelen_crop_code.csv
- gewaspercelen_crop_code.tif Categories:(file above), NoData=-1, TimeExt=2014-2022
- gewaspercelen_fieldid.tif NoData=-1, TimeExt=2014-2022
- gewaspercelen_grondgebruik.tif Categories:(1='Bouwland' (arable land), 2='Braakland' (fallow land), 3='Grasland' (grassland), 4='Natuurterrein' (nature area) and 5='Overige' (other)), NoData=-1, TimeExt=2014-2022
- grasland_markers_ndvi_voorjaar 1.tif NoData=-1, TimeExt=2018-2022
- grasland_markers_no_maai 1.tif NoData=-1, TimeExt=2018-2022
- grasland_markers_no_ndvi_img 1.tif NoData=-1, TimeExt=2018-2022
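To illustrate what such a metadata entry lets a user do, here is a small sketch that decodes pixel values into labels. The pixel codes and NoData values are taken from the list above. The English labels are my own translations of the Dutch category names, and the dictionary and function names are hypothetical, not part of the catalog schema.

```python
# Pixel code -> English label (translated from the Dutch category names).
ARABLE_LAND_CONDITION = {1: "unknown", 2: "winter green", 3: "winter bare"}
LAND_USE = {
    1: "arable land",
    2: "fallow land",
    3: "grassland",
    4: "nature area",
    5: "other",
}


def label(value, categories, nodata):
    """Return the label for a pixel value, or None for NoData pixels."""
    if value == nodata:
        return None
    if value not in categories:
        raise ValueError(f"pixel value {value} is not a documented category")
    return categories[value]


# The arable land marker files use NoData=0, the land use file NoData=-1.
print(label(3, ARABLE_LAND_CONDITION, nodata=0))
print(label(4, LAND_USE, nodata=-1))
```

Raising on undocumented values rather than silently passing them through is a design choice here: it makes gaps in the metadata visible early.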
@vittekm yes sounds good, I'd suggest to just go ahead and update the data request in the catalog editor once you make the data available for download.
@vittekm to me it would be fine (for now) to leave out these two datasets:
They contain the number of NDVI "images" that were used to derive the field markers. That can give an expert who knows how the data is derived a rough estimate of the quality. To make it easier for non-expert users of the data, I think we should pre-process it ourselves (make it more analysis ready) with some simple rules, e.g. if (no_ndvi_img < 2) then arable_land_autumn_condition = "unknown". The threshold value (e.g. 2) can be estimated by looking at the complete dataset; I don't know the actual range of these variables. It is easier if we do this once than if every user has to figure it out (which probably nobody will put effort into).
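A rule like this could be sketched as below. This is only an illustration under stated assumptions: the threshold of 2 is the example value from the comment and has not been checked against the data, the category code 1 ('onbekend', unknown) and NoData=0 come from the metadata listed earlier in this thread, and plain lists stand in for the raster arrays. Treating a NoData image count (-1) as "too few images" is also an assumption.

```python
UNKNOWN = 1        # 'onbekend' in the arable land marker categories
MARKER_NODATA = 0  # NoData value of the arable land marker files
MIN_IMAGES = 2     # example threshold only; to be estimated from the full dataset


def apply_image_count_rule(markers, image_counts, min_images=MIN_IMAGES):
    """Set a marker to UNKNOWN where too few NDVI images were available.

    NoData markers are left untouched. An image count of -1 (NoData) is
    below any sensible threshold, so those pixels also become UNKNOWN.
    """
    if len(markers) != len(image_counts):
        raise ValueError("markers and image_counts must have the same length")
    cleaned = []
    for marker, count in zip(markers, image_counts):
        if marker != MARKER_NODATA and count < min_images:
            cleaned.append(UNKNOWN)
        else:
            cleaned.append(marker)
    return cleaned


# green with 5 images, bare with 1 image, NoData, bare with no count
print(apply_image_count_rule([2, 3, 0, 3], [5, 1, 0, -1]))  # [2, 1, 0, 1]
```

On real rasters the same per-pixel rule would normally be applied to whole arrays at once rather than in a Python loop.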
{"filename": "ADC_arable_land_markers_autumn/ADC_arable_land_markers_autumn.json", "item_type": "stac_dist", "change_type": "Update", "user": "FAiRICUBE", "data_owner": true}