eurodatacube / eodash

Software behind the RACE dashboard by ESA and the European Commission (https://race.esa.int), the Green Transition Information Factory - GTIF (https://gtif.esa.int), as well as the Earth Observing Dashboard by NASA, ESA, and JAXA (https://eodashboard.org)
MIT License

New Collection: DLR-WSF datasets #2321

Open AlessandroScremin opened 12 months ago

AlessandroScremin commented 12 months ago

We need to ingest the new DLR WSF datasets into EDC. There are two datasets: WSF 2019 and WSF Evolution.

They can be downloaded from the following website:

Tasks:

- Verify the possibility of a STAC API, or a simplified way to download the datasets.
- Verify whether they are already in COG format or need some processing.
- Ingest them into EDC as 2 new distinct collections.

Currently the only way to download them is tile by tile.

AlessandroScremin commented 12 months ago

@aapopescu @santilland cc: @dmoglioni

I was able to create a list of URLs for all the tiles, so with the wget command I can iterate through the lines and download each tile, as sketched below. DOWNLOAD ONGOING
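For reference, the loop is essentially the following (a minimal sketch; urls.txt and the tiles/ output folder are hypothetical names):

import subprocess

# Read the generated list of tile URLs (one per line) and fetch each with wget
with open("urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    # -nc skips tiles that were already downloaded, so the loop can be resumed
    subprocess.run(["wget", "-nc", "-P", "tiles/", url], check=True)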

I verified with a few tools and they already seem to be in COG format (see the check below). I need to test one ingestion.
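A quick per-file check (a sketch; gdalinfo prints LAYOUT=COG under Image Structure Metadata for a cloud optimized GeoTIFF, and the file name here is only an example):

import subprocess

def looks_like_cog(path: str) -> bool:
    # COG files report "LAYOUT=COG" in the Image Structure Metadata section
    info = subprocess.run(["gdalinfo", path], capture_output=True, text=True, check=True)
    return "LAYOUT=COG" in info.stdout

print(looks_like_cog("tiles/WSF2019_v1_10_42.tif"))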

@lubojr @santilland If I ingest them as single tiles in EDC (meaning one image per tile) with the same date 2019-01-01 for all the tiles, are you then able to extract the statistics by computing stats on adjacent tiles?

Alessandro

santilland commented 11 months ago

@AlessandroScremin I think we have had another dataset that was composed of multiple files for a specific timestamp, do you know if this is the case? And if yes, which layer was it? I think SH builds some sort of time datacube? In any case it would for sure be good to check before everything is ingested, so maybe we could start with 2 tiles for 2 dates, i.e. 4 files in total. What do you think?

AlessandroScremin commented 11 months ago

@santilland cc: @aapopescu

the other dataset is:

ID: VIS_TRUCK_DETECTION_MOTORWAYS_NEW

Anyway, the WSF is a global single-date layer (split into multiple tiles), and it is a binary mask (0 or 255), so there is no need for timeseries or statistics.

I think it is much simpler than expected.

I was also able to download all the tiles from the website...so I have already ingested all of them. I will generate the layer for you in Eodashboard Active. As soon as I'm done I will give you the details of the IDs.

AlessandroScremin commented 11 months ago

@santilland @lubojr cc: @aapopescu

I have added the layer for the DLR-WSF 2019.

Layer ID: DLR-WSF-2019 (3f9e2b45-e100-41e2-9c96-bb1b4f5e7ece)

evalscript:

//VERSION=3
function setup() {
  return {
    input: [{
      bands: ["WSF2019", "dataMask"], // this sets which bands to use
    }],
    output: { // this defines the output image type
      bands: 4,
      sampleType: "UINT8"
    }
  };
}

function evaluatePixel(sample) {
  var arr = colorBlend(sample.WSF2019, [255], [[0, 0, 0]]);
  if (sample.dataMask == 1) arr.push(255);
  else arr.push(0);
  return arr;
}

AlessandroScremin commented 11 months ago

@aapopescu @santilland @lubojr

We have a few points to discuss about the WSF EVO.

The files are structured as follows:

1) each tile has pixel values that range from 1985 to 2015 in steps of 5; each pixel value corresponds to a specific date of the timeseries.
2) the data type is INT32, which is not supported in SH (it needs to be converted to FLOAT32 or UINT16; see the conversion sketch at the end of this comment).

So what would be the best approach on your side for point 1) for the visualisation in the dashboard? I see two options:

1) reprocess the dataset into one boolean mask raster per year (roughly 30 rasters), each ingested with its own sensing date;
2) ingest the data as-is (a single layer whose pixel values encode the year) and let the dashboard filter pixels by year value.

Since the processing for option 1 could take quite long, I would like to discuss with you which is worth the most.
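For the point 2) conversion, a minimal sketch with GDAL's Python bindings (the file names are examples; it assumes all pixel values fit into UINT16, which holds here since the pixels are years or 0):

from osgeo import gdal

# Convert an INT32 source tile to UINT16 so it can be ingested into SH
gdal.Translate(
    "WSFevolution_v1_12_40_uint16.tif",
    "WSFevolution_v1_12_40.tif",
    outputType=gdal.GDT_UInt16,
)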

lubojr commented 11 months ago

@AlessandroScremin Thank you very much for the clear summary. I understand the time limitations of option 1, but from a usage perspective it would give possible users more flexibility in terms of analysis: the dataset would turn into 30 boolean mask rasters that a user could utilize in EDC directly. Regarding the actual usage inside the dashboard for the custom area indicator, there we could see "built-up progression" on a time axis and potentially unlock some new insights when combining this with other datasets.

Potentially, we could implement the possibility to fetch more than one SH custom area indicator and use both of them on a chart with 2 different y axes, but this would be an enhancement for the future.

AlessandroScremin commented 11 months ago

@lubojr @danielfdsilva cc: @aapopescu

as discussed with Federico, here is a sample of the WSF EVO layer.

I added the following 2 layers:
- Visualisation: very simplified, with only 3-4 years in colors.
- Raw: the original pixel values.

DLR-WSF-EVO | ID: DLR-WSF-EVO

Then let me know which one is not needed, if relevant...

Both layers point to collection: db1ffd5a-9521-4679-80e8-9c92bd1782eb

This layer is a small subdataset...only 4 tiles...the rest will follow if the workflow works.

lubojr commented 11 months ago

@AlessandroScremin Perfect, thanks! We shall try to integrate yearly binning to create a "histogram" with bins for each year. This means we can create a plot where we see, for an area, how many pixels were added per year, i.e. a trend of construction in a region; see the sketch below. I will let you know in this issue whether to proceed with full ingestion.
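If the SH Statistical API is used for this, the yearly binning could plausibly be expressed in the request's calculations section (a sketch only; the one-year bins over 1985-2015 are an assumption based on the dataset's value range, and the exact schema should be checked against the Statistical API docs):

# "calculations" fragment of a Sentinel Hub Statistical API request body:
# one-year-wide histogram bins, one per year of the WSF EVO range
calculations = {
    "default": {
        "histograms": {
            "default": {
                "binWidth": 1,
                "lowEdge": 1985,
                "highEdge": 2016,  # exclusive upper edge, so 2015 is still included
            }
        }
    }
}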

lubojr commented 11 months ago

@AlessandroScremin Sorry to come back to this, but in order to test the integration I would need the evalscript used for either of the two layers (I need to get the name of the band used in the WSF-EVO layer).

For example in DLR-WSF-2019, it was WSF2019.

AlessandroScremin commented 11 months ago

Then, I prepared two simple evalscripts: a very simple visualisation and a raw one.


//VERSION=3
function setup() {
  return {
    input: [{
      bands: ["WSFevolution", "dataMask"], // this sets which bands to use
    }],
    output: { // this defines the output image type
      bands: 4,
      sampleType: "UINT8"
    }
  };
}

function evaluatePixel(sample) {
  var arr = colorBlend(
    sample.WSFevolution,
    [0, 1989, 1999, 2005, 2015],
    [[20, 7, 138, 0], [124, 8, 169], [203, 71, 120], [246, 154, 67], [251, 250, 144]]
  );
  if (sample.dataMask == 1) arr.push(255);
  else arr.push(0);
  return arr;
}



//VERSION=3
function setup() {
  return {
    input: ["WSFevolution", "dataMask"],
    output: {
      bands: 1,
      sampleType: "UINT16"
    }
  };
}

function evaluatePixel(sample) {
  return sample.WSFevolution;
}

lubojr commented 11 months ago

@AlessandroScremin Sorry to come back to this so late, but just to double check: will the final version of the dataset be done as option 1), i.e. with the sensing_time of each TILE being the year for which the pixel values of the original EVO dataset have that single value? If yes, it should be fine.

The current 6 tiles that you have prepared only have one sensing time (2015), https://radiantearth.github.io/stac-browser/#/external/eodashcatalog.eox.at/wsf-evo-test/RACE/world_settlement_footprint_evolution/collection.json : Temporal Extent 1/1/2015, 12:00:00 AM UTC, so we cannot try out the custom indicator integration. If we want to be 100% sure that the integration works, and to prepare a mockup branch, we would need you to ingest at least 1 tile with another sensing year (currently the generated chart would not have anything in it).

AlessandroScremin commented 11 months ago

@lubojr

No! Option 1 was the one we would prefer to avoid...it would take quite long, as we would need to reprocess and change the original data (it is not only a matter of COG conversion but also of extracting pixel values and creating new images for each year and for each tile), and it also consumes a lot of disk volume.

The solution proposed and discussed at the meeting was to give you the dataset as it is in Radiant Earth (one single file with all pixels associated to a year), leaving it to you to filter the pixels by year value. The idea was to create a chart with the year on the x axis and the number of pixels on the y axis. From the original image you should look for the values corresponding to the different years and count the pixels with each year, in order to show the evolution of built-up areas, as in the sketch below. I think it would be better to also hear Sinergise's opinion on that...
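For a single original tile, the counting described above could look like this (a sketch using rasterio and numpy; the file name is an example):

import numpy as np
import rasterio

with rasterio.open("WSFevolution_v1_12_40_COG.tif") as src:
    values = src.read(1)

# Pixel value == year of first detected settlement (0 = no data), so counting
# pixels per value gives the "pixels added per year" series for the chart
years, counts = np.unique(values[values > 0], return_counts=True)
for year, count in zip(years, counts):
    print(int(year), int(count))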

lubojr commented 11 months ago

All clear, sorry I overlooked this comment before the meeting. We shall try to create a feature branch based on the currently available data.

santilland commented 7 months ago

@AlessandroScremin experimenting with the statistical api, I think I will need the exact x and y resolution for the currently used projection. If I am not mistaken, I need (if we are using 4326) the longitude and latitude "step size" for one pixel; see the request sketch below. Can you somehow extract or see this when loading the data in sentinelhub?
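For context, this is roughly where those step sizes would go in a Statistical API request (a sketch of the aggregation fragment; the resx/resy values are placeholders until the exact pixel size is confirmed):

# "aggregation" fragment of a Sentinel Hub Statistical API request body;
# with an EPSG:4326 bounding box, resx/resy are degrees per output pixel
aggregation = {
    "timeRange": {"from": "2015-01-01T00:00:00Z", "to": "2015-01-02T00:00:00Z"},
    "aggregationInterval": {"of": "P1D"},
    "resx": 0.00027,  # longitude step of one pixel (placeholder)
    "resy": 0.00027,  # latitude step of one pixel (placeholder)
    "evalscript": "...",  # the raw UINT16 evalscript
}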

AlessandroScremin commented 7 months ago

The WSF EVO Pixel Size = (0.000269494585236, -0.000269494585236), EPSG 4326 (should be around 30x30 m as reported at the source, based on Landsat 5-7).

WSF2019 Pixel Size = (0.000089831528412, -0.000089831528412), EPSG 4326 (should be around 10x10 m, based on S1+S2).
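As a sanity check, converting those degree steps to meters (using roughly 111.32 km per degree of latitude):

# approximate ground resolution implied by the EPSG:4326 pixel sizes
M_PER_DEG = 111_320  # meters per degree (approximation at the equator)

for name, step in [("WSF EVO", 0.000269494585236), ("WSF2019", 0.000089831528412)]:
    print(f"{name}: ~{step * M_PER_DEG:.1f} m per pixel")  # ~30.0 m and ~10.0 m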

santilland commented 7 months ago

@AlessandroScremin thank you for the info. I am getting quite strange behavior from the sentinelhub statistical api; could you send me or link me to the original tile that covers Rome? I need to be able to compare the original values to the ones provided by SH.

AlessandroScremin commented 7 months ago


@santilland

WSF-EVO:
DLR-WSF-EVO/(BAND)_v1_10_42_COG.tif
DLR-WSF-EVO/(BAND)_v1_12_42_COG.tif

(BAND) = WSFevolution

WSF2019:
DLR-WSF-2019/(BAND)_v1_10_42.tif
DLR-WSF-2019/(BAND)_v1_12_42.tif

santilland commented 7 months ago

@AlessandroScremin did you create overviews (convert to COG) before ingesting the data into SH? If yes, what resampling method did you apply? I think the issue is that something other than nearest neighbor was used, which might explain why there are some strange values in the data; see the sketch below for a nearest-neighbor regeneration.
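For reference, regenerating a tile with nearest-neighbor overviews could look like this (a sketch using GDAL's COG driver, available since GDAL 3.1; file names are examples):

from osgeo import gdal

# Rewrite the tile as a COG, forcing nearest-neighbor resampling for the
# overviews so downsampled pixels keep valid year values instead of blends
gdal.Translate(
    "WSFevolution_v1_12_40_COG.tif",
    "WSFevolution_v1_12_40.tif",
    format="COG",
    creationOptions=["RESAMPLING=NEAREST", "COMPRESS=DEFLATE"],
)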

AlessandroScremin commented 6 months ago

@santilland @lubojr

Could you please check the statistics API again and let us know if the values are normal now? As said, we ingested a few reprocessed tiles...

Let us know ASAP.

Alessandro

santilland commented 6 months ago

@AlessandroScremin I am not getting any statistical data back; could you confirm what the latest data is and what area it covers? I tried to request db1ffd5a-9521-4679-80e8-9c92bd1782eb with the band WSFEvolution in an area more or less over Rome. This was providing the strange data before; now I get an empty response.

AlessandroScremin commented 6 months ago

@santilland I'm a bit confused...the data are in SH, gdalinfo shows images with values...and looking at the SH EO-Browser...I got this...

[screenshot: the layer rendered in SH EO-Browser]

so honestly I don't know what I can do...I simply reingested the new tiles in place of the old ones...so it should work...

gdal reports the values correctly:

gdalinfo /home/98373457-4e7e-4c85-8bb2-b806251a16de/.s3_2/DLR-WSF-EVO/WSFevolution_v1_12_40_COG.tif -mm
Driver: GTiff/GeoTIFF
Files: /home/98373457-4e7e-4c85-8bb2-b806251a16de/.s3_2/DLR-WSF-EVO/WSFevolution_v1_12_40_COG.tif
Size is 7497, 7497
Coordinate System is:
GEOGCRS["WGS 84",
    ENSEMBLE["World Geodetic System 1984 ensemble",
        MEMBER["World Geodetic System 1984 (Transit)"],
        MEMBER["World Geodetic System 1984 (G730)"],
        MEMBER["World Geodetic System 1984 (G873)"],
        MEMBER["World Geodetic System 1984 (G1150)"],
        MEMBER["World Geodetic System 1984 (G1674)"],
        MEMBER["World Geodetic System 1984 (G1762)"],
        MEMBER["World Geodetic System 1984 (G2139)"],
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]],
        ENSEMBLEACCURACY[2.0]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    USAGE[
        SCOPE["Horizontal component of 3D system."],
        AREA["World."],
        BBOX[-90,-180,90,180]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (11.989814097143254,42.010163419491484)
Pixel Size = (0.000269494585236,-0.000269494585236)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  COMPRESSION=DEFLATE
  INTERLEAVE=BAND
  LAYOUT=COG
Corner Coordinates:
Upper Left  (  11.9898141,  42.0101634) ( 11d59'23.33"E, 42d 0'36.59"N)
Lower Left  (  11.9898141,  39.9897625) ( 11d59'23.33"E, 39d59'23.15"N)
Upper Right (  14.0102150,  42.0101634) ( 14d 0'36.77"E, 42d 0'36.59"N)
Lower Right (  14.0102150,  39.9897625) ( 14d 0'36.77"E, 39d59'23.15"N)
Center      (  13.0000145,  40.9999630) ( 13d 0' 0.05"E, 40d59'59.87"N)
Band 1 Block=512x512 Type=UInt16, ColorInterp=Gray
    Computed Min/Max=1985.000,2015.000
  NoData Value=0
  Overviews: 3749x3749, 1875x1875, 938x938, 469x469, 235x235

santilland commented 6 months ago

@AlessandroScremin, I am also not sure what could be the issue for the newly updated data. Did the BYOC collection identifier and the band name stay the same?

AlessandroScremin commented 6 months ago

Yes, everything is the same...


santilland commented 6 months ago

So, not sure why I was getting an empty answer yesterday; I retried and today it worked. It seems that the response works as expected for the new dataset, so I think you are good to go with the ingestion.

Here is an example response (just as a comment: 1985 has the most results because it covers anything built up to 1985):

{
    "data": [
        {
            "interval": {
                "from": "2015-01-01T00:00:00Z",
                "to": "2015-01-02T00:00:00Z"
            },
            "outputs": {
                "data": {
                    "bands": {
                        "B0": {
                            "stats": {
                                "min": 1985.0,
                                "max": 2015.0,
                                "mean": 1991.2393364928892,
                                "stDev": 9.412458595443752,
                                "sampleCount": 2880,
                                "noDataCount": 2036
                            },
                            "histogram": {
                                "bins": [
                                    {
                                        "lowEdge": 1985,
                                        "highEdge": 1986,
                                        "count": 504
                                    },
                                    {
                                        "lowEdge": 1986,
                                        "highEdge": 1987,
                                        "count": 5
                                    },
                                    {
                                        "lowEdge": 1987,
                                        "highEdge": 1988,
                                        "count": 12
                                    },
                                    {
                                        "lowEdge": 1988,
                                        "highEdge": 1989,
                                        "count": 13
                                    },
                                    {
                                        "lowEdge": 1989,
                                        "highEdge": 1990,
                                        "count": 13
                                    },
                                    {
                                        "lowEdge": 1990,
                                        "highEdge": 1991,
                                        "count": 19
                                    },
                                    {
                                        "lowEdge": 1991,
                                        "highEdge": 1992,
                                        "count": 10
                                    },
                                    {
                                        "lowEdge": 1992,
                                        "highEdge": 1993,
                                        "count": 18
                                    },
                                    {
                                        "lowEdge": 1993,
                                        "highEdge": 1994,
                                        "count": 15
                                    },
                                    {
                                        "lowEdge": 1994,
                                        "highEdge": 1995,
                                        "count": 1
                                    },
                                    {
                                        "lowEdge": 1995,
                                        "highEdge": 1996,
                                        "count": 12
                                    },
                                    {
                                        "lowEdge": 1996,
                                        "highEdge": 1997,
                                        "count": 10
                                    },
                                    {
                                        "lowEdge": 1997,
                                        "highEdge": 1998,
                                        "count": 9
                                    },
                                    {
                                        "lowEdge": 1998,
                                        "highEdge": 1999,
                                        "count": 15
                                    },
                                    {
                                        "lowEdge": 1999,
                                        "highEdge": 2000,
                                        "count": 6
                                    },
                                    {
                                        "lowEdge": 2000,
                                        "highEdge": 2001,
                                        "count": 10
                                    },
                                    {
                                        "lowEdge": 2001,
                                        "highEdge": 2002,
                                        "count": 16
                                    },
                                    {
                                        "lowEdge": 2002,
                                        "highEdge": 2003,
                                        "count": 7
                                    },
                                    {
                                        "lowEdge": 2003,
                                        "highEdge": 2004,
                                        "count": 11
                                    },
                                    {
                                        "lowEdge": 2004,
                                        "highEdge": 2005,
                                        "count": 11
                                    },
                                    {
                                        "lowEdge": 2005,
                                        "highEdge": 2006,
                                        "count": 10
                                    },
                                    {
                                        "lowEdge": 2006,
                                        "highEdge": 2007,
                                        "count": 13
                                    },
                                    {
                                        "lowEdge": 2007,
                                        "highEdge": 2008,
                                        "count": 12
                                    },
                                    {
                                        "lowEdge": 2008,
                                        "highEdge": 2009,
                                        "count": 6
                                    },
                                    {
                                        "lowEdge": 2009,
                                        "highEdge": 2010,
                                        "count": 13
                                    },
                                    {
                                        "lowEdge": 2010,
                                        "highEdge": 2011,
                                        "count": 10
                                    },
                                    {
                                        "lowEdge": 2011,
                                        "highEdge": 2012,
                                        "count": 15
                                    },
                                    {
                                        "lowEdge": 2012,
                                        "highEdge": 2013,
                                        "count": 12
                                    },
                                    {
                                        "lowEdge": 2013,
                                        "highEdge": 2014,
                                        "count": 14
                                    },
                                    {
                                        "lowEdge": 2014,
                                        "highEdge": 2015,
                                        "count": 22
                                    }
                                ],
                                "overflowCount": 0,
                                "underflowCount": 0
                            }
                        }
                    }
                }
            }
        }
    ],
    "status": "OK"
}
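On the dashboard side those bins map directly onto the planned chart (a sketch; the response file name is an example):

import json

# Load the Statistical API response shown above
with open("stats_response.json") as f:
    resp = json.load(f)

bins = resp["data"][0]["outputs"]["data"]["bands"]["B0"]["histogram"]["bins"]

# Each bin's lowEdge is a year and its count the pixels first built up in that
# year; a running sum gives the total built-up extent over time
total = 0
for b in bins:
    total += b["count"]
    print(b["lowEdge"], b["count"], total)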

santilland commented 6 months ago

New datasets info:

ID: DLR-WSF-2019 - 3f9e2b45-e100-41e2-9c96-bb1b4f5e7ece

//VERSION=3
function setup() {
  return {
    input: [{
      bands: ["WSF2019", "dataMask"], // this sets which bands to use
    }],
    output: { // this defines the output image type
      bands: 4,
      sampleType: "UINT8"
    }
  };
}

function evaluatePixel(sample) {
  var arr = colorBlend(sample.WSF2019, [255], [[0, 0, 0, 0]]);
  if (sample.dataMask == 1) arr.push(255);
  else arr.push(0);
  return arr;
}

ID: DLR-WSF-EVO-1985-2015 - 5bc188f0-97d0-4db1-aa7f-4f6ba4baf80b

//VERSION=3
function setup() {
  return {
    input: [{
      bands: ["WSFevolution", "dataMask"], // this sets which bands to use
    }],
    output: { // this defines the output image type
      bands: 4,
      sampleType: "UINT8"
    }
  };
}

function evaluatePixel(sample) {
  var arr = colorBlend(
    sample.WSFevolution,
    [0, 1989, 1999, 2005, 2015],
    [[20, 7, 138, 0], [124, 8, 169], [203, 71, 120], [246, 154, 67], [251, 250, 144]]
  );
  if (sample.dataMask == 1) arr.push(255);
  else arr.push(0);
  return arr;
}

ID: RAW-DLR-WSF-EVO-1985-2015 - 5bc188f0-97d0-4db1-aa7f-4f6ba4baf80b

//VERSION=3
function setup() {
  return {
    input: ["WSFevolution", "dataMask"],
    output: {
      bands: 1,
      sampleType: "UINT16"
    }
  };
}

function evaluatePixel(sample) {
  return sample.WSFevolution;
}

santilland commented 6 months ago

@santilland @lubojr have a look at the integration, it should be ready to go.

aapopescu commented 1 month ago

Hi @lubojr, any info missing here?

lubojr commented 1 month ago

No, we are not missing any info; it is on our end. We need to implement the different handling of the chart for this dataset.