Closed (mingardiluca closed this issue 1 year ago)
Hi @mingardiluca
Your question would probably best be asked at the GEE forums, but I'll have a look and try to answer in the next few days.
I've tried rerunning this using sh-py (using the bbox from your tiffs and requesting dates 2022-09-21..2022-09-22), and I get this:
Comparing your tiff data with data downloaded from SH, I see your data is skewed (see below). Did you perhaps download data using some scaling factor?
I'm attaching a notebook I've used; hope it helps. s2cloudless_gee.ipynb.zip
Just to add: if the bands that are input to s2cloudless are scaled, then your results are expected. The input to s2cloudless should be raw data from Sentinel-2 L1C.
Hi @batic! First of all, thank you very much for your quick answer.
I've been investigating what you highlighted by inspecting how I get the data at the beginning of my pipeline. I don't get the data from SentinelHubInputTask; I retrieve it from Google Cloud (in this case, here). I found online that after January 25th 2022 the bands have been shifted by 1000 (hence your plot of distributions), according to this and this: "After 2022-01-25, Sentinel-2 scenes with PROCESSING_BASELINE '04.00' or above have their DN (value) range shifted by 1000. The HARMONIZED collection shifts data in newer scenes to be in the same range as in older scenes." Apparently this shift is taken care of directly by SentinelHub and GEE, but I have to apply it manually in order to get the same results as you (image below). This is my output when subtracting 1000 from each band:
This is a screenshot from the notebook you sent over
The results are very similar, and in line with what I saw in GEE. Does it make sense to proceed as I explained (for data acquired after 25-01-2022, subtract 1000 from each band), or is something additional needed?
Thank you very much
The s2cloudless model was trained on data from before ESA's change in processing baseline, so the correct way is to make the input to the model exactly as it was before. The Sentinel-Hub service does that for you (see also this post). I am not sure how you need to apply this in GEE, but your thinking is correct.
In addition (again, see the post above), you should clamp the negative values to 0, as the model never saw negative values as input.
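A minimal sketch of the two steps discussed above (subtracting the 1000 offset for newer baselines, then clamping negatives to 0), using only NumPy. The function and constant names are hypothetical, and it assumes the baseline is available as a string like '04.00' taken from the product metadata; it is not the code Sentinel-Hub or GEE actually run.

```python
import numpy as np

# Assumption: scenes with PROCESSING_BASELINE '04.00' or above carry the
# +1000 DN shift, as stated in the quote earlier in this thread.
BASELINE_WITH_OFFSET = (4, 0)
DN_OFFSET = 1000


def needs_offset_removal(processing_baseline: str) -> bool:
    """Return True if the baseline string (e.g. '04.00') is 04.00 or newer."""
    major, minor = (int(part) for part in processing_baseline.split("."))
    return (major, minor) >= BASELINE_WITH_OFFSET


def harmonize_l1c(dn: np.ndarray, processing_baseline: str) -> np.ndarray:
    """Bring L1C digital numbers back to the pre-04.00 value range."""
    # Cast to a signed type so the subtraction cannot wrap around on uint16.
    dn = dn.astype(np.int32)
    if needs_offset_removal(processing_baseline):
        dn = dn - DN_OFFSET
    # The model never saw negative values, so clamp them to 0.
    return np.clip(dn, 0, None)


if __name__ == "__main__":
    bands = np.array([[900, 1000, 1500]], dtype=np.uint16)
    print(harmonize_l1c(bands, "04.00"))
    print(harmonize_l1c(bands, "03.01"))
```

The harmonized array would then go through whatever preparation the s2cloudless call already used for older scenes; this step only restores the old value range.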
If that answers your questions, please close the issue.
I am trying to compute the cloud probability of a territory in Ivory Coast in Python using s2cloudless, and the result I get differs from the one produced by the snippet of code found here. The only modifications that I made to the above script are:

    var START_DATE = ee.Date('2022-09-21');
    var END_DATE = ee.Date('2022-09-22');  // The date of interest is September 21st 2022
    var MAX_CLOUD_PROBABILITY = 101;  // this is to show the entirety of the original image (clouds included)
    var region = ee.Geometry.Rectangle({coords: [-4.310415, 6.037586, -4.211883, 6.136218], geodesic: false});  // Area of interest
    Map.setCenter(-4.25, 6.1, 12);

In Python, basically the entire image is predicted to have a cloud probability of about 100%; in GEE, the probability is much lower for some areas of the image. My understanding is that GEE uses the s2cloudless algorithm for the probabilities in 'COPERNICUS/S2_CLOUD_PROBABILITY', so I don't understand why the result would be different when computed in Python.
Thank you
Files in the zip file:
tiffs: the folder containing the tiffs that I downloaded in Python from the Sentinel API
code.py: a snippet of code to replicate my results
python.png: the image resulting from the tiffs, collected in Python
GEE_thresh_101.png: a screenshot of GEE when MAX_CLOUD_PROBABILITY is set to 101
GEE_thresh_30.png: a screenshot of GEE when MAX_CLOUD_PROBABILITY is set to 30 (you can see that some parts of the image haven't been removed, meaning that the cloud probability there is less than 30%)
image.pdf: shows the Python image on the left and the cloud probability, computed in Python, on the right
experiment.zip