KappaMask, or km-predict, is a cloud detector developed by KappaZeta LTD for Sentinel-2 Level-1C and Level-2A input products. The project was funded by the European Space Agency, Contract No. 4000132124/20/I-DT.
Currently, KappaMask outputs results in TIFF and PNG formats. Each pixel is classified as one of the following classes:

Class | TIFF | PNG | Description |
---|---|---|---|
Clear | 1 | 66 | Pixels without clouds or cloud shadows. |
Cloud shadow | 2 | 129 | Pixels with cloud shadows. |
Semi-transparent | 3 | 192 | Pixels with thin clouds through which the land is visible; includes cirrus clouds at the high cloud level (5-15 km). |
Cloud | 4 | 255 | Pixels with cloud; includes stratus and cumulus clouds at the low cloud level (from 0-0.2 km to 2 km). |
Missing | 5 | 20 | Missing or invalid pixels. |
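
As an illustration of the encoding above, the following minimal sketch maps TIFF class values to their PNG counterparts and class names. The names CLASSES, tiff_to_png and class_fractions are hypothetical helpers for this example and are not part of KappaMask; the input is assumed to be a flat sequence of TIFF class values that has already been read from a mask.

```python
from collections import Counter

# Class table from this README: TIFF value -> (class name, PNG value).
CLASSES = {
    1: ("Clear", 66),
    2: ("Cloud shadow", 129),
    3: ("Semi-transparent", 192),
    4: ("Cloud", 255),
    5: ("Missing", 20),
}


def tiff_to_png(values):
    """Translate TIFF class values to the corresponding PNG values."""
    return [CLASSES[v][1] for v in values]


def class_fractions(values):
    """Return the fraction of pixels per class name."""
    counts = Counter(values)
    total = sum(counts.values())
    return {CLASSES[v][0]: n / total for v, n in counts.items()}


if __name__ == "__main__":
    pixels = [1, 1, 4, 3, 2, 5, 4, 4]  # hypothetical mask values
    print(tiff_to_png(pixels))
    print(class_fractions(pixels))
```

A value outside the table raises a KeyError, which is deliberate here: an unexpected class value most likely means the file is not a KappaMask output.
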
KappaMask has been trained and validated with the following dataset:
Related publications:
The following system dependencies are needed:
Due to the long environment solve times with Miniconda, we have switched to Micromamba. If you're still using Conda, Miniconda or similar, simply substitute micromamba with conda in the relevant commands below.
Create a micromamba environment.
micromamba create -f environment.yml
Copy config/config_example.json and adapt it to your needs.
To run the sub-tiling procedure, cm_vsm must be installed (https://github.com/kappazeta/cm-vsm).
Make sure that your GDAL_DATA environment variable is set, replacing the placeholder YOUR_GDAL_VERSION below with your GDAL version:
GDAL_DATA=/usr/share/gdal/YOUR_GDAL_VERSION
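
A quick way to verify this setting before running the predictor is sketched below. check_gdal_data is a hypothetical helper, not part of KappaMask; it only checks that the variable is set and points to an existing directory, and does not validate the contents of that directory.

```python
import os
from pathlib import Path


def check_gdal_data(environ=None):
    """Return the GDAL_DATA directory, or raise if it is unset or missing."""
    environ = os.environ if environ is None else environ
    value = environ.get("GDAL_DATA")
    if not value:
        raise RuntimeError("GDAL_DATA is not set")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(f"GDAL_DATA points to a missing directory: {path}")
    return path


if __name__ == "__main__":
    try:
        print("GDAL_DATA:", check_gdal_data())
    except RuntimeError as err:
        print("Problem:", err)
```
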
In the root of the repository, create a /data folder and copy or symlink the .SAFE product into it.
Cloudmask inference can be run as follows:
micromamba activate km_predict
python km_predict.py -c config/your_config.json
The product_name in the config file can be overridden with the command-line argument -product:
python km_predict.py -c config/your_config.json -product S2B_MSIL2A_20200401T093029_N0214_R136_T34UFA_20200401T122148
If the prediction is run multiple times for the same product and the .CVAT folder has already been created under the /data folder, it may be convenient to disable the sub-tiling procedure for the next run with -t:
python km_predict.py -c config/your_config.json -product S2B_MSIL2A_20200401T093029_N0214_R136_T34UFA_20200401T122148 -t
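
When inference is driven from another script, the invocations above can be assembled programmatically. The sketch below uses only the flags shown in this README (-c, -product, -t); build_command is a hypothetical helper, and the code assumes that it is run from the root of the repository inside the activated km_predict environment.

```python
import subprocess
import sys


def build_command(config, product=None, skip_subtiling=False):
    """Assemble the km_predict.py command line as a list of arguments."""
    cmd = [sys.executable, "km_predict.py", "-c", config]
    if product is not None:
        cmd += ["-product", product]  # overrides product_name from the config
    if skip_subtiling:
        cmd.append("-t")  # disable sub-tiling, e.g. when .CVAT already exists
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "config/your_config.json",
        product="S2B_MSIL2A_20200401T093029_N0214_R136_T34UFA_20200401T122148",
        skip_subtiling=True,
    )
    print(" ".join(cmd))
    # Uncomment to actually run the predictor from the repository root:
    # subprocess.run(cmd, check=True)
```
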
KappaMask tests can be run from the root of the working copy of the repository as follows:
micromamba activate km_predict
pytest
By default, the KappaMask Docker image runs the km_s3 entrypoint, which expects a Sentinel-2 product title and output path in an S3 bucket.
The entrypoint performs the following operations:
KappaMask can be run as a Docker container as follows:
Pull the image
docker pull kappazeta/kappamask:v2.3
Run KappaMask for a specific Sentinel-2 product on AWS (please make sure to replace YOUR-AWS-REGION, YOUR-AWS-ACCESS-KEY, YOUR-AWS-SECRET-KEY and YOUR-S3-BUCKET with your AWS configuration, and YOUR-S2-PRODUCT-NAME with the name of the product to process):
docker run -e AWS_REGION=YOUR-AWS-REGION -e AWS_ACCESS_KEY=YOUR-AWS-ACCESS-KEY -e AWS_SECRET_KEY=YOUR-AWS-SECRET-KEY kappazeta/kappamask:v2.3 YOUR-S2-PRODUCT-NAME s3://YOUR-S3-BUCKET/
For example:
docker pull kappazeta/kappamask:v2.3
docker run -e AWS_REGION=eu-central-1 -e AWS_ACCESS_KEY=A******************F -e AWS_SECRET_KEY=3**************************************I kappazeta/kappamask:v2.3 S2A_MSIL2A_20200509T094041_N0214_R036_T35VME_20200509T111504 s3://my-kappamask-experiments/output/
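
To avoid typing credentials on the command line, the same call can be assembled from environment variables. This is a sketch under stated assumptions: docker_s3_command is a hypothetical helper, and the credentials are assumed to be present in the calling environment under the names the container expects (AWS_REGION, AWS_ACCESS_KEY, AWS_SECRET_KEY). It relies on Docker's -e NAME form, which passes the host's value of NAME into the container.

```python
import os

IMAGE = "kappazeta/kappamask:v2.3"
REQUIRED_VARS = ("AWS_REGION", "AWS_ACCESS_KEY", "AWS_SECRET_KEY")


def docker_s3_command(product, output_url, environ=None):
    """Build the docker run arguments for the default km_s3 entrypoint."""
    # environ is only inspected for the check below; Docker itself reads
    # the values from the environment of the process that launches it.
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    if not output_url.startswith("s3://"):
        raise ValueError("Output path must be an s3:// URL")
    cmd = ["docker", "run"]
    for name in REQUIRED_VARS:
        cmd += ["-e", name]  # pass the variable through without its value
    return cmd + [IMAGE, product, output_url]


if __name__ == "__main__":
    try:
        print(" ".join(docker_s3_command(
            "S2A_MSIL2A_20200509T094041_N0214_R036_T35VME_20200509T111504",
            "s3://my-kappamask-experiments/output/",
        )))
    except (RuntimeError, ValueError) as err:
        print("Problem:", err)
```

The resulting list can be handed to subprocess.run; because no secret appears in the argument list, it is also safe to log.
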
The KappaMask Docker image can be run locally with the km_local entrypoint.
The entrypoint performs the following operations:
/data volume, or attempt to decompress a .zip file in the volume otherwise.
KappaMask can be run as a Docker container as follows:
Pull the image
docker pull kappazeta/kappamask:v2.3
Run KappaMask for a specific Sentinel-2 product locally
docker run -v /YOUR-LOCAL-DATA-DIR/:/data kappazeta/kappamask:v2.3 YOUR-S2-PRODUCT-NAME
For example:
docker pull kappazeta/kappamask:v2.3
docker run -v /home/kappazeta/Documents/data/cloudmask_data/:/data kappazeta/kappamask:v2.3 S2A_MSIL2A_20200509T094041_N0214_R036_T35VME_20200509T111504
KappaMask can be tested as a Docker container as follows:
Pull the image
docker pull kappazeta/kappamask:v2.3
Run KappaMask tests:
docker run --entrypoint km_test kappazeta/kappamask:v2.3
The predictor will generate sub-tile masks under the /prediction folder and the full S2 mask under the /big_image folder.
Potential solutions for typical issues encountered during setup or usage.
Sentinel-2 product splitting fails with the following messages:
INFO: KMP.P: Extracting geo-coordinates.
ERROR 4: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.
ERROR 4: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.
INFO: KMP.P: Projection:
terminate called after throwing an instance of 'INFO: KMP.P: Projecting AOI polygon into pixel coordinates.
GDALOGRException'
what(): GDAL OGR error : Failed to import spatial reference from EPSG, Generic failure
Magick: abort due to signal 6 (SIGABRT) "Abort"...
This indicates that the environment variable GDAL_DATA has not been configured correctly. It can be set in a variety of ways, and the preferred method depends on your Linux distribution. An export call for the variable (for example, GDAL_DATA=/usr/share/gdal/2.2) could be added to your .bashrc, .profile, etc. Alternatively, the variable could be set together with the python call, for example:
GDAL_DATA=/usr/share/gdal/2.2 python km_predict.py -c config/your_config.json
Sentinel-2 product splitting fails with the following messages:
terminate called after throwing an instance of 'std::filesystem::__cxx11::filesystem_error'
what(): filesystem error: directory iterator cannot open directory: No such file or directory [YOUR_DIRECTORY/km_predict/data/S2B_MSIL1C_20200401T093029_N0209_R136_T34UFA_20200401T113334.SAFE.SAFE/GRANULE/]
Magick: abort due to signal 6 (SIGABRT) "Abort"...
This means that km_predict cannot find the directory with the product_name specified in the configuration file. The product name in the configuration file should be provided without the .SAFE suffix.
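
A small guard against this mistake is to normalise the product name before writing it to the configuration file or passing it on the command line. The sketch below is a hypothetical helper (normalize_product_name is not part of km_predict); it drops any leading directory components and a single trailing .SAFE suffix.

```python
from pathlib import PurePosixPath


def normalize_product_name(name):
    """Return the bare product name, without directories or a .SAFE suffix."""
    base = PurePosixPath(name.rstrip("/")).name
    if base.upper().endswith(".SAFE"):
        base = base[: -len(".SAFE")]
    return base


if __name__ == "__main__":
    raw = "data/S2B_MSIL1C_20200401T093029_N0209_R136_T34UFA_20200401T113334.SAFE/"
    print(normalize_product_name(raw))
```
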