kappazeta / km_predict

S2 full image prediction
Apache License 2.0
New docker image has no conda installed, fix km_s3 script #27

Closed thierryweo closed 4 months ago

thierryweo commented 4 months ago

The switch to a new Docker base image in commit https://github.com/kappazeta/km_predict/commit/d72d533fdcba97a578295c6c13590ff1541354b7 breaks the km_s3.sh script, which relied on conda. Conda is no longer available in the image.

This PR runs python3 directly, so the conda version of Python is no longer required.

This fixes #26
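
As a rough illustration of the change described above (this is not the actual content of km_s3.sh, which is a shell script), the sketch below builds a launch command that only goes through conda when an environment name is given and conda is on the PATH, and otherwise calls python3 directly. The function name `build_command` and the `conda_env` parameter are hypothetical.

```python
import shutil


def build_command(script, args, conda_env=None, which=shutil.which):
    """Return the argv list used to launch `script`.

    `conda_env` is a hypothetical environment name. When it is None, or when
    conda cannot be found on PATH, the system python3 is used directly.
    """
    if conda_env and which("conda"):
        # Old behaviour (assumed): run inside a conda environment.
        return ["conda", "run", "-n", conda_env, "python", script, *args]
    # New behaviour: no dependency on conda at all.
    return ["python3", script, *args]


if __name__ == "__main__":
    # Without a conda environment name this always resolves to python3.
    print(build_command("km_predict.py", []))
```

The returned list could be passed to `subprocess.run`; keeping it as a list avoids shell quoting issues.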

indrek-sunter commented 4 months ago

Sorry, I missed it. The pull request looks good. Thank you for the contribution!

thierryweo commented 4 months ago

@indrek-sunter Thank you very much!

thierryweo commented 4 months ago

@indrek-sunter Thanks for merging, I hope this helps. Will the Docker images on Docker Hub also get updated automatically? Thanks again.

indrek-sunter commented 4 months ago

@thierryweo Docker builds are currently manual. I pushed kappazeta/kappamask:v2.2. Can you check if it works any better? Also, I noticed that the km_predict version did not match that of the Docker image. I increased km_predict version to 2.2.0.

thierryweo commented 4 months ago

@indrek-sunter Hi there,

It seems the latest fixes did not really work. There are some path problems that I tried to fix in the km_s3.sh script: the /data directory is not used everywhere. I can fix that. However, when I run it further, the inference seems to start fine but then I get the following error:

```
Trainable params: 24552174 (93.66 MB)
Non-trainable params: 49504 (193.38 KB)

/opt/conda/envs/py311a/lib/python3.11/site-packages/rasterio/__init__.py:304: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
Traceback (most recent call last):
  File "/home/km_predict/km_predict.py", line 437, in <module>
    main()
  File "/home/km_predict/km_predict.py", line 434, in main
    kmf.mosaic()
  File "/home/km_predict/km_predict.py", line 383, in mosaic
    tif_img.save(tif_name, tiffinfo=tif_img.tag)
  File "/opt/conda/envs/py311a/lib/python3.11/site-packages/PIL/Image.py", line 2459, in save
    save_handler(self, fp, filename)
  File "/opt/conda/envs/py311a/lib/python3.11/site-packages/PIL/TiffImagePlugin.py", line 1888, in _save
    offset = ifd.save(fp)
             ^^^^^^^^^^^^
  File "/opt/conda/envs/py311a/lib/python3.11/site-packages/PIL/TiffImagePlugin.py", line 976, in save
    result = self.tobytes(offset)
             ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py311a/lib/python3.11/site-packages/PIL/TiffImagePlugin.py", line 950, in tobytes
    raise NotImplementedError(msg)
NotImplementedError: multistrip support not yet implemented
```

I tried installing a few of the dependencies mentioned on the Pillow project site, but this did not seem to have any effect.

I have reverted to v2.0 for now, which works fine. On another note, some Sentinel products are not found on sentinelhub, so the download fails. Is there any way I can resolve this problem?

But thanks, nothing urgent on my side for now.

indrek-sunter commented 3 months ago

Oh... I will look into the remaining issues with the data directory and Pillow later.

While there shouldn't be significant differences in functionality between v2.0 and v2.2, both have issues with the semi-transparent cloud class and with S2 products from 2022 onward. We would like to get both issues resolved for v2.3. https://github.com/kappazeta/km_predict/issues/28

indrek-sunter commented 3 months ago

@thierryweo Currently it uses products from AWS S3 buckets of Sinergise: https://registry.opendata.aws/sentinel-2/

They state that "New Sentinel data are added regularly, usually within few hours after they are available on Copernicus OpenHub."

However, you could add your own scripts to the Docker base image which would also check for availability from other sources.
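
A custom availability check of the kind suggested above could be structured roughly as follows. This is only a sketch: `first_available`, the source names, and the checker callables are all hypothetical, and a real checker would need to query the actual catalogue or bucket for each source.

```python
from typing import Callable, Iterable, Optional, Tuple

# A source is a (name, checker) pair; the checker returns True when the
# product is available from that source. Both are hypothetical placeholders.
Source = Tuple[str, Callable[[str], bool]]


def first_available(product_id: str, sources: Iterable[Source]) -> Optional[str]:
    """Return the name of the first source that reports the product, else None."""
    for name, has_product in sources:
        try:
            if has_product(product_id):
                return name
        except Exception:
            # Treat a failing source as unavailable and try the next one.
            continue
    return None


if __name__ == "__main__":
    # Dummy checkers standing in for real lookups.
    sources = [
        ("primary-bucket", lambda pid: False),
        ("fallback-source", lambda pid: True),
    ]
    print(first_available("EXAMPLE_PRODUCT_ID", sources))
```

Sources are tried in the order given, so the preferred one should be listed first.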

indrek-sunter commented 3 months ago

@thierryweo We pushed v2.3, which should also resolve the issue with the semi-transparent cloud class on S2 products from 2022 onward. The Pillow version was downgraded from 10.2.0 to 10.1.0. km_s3.sh should now use /data.
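
For anyone building their own image, a small guard like the one below could flag the Pillow version reported as problematic in this thread before running inference. It is a sketch, not part of km_predict: the set of affected versions contains only 10.2.0 because that is the only version this thread identifies, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only the version reported in this thread is listed here.
AFFECTED_PILLOW_VERSIONS = {"10.2.0"}


def installed_pillow_version(get_version=version):
    """Return the installed Pillow version string, or None if not installed."""
    try:
        return get_version("Pillow")
    except PackageNotFoundError:
        return None


def pillow_needs_downgrade(installed):
    """True when the installed version is one reported to fail on TIFF save."""
    return installed in AFFECTED_PILLOW_VERSIONS


if __name__ == "__main__":
    found = installed_pillow_version()
    if pillow_needs_downgrade(found):
        print(f"Pillow {found} was reported to fail here; 10.1.0 worked.")
    else:
        print(f"Pillow version: {found}")
```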

thierryweo commented 3 months ago

@indrek-sunter Wow, thanks. Okay, downgrading Pillow helped. Thanks for pushing a new image, I will try it out this week. Thanks also for making me aware of the semi-transparent cloud issue.

Thank you for the help and the work. Much appreciated.