ESA-PhiLab / OpenSarToolkit

High-level functionality for the inventory, download and pre-processing of Sentinel-1 data in the python language.
https://opensartoolkit.readthedocs.io/en/latest/
MIT License

docker build fails #44

Open MNEUW opened 3 years ago

MNEUW commented 3 years ago

Dear ost-team,

I tried to pull the Docker image. However, docker pull buddyvolly/opensartoolkit still installs an old OST version. Now I fail to build the new version from the Dockerfile in this repository. With docker build I always get the following error log at step 8/14:

`docker build /home/ost/OpenSarToolkit_master_20201220/
Sending build context to Docker daemon 54.87MB
Step 1/14 : FROM ubuntu:18.04
 ---> 2c047404e52d
Step 2/14 : LABEL maintainer="Petr Sevcik, EOX"
 ---> Using cache
 ---> e76d4f005833
Step 3/14 : LABEL OpenSARToolkit='0.10.1'
 ---> Using cache
 ---> 3962f459ecd1
Step 4/14 : WORKDIR /home/ost
 ---> Using cache
 ---> 693506450e9e
Step 5/14 : COPY snap7.varfile $HOME
 ---> Using cache
 ---> a9f658bc3602
Step 6/14 : ENV OTB_VERSION="7.1.0" TBX_VERSION="7" TBX_SUBVERSION="0"
 ---> Using cache
 ---> 76a4ec29a32c
Step 7/14 : ENV TBX="esa-snap_sentinel_unix_${TBX_VERSION}_${TBX_SUBVERSION}.sh" SNAP_URL="http://step.esa.int/downloads/${TBX_VERSION}.${TBX_SUBVERSION}/installers" OTB=OTB-${OTB_VERSION}-Linux64.run HOME=/home/ost PATH=$PATH:/home/ost/programs/snap/bin:/home/ost/programs/OTB-${OTB_VERSION}-Linux64/bin
 ---> Using cache
 ---> c642dd2a2adc
Step 8/14 : RUN groupadd -r ost && useradd -r -g ost ost && apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -yq python3 python3-pip git libgdal-dev python3-gdal libspatialindex-dev libgfortran3 wget unzip imagemagick nodejs npm
 ---> Running in d4c18ae031c0
Get:1 http://archive.ubuntu.com/ubuntu bionic InRelease [242 kB]
Get:2 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages [1344 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic/multiverse amd64 Packages [186 kB]
Get:7 http://archive.ubuntu.com/ubuntu bionic/universe amd64 Packages [11.3 MB]
Get:8 http://archive.ubuntu.com/ubuntu bionic/restricted amd64 Packages [13.5 kB]
Get:9 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [53.8 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [266 kB]
Get:11 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2136 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [2244 kB]
Get:13 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [11.4 kB]
Get:14 http://archive.ubuntu.com/ubuntu bionic-backports/main amd64 Packages [11.3 kB]
Get:15 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [237 kB]
Get:16 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [1816 kB]
Get:17 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1372 kB]
Get:18 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [15.3 kB]
Fetched 21.5 MB in 2s (9791 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libgdal-dev : Depends: default-libmysqlclient-dev but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The command '/bin/sh -c groupadd -r ost && useradd -r -g ost ost && apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -yq python3 python3-pip git libgdal-dev python3-gdal libspatialindex-dev libgfortran3 wget unzip imagemagick nodejs npm' returned a non-zero code: 100`

Do you have a hint as to what I'm doing wrong? Best, M

BuddyVolly commented 3 years ago

Hi Mneuw,

Sorry that you ran into this. You are doing nothing wrong. It is just the usual clash of dependencies, especially when you are working with gdal... there are some dependency issues between gdal and nodejs/npm. Unfortunately I do not have much time to look into this right now. Could you check whether removing the gdal-related packages from the RUN instruction at step 8 solves this? I.e. delete these 2 entries: libgdal-dev python3-gdal

gdal should later then be automatically installed via pip (so no additional entry needed).

Please let me know how that goes. And I hope that solves it. Cheers

MNEUW commented 3 years ago

Finally got it to run:)

needed to modify the dockerfile quite a bit:

  1. updated to OTB Version 7.2.0 because wget couldn't find the old one? you think this could cause problems?
  2. removed libgdal-dev but needed python3-gdal
  3. added a RUN to add nodejs >= v.10 (was required for installation)
  4. added pip update > maybe this is already newest version and not necessary
  5. removed last line jupyter nbextension enable --py widgetsnbextension because of an error. are these extensions important?

FROM ubuntu:18.04

LABEL maintainer="Petr Sevcik, EOX"
LABEL OpenSARToolkit='0.10.1'

# set work directory to home and download snap
WORKDIR /home/ost

# copy the snap installation config file into the container
COPY snap7.varfile $HOME

# update variables
ENV OTB_VERSION="7.2.0" \
    TBX_VERSION="7" \
    TBX_SUBVERSION="0"
ENV TBX="esa-snap_sentinel_unix_${TBX_VERSION}_${TBX_SUBVERSION}.sh" \
    SNAP_URL="http://step.esa.int/downloads/${TBX_VERSION}.${TBX_SUBVERSION}/installers" \
    OTB=OTB-${OTB_VERSION}-Linux64.run \
    HOME=/home/ost \
    PATH=$PATH:/home/ost/programs/snap/bin:/home/ost/programs/OTB-${OTB_VERSION}-Linux64/bin

# install all dependencies
RUN groupadd -r ost && \
    useradd -r -g ost ost && \
    apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -yq \
    python3 \
    python3-pip \
    git \
    python3-gdal \
    libspatialindex-dev \
    libgfortran3 \
    wget \
    unzip \
    imagemagick \
    npm \
    curl

RUN apt-get update && \
    curl -sL https://deb.nodesource.com/setup_10.x | bash - && \
    apt-get -y install nodejs && \
    ln -s /usr/bin/nodejs /usr/local/bin/node

RUN alias python=python3 && \
    rm -rf /var/lib/apt/lists/* && \
    python3 -m pip install -U pip && \
    python3 -m pip install -U setuptools && \
    python3 -m pip install jupyterlab && \
    mkdir /home/ost/programs && \
    wget $SNAP_URL/$TBX && \
    chmod +x $TBX && \
    ./$TBX -q -varfile snap7.varfile && \
    rm $TBX && \
    rm snap7.varfile && \
    cd /home/ost/programs && \
    wget https://www.orfeo-toolbox.org/packages/${OTB} && \
    chmod +x $OTB && \
    ./${OTB} && \
    rm -f OTB-${OTB_VERSION}-Linux64.run

# update snap to latest version
RUN /home/ost/programs/snap/bin/snap --nosplash --nogui --modules --update-all 2>&1 | while read -r line; do \
    echo "$line" && \
    [ "$line" = "updates=0" ] && sleep 2 && pkill -TERM -f "snap/jre/bin/java"; \
    done; exit 0

# set usable memory (max Java heap) to 36G
RUN echo "-Xmx36G" > /home/ost/programs/snap/bin/gpt.vmoptions

# get OST and tutorials
RUN python3 -m pip install git+https://github.com/ESA-PhiLab/OpenSarToolkit.git && \
    git clone https://github.com/ESA-PhiLab/OST_Notebooks && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager

EXPOSE 8888
CMD jupyter lab --ip='0.0.0.0' --port=8888 --no-browser --allow-root

best M

BuddyVolly commented 3 years ago

Hi,

responding to the numbers above:

  1. Not at all. Orfeo is needed during mosaicking, so 7.2 is fine. It needs to ship with the otbcli_Mosaic function (which it does).
  2. ok
  3. ok
  4. shouldn't do any harm
  5. ok, you might have visualization issues with the progress bar during download, but the download itself should still work

MNEUW commented 3 years ago

Thanks a lot, all clear now for me!

BuddyVolly commented 3 years ago

great, let us know if there are further issues and have a great christmas time.

MNEUW commented 3 years ago

OK! I just tried the S1GRD_Batch-processing and found that the grds_to_ards module doesn't work for the RTC-Gamma0 product type. For some reason it produces only NA layers. In contrast, it runs perfectly for the GTC-Gamma product. I was also wondering whether and how it is possible to change the out_projection for the ARD products, e.g. to WGS 1984 UTM Zone 33N?

Anyway, it's a great toolkit! Wish you a pleasant Christmas 2020!!

Best Martin

BuddyVolly commented 3 years ago

Hmm, the Dockerfile still contains the SNAP update step. Comment out the following lines and build again. There are some issues with the latest version of SNAP that I still need to fix (when I have time).

RUN /home/ost/programs/snap/bin/snap --nosplash --nogui --modules --update-all 2>&1 | while read -r line; do echo "$line" && [ "$line" = "updates=0" ] && sleep 2 && pkill -TERM -f "snap/jre/bin/java"; done; exit 0

BuddyVolly commented 3 years ago

then try to use this notebook as template: https://github.com/ESA-PhiLab/OST_Notebooks/blob/master/4c%20-%20Sentinel-1%20GRD%20Batch%20-%20Timescan.ipynb

You need to change aoi, start, end and project_dir, of course. For me this works.

Also check out the other notebooks. It should not take you more than a day to go through them, and they will hopefully help you get the concept. The trickiest parts are the refine inventory and the ARD parameters; the rest is just code execution.

happy christmas to you, too!

BuddyVolly commented 3 years ago

and yes, you can set the out projection by using the EPSG code. so Lat/Lon is 4326, UTM 33N is 32633

e.g. s1.ard_parameters['single_ARD']['dem']['out_projection'] = '32633'
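
As a small aid for picking the code: WGS 84 / UTM EPSG codes follow a fixed pattern (326xx for northern zones, 327xx for southern zones). The helper below is only a sketch; `utm_epsg` is a hypothetical name and not part of OST.

```python
def utm_epsg(zone: int, northern: bool = True) -> int:
    """Return the EPSG code of the WGS 84 / UTM projection for a zone.

    Hypothetical helper for illustration, not part of OpenSarToolkit.
    """
    if not 1 <= zone <= 60:
        raise ValueError(f"UTM zone must be between 1 and 60, got {zone}")
    # WGS 84 / UTM codes are 32601-32660 (north) and 32701-32760 (south).
    return (32600 if northern else 32700) + zone


epsg = utm_epsg(33, northern=True)
print(epsg)  # 32633

# With an existing Sentinel1Batch instance named s1 (as in this thread),
# the value would then be set like this:
# s1.ard_parameters['single_ARD']['dem']['out_projection'] = str(epsg)
```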

MNEUW commented 3 years ago

Now after getting back to SNAP version 7 I'm running into the following error while executing grds_to_ards in your above shared notebook:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-26-b95cd587dd24> in <module>
      3     timeseries=True,
      4     timescan=True,
----> 5     overwrite=False
      6 )

/usr/local/lib/python3.6/dist-packages/ost/Project.py in grds_to_ards(self, inventory_df, timeseries, timescan, mosaic, overwrite, max_workers, executor_type)
    848         # time-series part
    849         if timeseries or timescan:
--> 850             grd_batch.ards_to_timeseries(inventory_df, self.config_file)
    851 
    852         if timescan:

/usr/local/lib/python3.6/dist-packages/ost/s1/grd_batch.py in ards_to_timeseries(inventory_df, config_file)
    146 
    147     # create all extents
--> 148     _create_extents(inventory_df, config_file)
    149 
    150     # update extents in case of ls_mask

/usr/local/lib/python3.6/dist-packages/ost/s1/grd_batch.py in _create_extents(inventory_df, config_file)
    186             fargs=([str(config_file), ])
    187     ):
--> 188         track, list_of_scenes, extent = task.result()
    189         out_dict['track'].append(track)
    190         out_dict['list_of_scenes'].append(list_of_scenes)

/usr/local/lib/python3.6/dist-packages/godale/_concurrent.py in result(self)
    154         """Return func result or raise Exception."""
    155         if self._exception:
--> 156             raise self._exception
    157         else:
    158             return self._result

AttributeError: 'DataFrame' object has no attribute 'crs'

I changed the CRS of the inventory pandas DataFrame to the same one as the output projection, with no effect. First results (ARD products in SNAP format) appear in the processing directory, but the timeseries and timescan products fail.

Do you have an idea where this could come from? In the steps before, while querying the S1 datasets, I already encountered some CRS warnings:

`INFO (10:00:32): Created project directory at /home/ost/shared
INFO (10:00:32): Downloaded data will be stored in: /home/ost/shared/download.
INFO (10:00:32): Inventory files will be stored in: /home/ost/shared/inventory.
INFO (10:00:32): Processed data will be stored in: /home/ost/shared/processing.
INFO (10:00:32): Using /home/ost/shared/temp as directory for temporary files.

/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:53: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
  return _prepare_from_string(" ".join(pjargs))
/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:294: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
  projstring = _prepare_from_string(" ".join((projstring, projkwargs)))
/usr/local/lib/python3.6/dist-packages/geopandas/geodataframe.py:422: RuntimeWarning: Sequential read of iterator was interrupted. Resetting iterator. This can negatively impact the performance.
  for feature in features_lst:

If you do not have a Copernicus Scihub user account go to: https://scihub.copernicus.eu and register

Your Copernicus Scihub Username: madduen
Your Copernicus Scihub Password: ·········

/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:53: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
  return _prepare_from_string(" ".join(pjargs))

INFO (10:00:42): Writing inventory data to geopackage file: /home/ost/shared/inventory/full.inventory.gpkg
INFO (10:00:42): Coverage analysis for ASCENDING tracks in VV VH polarisation.
INFO (10:00:42): 5 frames for ASCENDING tracks in VV VH polarisation.
INFO (10:00:42): 5 frames remain after double entry removal
INFO (10:00:42): Excluding track 146
INFO (10:00:42): 2 frames remain after non-AOI overlap
INFO (10:00:42): All remaining tracks fully overlap the AOI. Not removing anything.
INFO (10:00:42): 2 frames remain after removal of non-full AOI crossing

/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:53: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
  return _prepare_from_string(" ".join(pjargs))
/usr/local/lib/python3.6/dist-packages/ost/s1/refine_inventory.py:71: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries. Use to_crs() to reproject one of the input geometries to match the CRS of the other.

Left CRS: EPSG:4326
Right CRS: +init=epsg:4326 +no_defs +type=crs

  inventory_df, aoi_gdf, how='left', op='intersects'

INFO (10:00:43): Found 2 full coverage mosaics.
INFO (10:00:43): Coverage analysis for DESCENDING tracks in VV VH polarisation.
INFO (10:00:43): 6 frames for DESCENDING tracks in VV VH polarisation.
INFO (10:00:43): 6 frames remain after double entry removal
INFO (10:00:43): Excluding track 124
INFO (10:00:43): Excluding track 22
INFO (10:00:43): 3 frames remain after non-AOI overlap
INFO (10:00:43): All remaining tracks fully overlap the AOI. Not removing anything.
INFO (10:00:43): 3 frames remain after removal of non-full AOI crossing

/usr/local/lib/python3.6/dist-packages/ost/s1/refine_inventory.py:71: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries. Use to_crs() to reproject one of the input geometries to match the CRS of the other.

Left CRS: EPSG:4326
Right CRS: +init=epsg:4326 +no_defs +type=crs

  inventory_df, aoi_gdf, how='left', op='intersects'

INFO (10:00:43): Found 3 full coverage mosaics.`

BuddyVolly commented 3 years ago

Hi,

The problem is that we do not pin the versions of certain Python packages. There have been huge changes to the pyproj lib. Almost all of the geo libs in Python use it for projection-related stuff, and the syntax changed. You could downgrade to Proj 4 somehow. But it will take me some time to fix all this, as at the moment I do not really have time for it.

Best, Andreas
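
To illustrate the syntax change mentioned here: the logs above show the same CRS written once as EPSG:4326 and once in the legacy form +init=epsg:4326 +no_defs +type=crs. The sketch below converts the legacy form to the authority:code form using only the standard library. `normalize_crs` is a hypothetical helper, not part of OST or pyproj; it simply discards any other PROJ parameters in the string and does not address the axis-order caveat that the pyproj warning points to.

```python
import re

# Matches the legacy "+init=<authority>:<code>" fragment, e.g. "+init=epsg:4326".
_INIT_PATTERN = re.compile(r"\+init=([A-Za-z]+):(\d+)")


def normalize_crs(crs_string: str) -> str:
    """Return 'AUTHORITY:code' for a legacy '+init=...' string.

    Strings without a '+init=' fragment are returned unchanged.
    Hypothetical helper for illustration only.
    """
    match = _INIT_PATTERN.search(crs_string)
    if match is None:
        return crs_string
    authority, code = match.groups()
    return f"{authority.upper()}:{code}"


print(normalize_crs("+init=epsg:4326 +no_defs +type=crs"))  # EPSG:4326
print(normalize_crs("EPSG:4326"))                           # EPSG:4326
```

With both sides written the same way, a comparison such as the one behind the "CRS mismatch" warning would at least be comparing like with like; whether that alone resolves the AttributeError in this thread is not established here.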

MNEUW commented 3 years ago

OK, thanks for your answer! Just let me know when there is an update. Meanwhile I'll try to make a start on my own. Best, M

MNEUW commented 3 years ago

Hm, all steps are working now except the creation of the image bounds for the time-series processing. So I think we do not need huge changes. What does this step actually do? Align the scenes to the same extent? And which script/function is responsible for the job? Do you have another hint where I can start digging? Sorry for interrupting.

Best, Martin

BuddyVolly commented 3 years ago

Hi, this step is needed to calculate the common minimum bound of all images in the time-series, as their extent may vary (slightly). I think it uses geopandas for that, which is based on pyproj, which again had those huge changes. So I assume something related to this. Here you can find an example of this problem (as many have this) https://gis.stackexchange.com/questions/348997/constant-future-warnings-with-new-pyproj
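
The idea of a common minimum bound can be sketched with plain bounding boxes. This is a simplification under stated assumptions: the maintainer believes OST uses geopandas on the actual image footprints, whereas the hypothetical `common_bounds` below only intersects axis-aligned (min_x, min_y, max_x, max_y) boxes, and the coordinates are made-up values.

```python
from typing import Iterable, Tuple

Bounds = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def common_bounds(extents: Iterable[Bounds]) -> Bounds:
    """Return the area shared by all extents (their intersection).

    Hypothetical helper for illustration, not the OST implementation.
    """
    extents = list(extents)
    if not extents:
        raise ValueError("at least one extent is required")
    # The shared area starts at the largest lower bound
    # and ends at the smallest upper bound.
    min_x = max(e[0] for e in extents)
    min_y = max(e[1] for e in extents)
    max_x = min(e[2] for e in extents)
    max_y = min(e[3] for e in extents)
    if min_x >= max_x or min_y >= max_y:
        raise ValueError("the extents do not overlap")
    return (min_x, min_y, max_x, max_y)


# Three acquisitions of the same track whose extents differ slightly.
scenes = [
    (10.0, 45.0, 12.0, 47.0),
    (10.1, 45.05, 12.05, 46.95),
    (9.95, 45.1, 11.9, 47.1),
]
print(common_bounds(scenes))  # (10.1, 45.1, 11.9, 46.95)
```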

MNEUW commented 3 years ago

Hi there,

I tried to rerun my Docker container with the OpenSarToolkit after a while and got an error because the Copernicus SciHub API could not be reached:

Your Copernicus Scihub Username:madduen
Your Copernicus Scihub Password:
INFO (14:45:38): We failed to connect to the server. Reason: Unauthorized

Do you know if there has been a change/limitation to querying images using the API of Copernicus SciHub?

thanks for your support. best Martin

BuddyVolly commented 3 years ago

Yes, the apihub URL changed: https://scihub.copernicus.eu/news/News00868 I still have not had the time to update the toolkit. I am working on a release with the new SNAP 8 and will include this.

MNEUW commented 3 years ago

OK, cool. Good to know. Do you think it's just the base URL that needs to be replaced in the search.py script in line 399, or is there more work to be done to get it to run? Is there a rough timescale for the update? Sorry for rushing, I just need to know whether it's a matter of days/weeks or a lot longer, to be able to plan :)

best martin

BuddyVolly commented 3 years ago

Hi Martin, so there is a solution I think without having to change the code. You can set the base url in the search function of your class instance. Let's say you initialized your Sentinel1Batch to s1 then: s1.search(base_url='https://scihub.copernicus.eu/dhus/')

However, setting the base_url for DOWNLOAD from scihub is NOT possible. But you can use ASF for download, if you haven't done so yet. You can download 10 files concurrently and it keeps all of the archive, as opposed to scihub where you only have the last year as it is a rolling archive. It is much faster.

Hope that helps, Andreas
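
For readers wondering what "10 files concurrently" amounts to, here is a generic sketch of bounded parallel downloads with the standard library. It is not how OST implements its ASF download; `download_all` and `fake_fetch` are hypothetical names, and `fake_fetch` only stands in for a real download function.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(scene_ids, fetch, max_workers=10):
    """Call fetch(scene_id) for every scene, at most max_workers at a time.

    Returns a dict mapping each scene id to the fetch result, or to the
    exception raised for that scene. Hypothetical helper for illustration.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, sid): sid for sid in scene_ids}
        for future in as_completed(futures):
            sid = futures[future]
            try:
                results[sid] = future.result()
            except Exception as error:  # keep going if one scene fails
                results[sid] = error
    return results


def fake_fetch(scene_id):
    # Stand-in for a real download; just returns the target file name.
    return f"{scene_id}.zip"


outcome = download_all([f"scene_{i}" for i in range(25)], fake_fetch)
print(len(outcome))  # 25
```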

MNEUW commented 3 years ago

Thanks for your msg! Helps a lot! ASF is really great, I have always used it :) Querying and downloading work again. Now I am running into: Exception calling QC Rest API: Connect to qc.sentinel1.eo.esa.int:443... This is probably related to the current relocation of the URL that provides the S1 orbit files --> https://forum.step.esa.int/t/orbit-file-timeout-march-2021/28621/22

The solution for this probably won't be that easy? Or do you have any idea? Best regards and thanks again, Martin

PS: I also tried the integration in sepal.io and processed some subsets. Really great work. Is there any documentation on what the thresholds for the different variants of outlier removal are? (marked in red below) [image: grafik.png]
