Closed by danscales 3 months ago
@danscales: recommending again that you test out the libmamba solver, which may give you a further speedup. I got an order-of-magnitude speedup on a personal project (https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community).
@solomon-negusse I think the much newer conda (installed by the much newer Miniconda3) may already be using libmamba. The solver is much faster and now takes only about 15 seconds. There is only one conda line now, since we load everything from the conda-forge repo (and don't use the default anaconda repo). The total time for the update is now only about a minute; the remaining 45 seconds (i.e. everything other than the solving) go to downloading and unpacking the 2GB of packages.
GTC-2683 Upgrade to GDAL 3.8.3.
This allows upgrading to Miniconda3 as well (GTC-2774), which is much more recent than the old Miniconda that we've been using and seems to run quite a bit faster. GDAL 3.8.3 is also compatible with EMR-serverless, if we want to use it for certain jobs.
Created a new Dockerfile, ci/Dockerfile, for the Docker image that runs the GitHub CI tests, since there is no quay.io image (what we were using previously) for GDAL 3.8.3. Changed .github/workflows/ci.yaml to use this new Docker image (which I built and uploaded separately).
Removed the top-level Dockerfile and entrypoint.sh, which are very old versions of what is needed to run the analyses in batch jobs. The current versions of these are in gfwpro-scheduler:src/docker.
Added some info in README.md about various files, including sbt, ci/Dockerfile, and scripts/gdal.sh.
Added new geotrellisGdalWarp dependency, needed for the upgraded GDAL.
Includes the new scripts/gdal.sh, which is the bootstrap script needed for EMR runs with GDAL 3.8.3. It uses Miniconda3 and avoids using the default anaconda repository. It seems to run in about 1 minute, whereas the old script took roughly 4 minutes.
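For reviewers who have not looked at the script, here is a rough sketch of the general shape of a bootstrap like this. It is not the contents of scripts/gdal.sh: the install directory, the installer URL, the `MINICONDA_DIR` and `DRY_RUN` variables, the `run` and `bootstrap_gdal` helpers, and the single-package list are all assumptions made for illustration. The point it shows is the single `conda install` line restricted to conda-forge via `--override-channels`, so the default anaconda channel is never consulted.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, NOT the real scripts/gdal.sh.
# Install path, installer URL and package list below are assumptions.
set -eu

MINICONDA_DIR="${MINICONDA_DIR:-${HOME:-/tmp}/miniconda3}"  # hypothetical install location
INSTALLER_URL="https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh"
DRY_RUN="${DRY_RUN:-1}"  # defaults to printing commands; set DRY_RUN=0 to execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bootstrap_gdal() {
  # Fetch and install Miniconda3 non-interactively (-b) into MINICONDA_DIR (-p).
  run curl -fsSL -o /tmp/miniconda.sh "$INSTALLER_URL"
  run bash /tmp/miniconda.sh -b -p "$MINICONDA_DIR"
  # One conda line: everything comes from conda-forge, and --override-channels
  # keeps conda from searching the default channels.
  run "$MINICONDA_DIR/bin/conda" install -y \
      --override-channels -c conda-forge \
      gdal=3.8.3
}

bootstrap_gdal
```

Run as-is, it only echoes the three commands; with `DRY_RUN=0` it would actually download and install. The real script installs more than just `gdal` (about 2GB of packages in total).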
Prints out all environment variables when starting up Geotrellis, just as a way to debug various startup/configuration problems.