NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Containerized pipeline run #334

Open sigmafelix opened 5 months ago

sigmafelix commented 5 months ago

After a long journey of configuring different software versions on HPC (cf. #333), I ended up with countless errors that were inconsistent across nodes and sessions. I am now moving to a fully containerized approach: we use an Apptainer image with recent stable versions of GDAL and its dependencies, then mount the project root to a path inside the container so that the container can find the data files. The container-engine branch contains the ongoing work for this transition. In this approach, we submit a SLURM job that runs an R script calling tar_make() or tar_make_future() with enough threads and memory (e.g., 80 threads and 640 GB of memory), then parallelize the workload with crew or future.callr inside the container.

The Apptainer image is based on the geospatial:latest Dockerfile available in the rocker-versioned2 repository (Ubuntu 22.04, GDAL 3.4.1).

This approach showed very strange behavior in vector operations: the intersection between the unique sites and the Ecoregion polygons returned a different number of results in each environment (1096 in the triton run, 1051 in the Apptainer run). I tried to repair the Ecoregion polygons with terra::makeValid() and terra::buffer(x, width=0), to no avail.
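One way to narrow down a count mismatch like this is to compare which site IDs each run returned, rather than only the totals. The following is a minimal Python sketch, assuming the intersection result from each environment has been exported as a list of site identifiers; the function name `diff_site_ids` and the sample IDs are hypothetical and not part of the pipeline.

```python
from typing import Dict, Iterable, List, Union


def diff_site_ids(
    run_a: Iterable[str], run_b: Iterable[str]
) -> Dict[str, Union[List[str], int]]:
    """Report the site IDs that appear in only one of two runs.

    Duplicated IDs are collapsed, so n_a and n_b count unique IDs,
    which may differ from the row counts of the original results.
    """
    a, b = set(run_a), set(run_b)
    return {
        "only_in_a": sorted(a - b),  # IDs missing from run B
        "only_in_b": sorted(b - a),  # IDs missing from run A
        "n_a": len(a),
        "n_b": len(b),
    }


if __name__ == "__main__":
    # Hypothetical IDs for illustration only; not taken from the pipeline.
    host_run = ["site_001", "site_002", "site_003", "site_004"]
    container_run = ["site_001", "site_003", "site_004"]
    report = diff_site_ids(host_run, container_run)
    print("Only in host run:", report["only_in_a"])
    print("Only in container run:", report["only_in_b"])
```

The sites that appear in only one run can then be mapped against the Ecoregion polygons to check whether they cluster near polygon boundaries or invalid geometries.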

I am still investigating the issue and trying to figure out the exact cause. This work is taking much more effort than I expected, and it is getting more complex as time goes on.

sigmafelix commented 5 months ago

The pipeline runs okay with the custom-built GDAL and R packages on GEO. Further investigation of the unexpected behavior is on hold.

kyle-messier commented 5 months ago

Thanks @sigmafelix

sigmafelix commented 4 months ago

The renv experiment still needs a fix for GitHub packages that go undetected, such as beethoven and amadeus. The hash, repository URL, and other properties of these packages are not populated in the renv.lock file when renv is initialized. I will investigate this issue thoroughly.
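Since renv.lock is a JSON file with one record per package under a top-level "Packages" key, the missing properties can be listed with a short script. The following is a minimal Python sketch under stated assumptions: the function name `find_incomplete_records` is hypothetical, and the set of fields treated as required (Hash plus the Remote* fields that renv normally writes for GitHub packages) is a choice made for this illustration, not a rule defined by renv.

```python
import json
from typing import Dict, Iterable, List

# Assumption: these are the fields we expect for a GitHub-sourced package.
EXPECTED_FIELDS = ("Hash", "RemoteType", "RemoteUsername", "RemoteRepo", "RemoteSha")


def find_incomplete_records(
    lockfile_text: str, packages: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each named package to the expected fields it lacks in the lockfile."""
    records = json.loads(lockfile_text).get("Packages", {})
    problems: Dict[str, List[str]] = {}
    for name in packages:
        record = records.get(name)
        if record is None:
            problems[name] = ["<no record in lockfile>"]
            continue
        # A field counts as missing when it is absent or empty.
        missing = [field for field in EXPECTED_FIELDS if not record.get(field)]
        if missing:
            problems[name] = missing
    return problems


if __name__ == "__main__":
    # Hypothetical lockfile content for illustration only.
    example = json.dumps(
        {"Packages": {"amadeus": {"Package": "amadeus", "Version": "0.0.0"}}}
    )
    for pkg, missing in find_incomplete_records(example, ["beethoven", "amadeus"]).items():
        print(pkg, "->", ", ".join(missing))
```

In practice the text would come from reading the project's renv.lock, and the package list would name whichever GitHub packages the project depends on.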

sigmafelix commented 3 months ago

After 0.4.0 is merged into main, I will update container-engine to include all updates that are not container-related.