Open jpolchlo opened 1 year ago
Almond uses a separate directory for its cache. By default, coursier fetch places artifacts in .cache/coursier (on Linux). You can try to find where Almond stores its cache. If I remember correctly, it's .cache/almond/coursier, in which case you can run coursier fetch --cache <almond-coursier-cache-dir> ...
That doesn't appear to be the case. Both methods (import from notebook and coursier fetch) place the jar files in the ~/.cache/coursier tree. However, there is a file ~/.cache/almond/ammonite/history that appears to track the notebook imports. The contents after executing
import $ivy.`org.apache.logging.log4j:log4j-core:2.17.0`
are
[
"import $ivy.`org.apache.logging.log4j:log4j-core:2.17.0`"
]
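As a rough illustration of how that history file could be reused, here is a minimal Python sketch that pulls the dependency coordinates out of it. It assumes the file is always a JSON array of strings like the one shown above, and that each import uses the single-coordinate form; the names IVY_IMPORT and ivy_coordinates are made up for this example.

```python
import json
import re

# Matches the coordinate inside a single-dependency directive of the form
# import $ivy.`group:artifact:version` (other import forms are not handled).
IVY_IMPORT = re.compile(r"import\s+\$ivy\.`([^`]+)`")


def ivy_coordinates(history_text):
    """Return the unique coordinates found in a JSON array of history entries."""
    entries = json.loads(history_text)
    coords = []
    for entry in entries:
        for coord in IVY_IMPORT.findall(entry):
            if coord not in coords:  # keep first-seen order, drop repeats
                coords.append(coord)
    return coords


# Sample input mirroring the history contents quoted in this thread.
sample = json.dumps(["import $ivy.`org.apache.logging.log4j:log4j-core:2.17.0`"])
print(ivy_coordinates(sample))
```

The resulting list could then be passed to coursier fetch or used to generate a preload notebook.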
I'm thinking that the way to pre-load is to provide a notebook with the desired imports and run it through jupyter during the docker build. In-notebook imports appear to create some amount of state that coursier fetch does not replicate.
Edit:
I've been able to preload the container with jars by running jupyter execute ... on a notebook containing import $ivy... directives. The import statements in the notebook still appear to be required to register the imported modules in the current context. However, the jar files are now present, so it's not necessary to wait for the Maven downloads.
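For anyone wanting to script this, below is a minimal sketch of generating such a preload notebook with only the Python standard library. The function name build_preload_notebook is hypothetical, and the kernel name "scala" is an assumption about how the Almond kernel was installed; adjust it to whatever jupyter kernelspec list reports in your image.

```python
import json


def build_preload_notebook(coordinates, kernel_name="scala"):
    """Build an nbformat 4.4 notebook dict with one import $ivy cell per coordinate."""
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import $ivy.`%s`" % coord],
        }
        for coord in coordinates
    ]
    return {
        "cells": cells,
        "metadata": {
            # Assumption: the Almond kernel is registered under this name.
            "kernelspec": {
                "name": kernel_name,
                "display_name": "Scala",
                "language": "scala",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = build_preload_notebook(["org.apache.logging.log4j:log4j-core:2.17.0"])
notebook_json = json.dumps(notebook, indent=1)
print(len(notebook["cells"]), "cell(s) generated")
```

Writing notebook_json to a file (say, preload.ipynb, a name chosen here for illustration) and running jupyter execute on it during the docker build should trigger the downloads described above. The notebooks you actually use still need their own import $ivy lines.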
Hmm, I did not observe this with the Docker image I'm using. However, I'm using
ENV COURSIER_CACHE=/usr/share/coursier/cache
in the Dockerfile. Does that affect the coursier cache even for the notebook session?
https://github.com/coreyoconnor/nix_configs/blob/dev/modules/ufo-k8s/almond-2/Dockerfile
After further testing: yes, setting ENV COURSIER_CACHE pre-populates the cache as expected.
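One way to confirm that the jars actually landed in the expected place is to list them from inside the container. The following is only a sketch: the helper names coursier_cache_dir and cached_jars are invented for this example, and the fallback location is simply the ~/.cache/coursier tree mentioned earlier in this thread (an assumption for Linux, not something read from coursier itself).

```python
import os
from pathlib import Path


def coursier_cache_dir(env=None):
    """Return COURSIER_CACHE if set, else the ~/.cache/coursier tree (assumed default)."""
    env = os.environ if env is None else env
    override = env.get("COURSIER_CACHE")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "coursier"


def cached_jars(cache_dir):
    """List every .jar under cache_dir, as sorted paths relative to it."""
    root = Path(cache_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.jar"))


cache = coursier_cache_dir()
print("cache directory:", cache)
if cache.is_dir():
    for jar in cached_jars(cache):
        print(jar)
```

Running this once after the docker build and once from a notebook session shows whether both are looking at the same directory.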
I want to build a docker environment where I can pre-load the classpath with spark-sql and some other stuff to avoid boilerplate in my notebooks. So I built the following Dockerfile:
However, upon running this container, running import org.apache.spark.sql._ yields an error:
What step am I missing to get Almond to recognize the coursier-installed jars?