Closed dimakuv closed 11 months ago
Should we also support concatination of PATH
and any ENV retrieved from original docker image in addition to LD_LIBRARY_PATH
?
I have seen requirement for concatination of PATH
env in java
workload case.
We don't define LD_PRELOAD
in any GSC template but retrieved from original docker image if exists, in such a case it's not possible to add LD_PRELOAD
in application manifest. I have seen this case as well in PyTorch workload (DLRM, RESNET50 etc..)
Please assign the issue to me.
I agree with concatenating LD_LIBRARY_PATH
, PATH
and LD_PRELOAD
from the three sources, in this order:
I think this order is the most correct one.
Also, I would like some kind of warning from GSC about this behavior, so that it is visible to casual users (not only to those who browse GSC documentation and sources).
@mkow @woju Would be good to hear your opinion. Also, should this behavior be gated by a new GSC command-line option?
I don't understand the problem that is being solved. Why paths extracted from the original image (second bullet point) do not work? Can you point to an example container that fails, with an explanation why this happens?
I'm not opposed to this change, as long as I'll get the need for it explained better :) (same as woju)
envvar in the user-supplied manifest to some temporary name (so that it doesn't conflict with the entrypoint manifest template)
Btw. Why a temporary name? Can't we just merge the arrays whenever we merge some configs?
Btw. Why a temporary name? Can't we just merge the arrays whenever we merge some configs?
Yes, I think we can just merge. I don't remember why I came up with this temporary-rename approach, probably because the two TOML root dicts are just merged together at some point... Anyway, yes, we could merge the configs in a more careful and clean way, this was just my quick evaluation of the approach.
Explanation on the need of concatenation for:
LD_LIBRARY_PATH
: GSC template entrypoint.manifest.template add LD_LIBRARY_PATH
. In openvino case we wanted to add /ovms/lib
path to LD_LIBRARY_PATH
which we could not do as GSC template above already have LD_LIBRARY_PATH and we cannot have two keys with same name in the manifest, then we had to add all the libraries to LD_PRELOAD.PATH
: We had to add java
bin path to PATH
env in application manifest for JDK8 to work but could not do as PATH
is already added in entrypoint.common.template.manifest and we cannot have two keys in the manifest.LD_PRELOAD
or any XYZ
env in original docker image: We added libgomp
to LD_PRELOAD
in application manifest for PyTorch DLRM workload and GSC build failed with Duplicate key
error. We were surprised by this error as we could not see LD_PRELOAD
defined anywhere in GSC templates. After some debugging figured out LD_PRELOAD
is also fetched from original docker image. Now we had LD_PRELOAD
coming from two places one from original docker image and other from application manifest. Similar case can happen with any other XYZ
env also defined in original docker image and manifest want to concat or override.These are some examples but there could be more which customer might be handling by modifying GSC code or might face in future. In some cases, there could be need to concat or override. To handle this, application manifest env value should be added first than GSC template values and at the end values from original docker image with some warning during GSC build as suggested by Dmitrii.
I can add a couple more Gramine-specific examples:
LD_LIBRARY_PATH
-- applications may rely on the OpenMP library. There are two flavors of the OpenMP library: libgomp.so
(classic GNU gcc implementation) and its drop-in replacement libiomp.so
(Intel-optimized implementation). Some apps -- I think OpenVINO -- prefer to load libiomp.so
over libgomp.so
if they find the former in the library paths. Also, for Gramine, libgomp.so
is bad because it uses raw SYSCALL instructions, whereas libiomp.so
is good because it uses Glibc syscall wrappers.
So we have a situation where the original Docker image wants to use libgomp.so
, and our Gramine Docker image wants to use libiomp.so
-- for perf reasons. The way to allow both scenarios in the same Docker image is to specify loader.env.LD_LIBRARY_PATH = "/path/to/libiomp/"
in the GSC manifest. So the final Docker image will contain loader.env.LD_LIBRARY_PATH = "/path/to/libiomp/:/path/gramine/runtime/:classic-paths"
. This is currently not allowed.
LD_PRELOAD
-- the same use case as above. Imagine that the original Docker image already uses LD_PRELOAD=/some/irrelevant/lib
, and we also want to add LD_PRELOAD=/libiomp.so
for Gramine purposes. Ideally we want in the final Gramine Docker image this: loader.env.LD_PRELOAD=/libiomp.so:/some/irrelevant/lib
. This is currently not allowed.
LD_PRELOAD
again -- some memory allocators are better for Gramine than others. E.g. mimalloc and jemalloc are known to provide better performance under Gramine than the classic dlmalloc. A typical way to use these drop-in replacement libs is to preload them. But again, the original Docker image may already use LD_PRELOAD=/some/irrelevant/lib
, and we also want to add LD_PRELOAD=/libmimalloc.so
for Gramine purposes. Ideally we want in the final Gramine Docker image this: loader.env.LD_PRELOAD=/libmimalloc.so:/some/irrelevant/lib
. This is currently not allowed.
PATH
-- I don't know any good examples, why Gramine would want an additional path to what is already in the original Docker image... But I could imagine that Gramine would e.g. want to use some tools/binaries that are tailored for SGX environments. For example, maybe busybox
instead of dash
as your shell: the final Docker image would use dash
for normal apps and busybox
for the Gramine app. This could be achieved by doing loader.env.PATH=/path/to/busybox:classic-paths
.
@jkr0103: But why these envs weren't set in the source Docker image? How did this even work?
From the issue description:
GSC automatically creates an
LD_LIBRARY_PATH
from [...] The library paths extracted from the original Docker image
So, if the Docker image works then it should also work in Gramine without changes.
Only @dimakuv's examples make sense to me and I think they justify adding this feature.
Description of the problem
Currently GSC doesn't allow to specify an
LD_LIBRARY_PATH
in the manifest file. This is because GSC automatically creates anLD_LIBRARY_PATH
from two sources:entrypoint.manifest.template
See these relevant parts:
The correct solution would probably be:
gsc.py
should renameLD_LIBRARY_PATH
envvar in the user-supplied manifest to some temporary name (so that it doesn't conflict with the entrypoint manifest template)finalize_manifest.py
should take this temporary name and concat it together with{{ library_paths }}
, removing duplicates along the way.LD_LIBRARY_PATH
, so we must think carefully what is concatenated first.