Closed haticekaratay closed 6 months ago
After a number of failed attempts I managed to get a "working" Roman build running under Colima on my M2 laptop, using linux/amd64 as the platform and with rosetta-vz turned on. To work around one issue, I dropped Ubuntu back from 22.04 to 20.04, which reverts the Linux kernel to one that does not use vector instructions the macOS tools/emulators do not yet support. I filed a ticket with ITSD about getting macOS 14, which is supposed to help with the missing vectors. To work around failing Ubuntu build fetches from DockerHub, I dropped off the STScI VPN while building. I also noted that jupyter/docker-stacks has been changing, so pulling those images from DockerHub gets a stale environment; the images should only be built from source using OWNER=spacetelescope.
Recalling one of your pain points above: I was able to build with full BuildKit caching, but if you want to turn that off, a good start (or end?) may be to set all the xxx_CACHE_DIRS vars in infrequent-env to "". Make sure to re-source your setup-env after any change to infrequent-env.
As for the pulls of spacetelescope/xxx: ideally those should not be happening. Either a different repo should be used (jupyter, which has also moved to quay.io), or the spacetelescope versions should be built locally from source and never pushed to DockerHub. Docker Desktop may have a setting to skip pulling images from DockerHub when you're really just trying to build them. With Colima, the docker build CLI has a --pull switch that controls this: passing it forces an attempt to pull every referenced image, so leaving it off may help when the images are built locally.
One other thing: I believe I spotted a caching bug, where the very last RUN, which is supposed to fix permissions on empty mount points, is itself given cache mounts. For my build, as with past builds, I didn't need the fix. But if caching remains a persistent problem for you, this change might help:
diff --git a/scripts/add-caching b/scripts/add-caching
index b49974d..125b214 100755
--- a/scripts/add-caching
+++ b/scripts/add-caching
@@ -9,7 +9,7 @@ def add_mounts_to_run(dockerfile, mount_dirs):
         result = ""
         for line in file:
-            if "RUN " in line:
+            if "RUN " in line and "CACHE_DIRS" not in line:
                 for mount in mount_dirs:
                     line = line.replace(
                         "RUN ", f"RUN --mount=type=cache,mode=0777,target={mount} "
                     )
NOTE: I waffled a lot on this comment but think this may be a bug regardless of when it appears since it will be operating on the mounted file system instead of empty mount points as intended.
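To make the intent of the patch concrete, here is a minimal, self-contained sketch of the same idea. It is not the actual scripts/add-caching code: it works on a string rather than a file, builds all mount flags in one pass instead of calling replace repeatedly, and the `skip_marker` parameter and its `"CACHE_DIRS"` default are assumptions about how the final permission-fixing RUN can be recognized.

```python
def add_mounts_to_run(dockerfile_text, mount_dirs, skip_marker="CACHE_DIRS"):
    """Prefix RUN instructions with BuildKit cache mounts.

    Lines containing skip_marker are left untouched so that a final
    permission-fixing RUN sees the empty mount points, not the cache.
    skip_marker is a hypothetical convention, not confirmed by the repo.
    """
    # Build the mount flags once, in the order the directories were given.
    mounts = " ".join(
        f"--mount=type=cache,mode=0777,target={mount}" for mount in mount_dirs
    )
    result = []
    for line in dockerfile_text.splitlines(keepends=True):
        is_run = line.startswith("RUN ")
        if mounts and is_run and skip_marker not in line:
            # Insert the flags directly after the RUN keyword.
            line = f"RUN {mounts} " + line[len("RUN "):]
        result.append(line)
    return "".join(result)


if __name__ == "__main__":
    sample = (
        "FROM ubuntu:20.04\n"
        "RUN apt-get update\n"
        "RUN chown -R jovyan $CACHE_DIRS\n"
    )
    print(add_mounts_to_run(sample, ["/var/cache/apt", "/home/jovyan/.cache"]))
```

With this sketch, only the `apt-get update` line gains cache mounts; the `chown` line is passed through unchanged because it mentions the marker.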
Thanks! The issue with M1-related platform incompatibility has been resolved, and we've successfully built the platform. Thanks, @jaytmiller, for your help! I am closing this issue.
I've been encountering issues when trying to build Docker images for the Roman platform. The build script runs a command to compile pip requirements, but that command fails during the build. The error message suggests there might be a problem with platform compatibility, cache setup, or authorization.
Error Details:
The build fails at step 17 with the following message:
ERROR: process "/bin/bash -o pipefail -c /opt/common-scripts/env-compile roman-cal /opt/common-env/required.pip" did not complete successfully: exit code: 133
When I checked the Dockerfile, the failing command was:
RUN --mount=type=cache,mode=0777,target=/var/cache/apt --mount=type=cache,mode=0777,target=/home/jovyan/.conda/pkgs --mount=type=cache,mode=0777,target=/opt/conda/pkgs --mount=type=cache,mode=0777,target=/home/jovyan/.cache /opt/common-scripts/env-compile roman-cal /opt/common-env/required.pip
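The exit code 133 in the error above may be easier to interpret with a small decoding sketch. It relies on the common shell convention that an exit status above 128 means the process was terminated by signal number (status - 128); that this convention applies to this particular build step is an assumption, and `describe_exit_code` is a hypothetical helper name.

```python
import signal


def describe_exit_code(code):
    """Translate a shell-style exit status into a readable description."""
    if code > 128:
        try:
            # By convention, 128 + N means "terminated by signal N".
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            return f"exit code {code} (no matching signal)"
    return f"exit code {code}"


if __name__ == "__main__":
    print(describe_exit_code(133))  # killed by SIGTRAP (on Linux and macOS)
```

Under that convention, 133 maps to signal 5, SIGTRAP, which points to the process crashing under emulation rather than to a pip dependency error.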
The build log indicates a "rosetta error: failed to open elf at /lib64/ld-linux-x86-64.so.2," suggesting a platform mismatch or an ARM vs. AMD64 compatibility issue.
Steps I Took:
I set --platform=linux/amd64, as I'm building on an M1 (ARM-based) Mac.
Additional Issue:
After fixing the platform issue, I encountered a new error:
ERROR: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
I'm not sure how to proceed with resolving these errors. The first error points to platform incompatibility or a missing library, while the second seems to be an authorization issue with Docker Hub. I would appreciate any guidance on how to fix these errors and successfully build my Docker images.
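For the platform mismatch described above, the following sketch shows one way a build wrapper could add the platform flag only on ARM hosts. It is illustrative rather than part of the Roman build scripts: `docker_build_args`, the `roman-cal:local` tag, and the `machine` override are hypothetical, while `--platform` and `--tag` are standard `docker build` options.

```python
import platform


def docker_build_args(tag, context=".", target_platform="linux/amd64", machine=None):
    """Assemble a `docker build` argument list, forcing the target platform on ARM hosts."""
    # platform.machine() reports "arm64" on Apple Silicon macOS, "aarch64" on ARM Linux.
    machine = machine or platform.machine()
    args = ["docker", "build", "--tag", tag]
    if machine.lower() in ("arm64", "aarch64"):
        # Request an amd64 image so x86-64 binaries run under emulation.
        args += ["--platform", target_platform]
    args.append(context)
    return args


if __name__ == "__main__":
    # "roman-cal:local" is a placeholder tag for illustration only.
    print(" ".join(docker_build_args("roman-cal:local")))
```

The resulting list can be passed to `subprocess.run`. This only addresses the platform flag, not the separate "pull access denied" error.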