stan-dev / stanc3

The Stan transpiler (from Stan to C++ and beyond).
BSD 3-Clause "New" or "Revised" License

jenkins: update group ids for new host #1375

Closed dylex closed 10 months ago

dylex commented 11 months ago

It looks like there are some hard-coded group ids for running dind, which have changed on the new host. This should fix it, though it might be better to do getent group docker on the host or something.

WardBrian commented 11 months ago

Does this require rebuilding our docker images? It seems like opam is now raising an error in the build step.

dylex commented 11 months ago

I'm not sure why it would; this is just changing arguments to the outer docker. But I'm not sure how exactly these are working to begin with.

WardBrian commented 11 months ago

@serban-nicusor-toptal - any theories on the opam failures?

serban-nicusor-toptal commented 11 months ago

Hey @WardBrian, I just tested the image locally and it looks fine, which leads me to believe that this is a permission-related issue. I think something changed on the Flatiron side. I left a message for Dylan earlier on Slack (sent some info to you too). I'll keep you posted!

serban-nicusor-toptal commented 11 months ago

Since one of the GIDs changed, I'll have to rebuild the images so they have the correct GID set. I'll commit the new tag later today and rebuild!

serban-nicusor-toptal commented 10 months ago

Looks like the pipeline won't run because we need to approve its use of:

Scripts not permitted to use staticMethod hudson.model.Hudson getInstance. Administrators can decide whether to approve or reject this signature.

I lost my permissions on Flatiron Jenkins so I can't approve it.

> It looks like there are some hard-coded group ids for running dind, which have changed on the new host. This should fix it, though it might be better to do getent group docker on the host or something.

I've also lost my Linux host access, so I can't log in to check these.

I rebuilt the image from scratch, trying to see if we can get rid of the hardcoded UID/GID. If the static one works fine, I'll do the same for the rest of them.

dylex commented 10 months ago

For other projects we do things like this:

  def uid = sh(returnStdout: true, script: "id -u").trim()
  def img = docker.build(..., "--build-arg BUILDUID=${uid} .")

You could do something similar with id -G to get all group IDs, or getent group docker to get the specific one.
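As a rough illustration of the getent route, the sketch below pulls the numeric GID out of a group entry. It assumes the usual `name:password:gid:members` layout of `getent group` output; the helper names `gid_from_group_entry` and `host_gid` are made up for this example and are not part of the stanc3 scripts.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the numeric GID (third colon-separated field) of a group entry
# such as "docker:x:987:jenkins".
gid_from_group_entry() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Look up a group by name on the current host and print its GID.
# Returns non-zero if the group does not exist or getent is unavailable.
host_gid() {
    entry=$(getent group "$1" 2>/dev/null) || return 1
    gid_from_group_entry "$entry"
}

# The docker group may not exist on every machine, so report that on stderr.
host_gid docker || echo "no docker group found on this host" >&2
```

On the Jenkins side, the printed value could be captured with sh(returnStdout: true, ...) in the same way as the id -u call above.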

serban-nicusor-toptal commented 10 months ago

Hey @dylex @WardBrian, sadly these images fail with a permission error when calling dune if components are not installed with a UID/GID aligned with the host. I had the same idea as yours above, Dylan, so for the normal builds I just pass the UID and GID based on the host:

  additionalBuildArgs '--build-arg PUID=\$(id -u) --build-arg PGID=\$(id -g)'
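For anyone reproducing this outside Jenkins, a minimal sketch of the equivalent local command follows. The PUID and PGID argument names come from the comment above; the `build_args` helper and the build context `.` are assumptions for illustration, and the script only prints the command rather than running docker.

```shell
#!/bin/sh
# Hypothetical helper: format the two build arguments for a given UID and GID.
build_args() {
    printf '%s\n' "--build-arg PUID=$1 --build-arg PGID=$2"
}

# Dry run: show the docker build command for the current user.
# The build context "." is a placeholder.
echo "docker build $(build_args "$(id -u)" "$(id -g)") ."
```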


In the same pipeline we leverage QEMU for multiarch builds, and it looks like this won't work without sudo/--privileged. See: https://github.com/stan-dev/stanc3/blob/master/scripts/build_multiarch_stanc3.sh#L26

Now, in order to get it to work, I had to pass the following GIDs:

args '--group-add=987 --group-add=980 --group-add=988'

I'll try now to see if I can get id -G to work inside args.

Edit: Shell expansion doesn't happen inside args for some reason, trying alternatives.

serban-nicusor-toptal commented 10 months ago

Sadly, I can't seem to get it to work in a dynamic way. args is not expanded and is just set as a static string, so we can't use id -G. I even tried to mount /etc/passwd and /etc/group, but that does not seem to work either. Reverted to hardcoded; let me know if you guys have any other ideas!
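For reference, the shell half of a dynamic approach might look like the sketch below, which turns the output of id -G into --group-add flags. This is an untested assumption rather than something verified on this pipeline: the `group_add_flags` helper is hypothetical, IMAGE and COMMAND are placeholders, and how to feed the result into the Jenkins agent definition remains the open problem described above.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of numeric GIDs into docker --group-add flags.
group_add_flags() {
    flags=""
    for gid in "$@"; do
        # Append with a separating space only when flags is already non-empty.
        flags="${flags:+$flags }--group-add=$gid"
    done
    printf '%s\n' "$flags"
}

# Dry run: print the docker run command for the current user's groups.
# $(id -G) is left unquoted on purpose so each GID becomes its own argument.
echo "docker run $(group_add_flags $(id -G)) IMAGE COMMAND"
```

Called with the three hardcoded GIDs, the helper reproduces the static string currently passed to args.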

WardBrian commented 10 months ago

I think that is fine. If @dylex has any tips to improve it, they'd be welcome to avoid issues in the future, but otherwise this seems good to me.