OSC / ondemand

Supercomputing. Seamlessly. Open, Interactive HPC Via the Web
https://openondemand.org/
MIT License

Container with OnDemand does not build properly #2272

Open radekj-pcss opened 2 years ago

radekj-pcss commented 2 years ago

When I try to build the container by running: docker build --tag "ondemand-test" .

Step 6/35 : RUN dnf -y install https://yum.osc.edu/ondemand/latest/ondemand-release-web-latest-1-6.noarch.rpm
 ---> Running in 67933add911a
Rocky Linux 8 - AppStream 0.0 B/s | 0 B 06:00

Errors during downloading metadata for repository 'appstream':

When I manually run curl https://mirrors.rockylinux.org/mirrorlist?arch=x86_64&repo=AppStream-8 I get a file back. I am also able to download ondemand-release-web-latest-1-6.noarch.rpm properly, so most probably it is not a networking issue.

treydock commented 2 years ago

Those errors look like issues contacting the Rocky Linux repos, nothing to do with OnDemand. Sometimes that happens. Container networking is quite different from how curl works outside the container, so this could also be an issue with Docker's networking setup.

Is the issue still occurring? I ask because issues accessing the Rocky Linux repos are sometimes temporary and on Rocky's side.

radekj-pcss commented 2 years ago

Hi. You are probably right, the problem is probably on Rocky's side. Unfortunately, it is still happening. I tried switching to Alma with some success; the container build now stops here:

Step 20/35 : RUN cd /opt/ood; bundle install
 ---> Running in 9ab1dcc3faee
Don't run Bundler as root. Bundler can ask for sudo if it is needed, and installing your bundle as root will break this application for all non-root users on this machine.
Fetching source index from https://rubygems.org/
Retrying fetcher due to error (2/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <Net::OpenTimeout: Net::OpenTimeout (https://rubygems.org/specs.4.8.gz)>
Retrying fetcher due to error (3/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <Net::OpenTimeout: Net::OpenTimeout (https://rubygems.org/specs.4.8.gz)>
Retrying fetcher due to error (4/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/ due to underlying error <Net::OpenTimeout: Net::OpenTimeout (https://rubygems.org/specs.4.8.gz)>
Could not fetch specs from https://rubygems.org/ due to underlying error <Net::OpenTimeout: Net::OpenTimeout (https://rubygems.org/specs.4.8.gz)>
The command '/bin/sh -c cd /opt/ood; bundle install' returned a non-zero code: 17

So it is a similar story: the file is accessible from the machine I am building the container on. I am building this on an OpenStack-hosted VM, but that should not have any impact, right? I know this is (probably) not your fault, but I think it is good to let you know that the Dockerfile you provide fails, even if it is a third party's fault.

treydock commented 2 years ago

Looks like networking issues with Docker. One way to debug this and see whether it really is an issue with Docker is something like this:

docker run --rm -it rockylinux:8 /bin/bash
curl -v https://rubygems.org

If the issue is with Docker networking, then I think the curl command from inside the container would time out.
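
A non-interactive variant of the same check can make the host/container comparison easier to repeat. This is only a sketch: the `check` helper is a hypothetical name introduced here, the 15-second timeout is an arbitrary choice, and it assumes that `curl` is available both on the host and in the `rockylinux:8` image.

```shell
# check LABEL COMMAND...  (hypothetical helper, not part of Docker or OnDemand)
# Runs COMMAND quietly and reports whether it exited successfully.
check() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "$label: reachable"
  else
    echo "$label: FAILED"
  fi
}

URL="https://rubygems.org"

# Same request from the host and from a throwaway container.
check "host" curl -sS --max-time 15 -o /dev/null "$URL"
check "container" docker run --rm rockylinux:8 curl -sS --max-time 15 -o /dev/null "$URL"
```

If the host line reports reachable and the container line reports FAILED, that would point at Docker networking rather than at rubygems.org or the Rocky Linux mirrors.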

johrstrom commented 2 years ago

Yeah, I found this doc, which may help. It says:

By default, traffic from containers connected to the default bridge network is not forwarded to the outside world. 
To enable forwarding, you need to change two settings. 
These are not Docker commands and they affect the Docker host’s kernel.

https://docs.docker.com/network/bridge/#enable-forwarding-from-docker-containers-to-the-outside-world

I've also had to set the daemon.json configs to a given CIDR, though I cannot fully recall why - I'm sure it had something to do with internet connectivity.
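
A read-only way to see whether the forwarding setting from the quoted docs is already enabled on the Docker host might look like the following. It is a sketch under assumptions: `describe_forwarding` is a hypothetical helper, and the `/proc` path is taken to mirror the `net.ipv4.conf.all.forwarding` sysctl on a Linux host.

```shell
# Map the kernel's 0/1 forwarding flag to a readable word (hypothetical helper).
describe_forwarding() {
  case "$1" in
    1) echo "enabled" ;;
    0) echo "disabled" ;;
    *) echo "unknown" ;;
  esac
}

# Read the current value without changing anything on the host.
value="$(cat /proc/sys/net/ipv4/conf/all/forwarding 2>/dev/null)"
echo "IPv4 forwarding: $(describe_forwarding "$value")"

# The two host settings described in the linked Docker docs, to be run as root
# only if forwarding turns out to be disabled:
#   sysctl net.ipv4.conf.all.forwarding=1
#   iptables -P FORWARD ACCEPT
```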

radekj-pcss commented 2 years ago

I came to exactly the same conclusion: curl within the container does not work, but from the host it works like a charm. I guess the problem lies with the MTU: OpenStack is using 1458 because of VXLAN tunneling. I'll try decreasing the MTU for containers and check whether that was the issue.
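
One way to test the MTU theory is to compare the MTU of the VM's uplink interface with that of Docker's default bridge. The sketch below makes several assumptions: `iface_mtu` is a hypothetical helper, `eth0` is a placeholder for the VM's actual interface, and `docker0` is the default bridge name. The reported `docker0` value may not reflect the effective MTU in every Docker version, so treat the output as a hint.

```shell
# Print the MTU of a network interface, or "unknown" if it does not exist.
iface_mtu() {
  cat "/sys/class/net/$1/mtu" 2>/dev/null || echo "unknown"
}

# HOST_IF is an assumption: replace eth0 with the VM's real uplink interface.
HOST_IF="${HOST_IF:-eth0}"

echo "host ($HOST_IF): $(iface_mtu "$HOST_IF")"
echo "docker0: $(iface_mtu docker0)"

# If docker0 reports a larger MTU than the host interface, one option is to
# lower it through the "mtu" key in /etc/docker/daemon.json, for example
#   { "mtu": 1458 }
# and then restart the daemon:
#   sudo systemctl restart docker
```

The value 1458 is the one mentioned in this comment; the right number depends on the overlay network in use.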

johrstrom commented 2 years ago

All that said - and we're happy to help in any way - why are you trying to use the container instead of installing the RPM? The RPM is likely to be a bit more stable.

johrstrom commented 1 year ago

Hi, were you able to resolve this?