c0c0n3 / kitt4sme.live

On a mission to bring AI to the shop floor: https://kitt4sme.eu/
MIT License

update fams to latest version #232

Closed vcutrona closed 1 year ago

vcutrona commented 1 year ago

This pull request updates FaMS to the latest version.

c0c0n3 commented 1 year ago

@vcutrona I rebased this branch on main and ported over your change from #224. Also, I updated the image generation script with the new username and then regenerated the actual sealed secret using the password you gave me.

There's only one snag: when I test this change in prod, K8s gets stuck pulling the image. Here's the error I get:

31s         Normal    Pulling             pod/fams-5f6c9cf4f8-lcwx2    Pulling image "gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s"
1s          Warning   Failed              pod/fams-5f6c9cf4f8-lcwx2    Failed to pull image "gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s": rpc error: code = Unknown desc = failed to pull and unpack image "gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s": failed to resolve reference "gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s": failed to do request: Head "https://gitlab-core.supsi.ch:5050/v2/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams/manifests/0.1.2-k4s": dial tcp 193.5.152.67:5050: i/o timeout

It looks like K8s isn't able to pull the image within 30 secs and gives up (the error is an i/o timeout on the connection to 193.5.152.67:5050). I've checked that I can actually pull the image with the user and password you gave me:

$ docker login https://gitlab-core.supsi.ch:5050
Username: gitlab+deploy-fams-k4s-cloud
Password: 
Login Succeeded

$ docker pull gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s
0.1.2-k4s: Pulling from dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams
...
Digest: sha256:9952389d3f41f00e0472965f4f11277c7c68095d4d8b27e161e26505dd69351f
Status: Downloaded newer image for gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s
gitlab-core.supsi.ch:5050/dti-isteps/spslab/human-robot-interaction/human-digital-twin/fams:0.1.2-k4s

Could it be the SUPSI server is too slow at serving images? Need some help to debug this...

c0c0n3 commented 1 year ago

@vcutrona I'm going to merge this PR anyway since it's most likely a transient error.

vcutrona commented 1 year ago

Thanks @c0c0n3. Just to be 100% sure, please confirm you issued the login+pull commands from the same machine that runs K8s. If not, it may be the SUPSI firewall blocking requests from that machine (we've already experienced some issues due to our geo-fencing policies).
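One way to test the firewall hypothesis is to check plain TCP reachability of the registry from the K8s node itself, before any Docker login is involved. The snippet below is only a sketch: `check_tcp` is a hypothetical helper name, and it assumes `bash` and the coreutils `timeout` command are available on the node.

```shell
# Hypothetical helper: report whether a TCP connection to host:port
# can be opened within a time limit (default 5 seconds).
# Assumes bash (for /dev/tcp) and coreutils `timeout` are installed.
check_tcp() {
  local host="$1" port="$2" secs="${3:-5}"
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT;
  # `timeout` aborts the attempt if the connection hangs.
  if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port"
    return 1
  fi
}
```

Running `check_tcp gitlab-core.supsi.ch 5050 30` on the node that hosts the pod would then show whether the connection itself times out, as in the `dial tcp 193.5.152.67:5050: i/o timeout` error above. If it reports unreachable there but reachable from the machine where `docker pull` succeeded, a network-level block is the more likely cause than a slow registry.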

c0c0n3 commented 1 year ago

@vcutrona good catch, I think you're absolutely right about geofencing. I opened #233 to track the issue.