Substra / substra-backend

Backend of the Substra software
https://docs.substra.org
Apache License 2.0

BUG: container kaniko exit 1 Substra-backend crash #809

Open SantiagoMoreno-UdeA opened 7 months ago

SantiagoMoreno-UdeA commented 7 months ago

What are you trying to do?

Deploy Substra locally following the steps in: https://docs.substra.org/en/stable/how-to/developing-substra/local-deployment.html

Issue Description (what is happening?)

skaffold crashes with a permissions error when run without sudo. I then ran all the steps with sudo, but when I try to launch substra-backend (sudo skaffold run) the process crashes. error_SubstraBackend.txt

I then tried "sudo skaffold run" again and this was the result: error_SubstraBackend_secondTry.txt

CPU: Intel Core i7-1355U
RAM: 16GB
OS: Ubuntu 22.04.3 LTS 64-bit

Expected Behavior (what should happen?)

The Substra backend service should launch.

Reproducible Example

No response

Operating system

Ubuntu 22.04

Python version

3.10.112

Installed Substra versions

substra==0.49.0
substrafl==0.42.0
substratools==0.21.0

Installed versions of dependencies

helm == v3.14.0
skaffold == v2.1.0
numpy == 1.24.3
pytorch == 2.0.1+cu117

Logs / Stacktrace

error_SubstraBackend.txt error_SubstraBackend_secondTry.txt

guilhem-barthes commented 7 months ago

Hi there,

In both logs we can see the line: error building image: Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io on 10.43.0.10:53: server misbehaving. A "server misbehaving" error on port 53 usually points to a misconfigured DNS resolver. Judging by the IP, it looks like you are not using an external one. Perhaps you could try changing your DNS resolver to an external provider (dns0.eu, Google DNS, Cloudflare DNS).
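To see which resolver is actually in use, a minimal shell sketch along these lines may help. It assumes a Linux host with awk and getent available; the helper names list_nameservers and resolves are illustrative only and are not part of Substra or skaffold.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting DNS configuration on the host.

# Print the nameserver IPs from a resolv.conf-style file
# (defaults to /etc/resolv.conf when no path is given).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed if the given hostname resolves through the system resolver.
resolves() {
  getent hosts "$1" > /dev/null 2>&1
}

if [ -r /etc/resolv.conf ]; then
  echo "Configured nameservers:"
  list_nameservers
fi

if resolves ghcr.io; then
  echo "ghcr.io resolves on the host"
else
  echo "ghcr.io does NOT resolve on the host"
fi
```

Keep in mind that the failing lookup in the logs happens inside the cluster (10.43.0.10 is the default cluster DNS address on K3s), so a successful lookup on the host does not by itself prove that pods can resolve ghcr.io.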

SantiagoMoreno-UdeA commented 7 months ago

Hi @guilhem-barthes!

I changed the DNS resolver and added Google DNS. But now I'm having trouble with the previous step: when I try "sudo skaffold run" for the orchestrator, the process waits for the Helm release manager installation and then crashes.

Logs for the services launcher: logs_K3SLaunch.txt. I do not see anything weird.

There is not much information; the skaffold process just stops: error_SubstraOrchestator.txt

Second try for the skaffold orchestrator: error_SubstraOrchestator_secondTry.txt

Sorry if this is something trivial; I'm a rookie at handling network deployment and Substra.

guilhem-barthes commented 7 months ago

Hey @SantiagoMoreno-UdeA!

No worries about the questions. For the new problem, I think you should check whether kubectl get pods -n ingress-nginx --selector=app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx returns a list containing one pod with STATUS Running. Is there a specific reason why you run with sudo?
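That check can be scripted roughly as follows. This is only a sketch: count_running is a hypothetical helper, and it assumes the default kubectl get pods column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# how many pods have STATUS Running (third column, header line skipped).
count_running() {
  awk 'NR > 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Selector taken from the command quoted above.
selector="app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx"

if command -v kubectl > /dev/null 2>&1; then
  running=$(kubectl get pods -n ingress-nginx --selector="$selector" 2>/dev/null | count_running) || running=0
  echo "Running ingress-nginx controller pods: $running (expected: 1)"
fi
```

If the count is 0, kubectl describe pod on the controller pod in the ingress-nginx namespace usually shows why it is not starting.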

One of our colleagues had a similar problem on the latest macOS Docker version (v4.27.1). It was fixed by going to Docker Desktop > Settings > Resources > Network and unticking "Use kernel networking for UDP". I don't think this setting exists in the Linux version though, but you could check whether your port 53 is already bound to another process: sudo netstat -pna | grep 53
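A plain grep 53 also matches unrelated ports such as 5353, as well as PIDs that contain 53. As an alternative sketch, assuming ss from iproute2 is installed (netstat is not always present on Ubuntu 22.04), the hypothetical helper port53_listeners below keeps only sockets whose local port is exactly 53.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ss -tuln`-style output on stdin and keep only
# lines whose local address (fifth column) ends in port 53.
port53_listeners() {
  awk '$5 ~ /:53$/'
}

if command -v ss > /dev/null 2>&1; then
  # Add -p and run with sudo to also see the owning process name and PID.
  ss -tuln | port53_listeners
fi
```

On a stock Ubuntu install, an entry for 127.0.0.53 on port 53 is expected: that is the systemd-resolved stub listener, not necessarily a conflict.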