ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License

Unable to start #2696

Closed maedoc closed 2 years ago

maedoc commented 2 years ago

Long story short

From the README, I tried the following,

duke@osparc:~/osparc-simcore$ make info build
# setup info:
 Detected OS          : Linux
 SWARM_STACK_NAME     : master-simcore
 DOCKER_REGISTRY      : itisfoundation
 DOCKER_IMAGE_TAG     : latest
 BUILD_DATE           : 2021-12-14T16:10:49Z
 VCS_*
  - ULR                : https://github.com/ITISFoundation/osparc-simcore.git
  - REF                : 90c0c0d0
  - (STATUS)REF_CLIENT : (clean) edb9b927
 DIRECTOR_API_VERSION  : 0.1.0
 STORAGE_API_VERSION   : 0.2.1
 DATCORE_ADAPTER_API_VERSION   : 0.1.0-alpha
 WEBSERVER_API_VERSION : 0.7.0
# dev tools version
 make          : GNU Make 4.2.1
 jq            : jq-1.6
 awk           : GNU Awk 5.0.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0)
 python        : Python 3.8.10
 node          : v10.19.0
 docker        : Docker version 20.10.12, build e91ed57
 docker buildx : github.com/docker/buildx v0.7.1-docker 05846896d149da05f3d6fd1e7770da187b52a247
 docker-compose: docker-compose version 1.29.2, build unknown
.....lots of docker building...
duke@osparc:~/osparc-simcore$ make up-prod
# Ensures swarm is initialized
# Deploy stack master-simcore
Top-level object must be a mapping
make: *** [Makefile:249: up-prod] Error 1

Expected behaviour

The docker-compose command should succeed.

Actual behaviour

$ make up-prod
# Ensures swarm is initialized
# Deploy stack master-simcore
Top-level object must be a mapping
make: *** [Makefile:249: up-prod] Error 1

Steps to reproduce

The commands above were run in an Ubuntu 20.04 VM, after cloning the repo with git and following the README instructions.

Your environment

Already stated.

pcrespov commented 2 years ago

@maedoc this is strange. I need some more info. It seems to be a problem with the docker swarm. Could you please attach the docker compose files (.stack-*.yml) generated in the base folder of your repo after make up-prod? You can also generate them with

# first generate environment file (WARNING: use fake secret info!)
make .env
make .stack-simcore-production.yml
make .stack-ops.yml

maedoc commented 2 years ago

Thanks for the attention. I got a bit further by making sure docker-compose is up to date (pip3 install -U docker-compose), but I am now stuck on docker swarm init, which does not (yet) work in my WSL2 environment (this is not a simcore issue, though). I will try again in a dedicated VM.

maedoc commented 2 years ago

Ah, it seems docker swarm init without an advertise address works on WSL2. Then up-prod seems to succeed, with just a warning:

~/src/osparc-simcore/.env: line 38: simcore.services.logs,: command not found

pcrespov commented 2 years ago

docker swarm init not (yet) working in my WSL2 environment (this is not a simcore issue though). I will try again in a dedicated VM.

@maedoc FYI: some developers use WSL2 in-house with simcore. Drop us a line if you still run into issues.

sanderegg commented 2 years ago

@maedoc I do use WSL2 as my development environment (although without Docker Desktop installed: I installed docker directly, and only in WSL2). For the record, here are the versions I am using:

> docker version
Client: Docker Engine - Community
 Version:           20.10.11
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        dea9396
 Built:             Thu Nov 18 00:37:06 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.11
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.9
  Git commit:       847da18
  Built:            Thu Nov 18 00:35:15 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
❯ docker-compose version
docker-compose version 1.29.2, build 5becea4c
docker-py version: 5.0.0
CPython version: 3.7.10
OpenSSL version: OpenSSL 1.1.0l  10 Sep 2019

For me the --advertise-addr works. But depending on your network configuration, maybe the script in the Makefile that finds the IP picks the wrong one? As a reminder, it runs hostname --all-ip-addresses | cut --delimiter=" " --fields=1 to find it. After checking, I also see that warning on line 38, and yet another on line 94. Will check.
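
A "command not found" warning of the kind seen on line 38 is what a shell prints when it sources a line whose value contains an unquoted space: the text before the space is taken as a temporary assignment and the rest is run as a command. Below is a minimal sketch, assuming that is the cause here. The variable name EXAMPLE_FILTERS and its value are hypothetical, since the actual content of line 38 is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction: EXAMPLE_FILTERS and its value are made up for illustration.
tmp_env="$(mktemp)"

# Unquoted value with spaces: the shell treats "second.logger," as a command name.
printf '%s\n' 'EXAMPLE_FILTERS=first.logger, second.logger, third.logger' > "$tmp_env"
unquoted_output="$(LC_ALL=C bash -c '. "$1"' _ "$tmp_env" 2>&1 || true)"

# Quoted value: the whole line is a plain assignment and nothing gets executed.
printf '%s\n' 'EXAMPLE_FILTERS="first.logger, second.logger, third.logger"' > "$tmp_env"
quoted_output="$(LC_ALL=C bash -c '. "$1"; printf "%s" "$EXAMPLE_FILTERS"' _ "$tmp_env" 2>&1 || true)"

rm -f "$tmp_env"
echo "unquoted: $unquoted_output"
echo "quoted:   $quoted_output"
```

If this is indeed what happens, quoting the affected values in .env should silence the warning without changing what the variable contains.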

maedoc commented 2 years ago

Thanks! I saw the WSL2 mention in the README too late, but good to know it's being used also. I'm trying out Win 11 here, maybe they've done more to keep dev ops spicy.

the --advertise-addr works

It complains that the address provided is not a system address (or a similar message) and won't be reachable. I wonder if Docker just assumes that the 192.168.x.x subnet WSL2 uses (on my machine) won't be reachable.

sanderegg commented 2 years ago

OK, this is the first time I hear about that message. Note that WSL2 has its own IP address. The --advertise-addr option is useful when several network adapters are present (or sometimes when a VPN is active), since docker swarm might then complain that it does not know which one to use.

If you could check what the shell command above returns, you may see whether it gives the real IP of your WSL instance or something incorrect.
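
One way to run that check is sketched below. It takes the first address reported by hostname, as the command quoted above does, and looks for it in the output of ip -o addr show. The helper names first_ip and is_local_ip are made up for this example and are not part of the osparc-simcore Makefile; cut -d " " -f 1 is the short form of the flags quoted above.

```shell
#!/usr/bin/env bash
# first_ip: keep only the first space-separated field of stdin.
first_ip() {
  cut -d " " -f 1
}

# is_local_ip ADDRESS: succeed if ADDRESS appears as "ADDRESS/prefix" in the
# `ip -o addr show` style text read from stdin.
is_local_ip() {
  grep -qF -- " $1/"
}

# hostname --all-ip-addresses is not available on every system, so failures are tolerated.
candidate="$(hostname --all-ip-addresses 2>/dev/null | first_ip || true)"

if [ -z "$candidate" ]; then
  echo "hostname did not report any address"
elif command -v ip >/dev/null 2>&1 && ip -o addr show | is_local_ip "$candidate"; then
  echo "$candidate is assigned to a local interface"
else
  echo "$candidate was not found among local interface addresses (or ip is unavailable)"
fi
```

This only tells you whether the selected address belongs to an interface inside the WSL2 instance. It does not tell you whether Docker will accept it as an advertise address.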

maedoc commented 2 years ago

After a docker swarm leave:

$ hostname --all-ip-addresses | cut --delimiter=" " --fields=1
192.168.226.122
$ ip a | grep 'inet '
    inet 127.0.0.1/8 scope host lo
    inet 192.168.226.122/20 brd 192.168.239.255 scope global eth0
$ docker swarm init --advertise-addr=192.168.226.122
Error response from daemon: must specify a listening address because the address to advertise is not recognized as a system address, and a system's IP address to use could not be uniquely identified
$ docker swarm init --advertise-addr=192.168.226.122 --listen-addr=192.168.226.122
Error response from daemon: manager stopped: failed to listen on remote API address: listen tcp 192.168.226.122:2377: bind: cannot assign requested address

However, with --listen-addr=127.0.0.1, Docker seems happy to start swarm mode.

sanderegg commented 2 years ago

Is 192.168.226.122 your real internal address on your LAN? (I mean, is that also what your Windows network properties show?) For me, the IP address I see in WSL2 has nothing to do with the computer's real IP address. Maybe that is something that changed in Windows 11. I do not see this on my machine (which also runs Windows 11), but my WSL2 was installed before upgrading to Windows 11, so maybe they changed this somehow.