code-kern-ai / refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
https://www.kern.ai
Apache License 2.0

[BUG] - Docker fails creating "refinery_default" network #245

Status: Closed (rasdani closed 1 year ago)

rasdani commented 1 year ago

Describe the bug: I managed to run ./start and ./stop once, and refinery worked well. On the second startup, ./start throws:

v1.8.0: Pulling from kernai/alfred
Digest: sha256:15fc59b24103ea7e9bebcbf08df1a76fb269778ed54331d320e3f173eaaaf0f1
Status: Image is up to date for kernai/alfred:v1.8.0
docker.io/kernai/alfred:v1.8.0
a25d61c6edc8a6a2b947ba8133d195021ba75fc7194e13b364184cb005790061
Creating docker-compose.yml file...
Creating jwks.json secret if not existing...
Checking and pulling exec env images...
Starting postgres container...
Creating network "refinery_default" with the default driver
could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network
Waiting for postgres to be ready...
Traceback (most recent call last):
  File "/program/start.py", line 42, in <module>
    if wait_until_postgres_is_ready():
  File "/program/util/postgres_helper.py", line 46, in wait_until_postgres_is_ready
    exit_code, _ = exec_command_on_container(PG_CONTAINER, "pg_isready")
  File "/program/util/docker_helper.py", line 50, in exec_command_on_container
    container = client.containers.list(filters={"name": container_name})[0]
IndexError: list index out of range

To Reproduce: Steps to reproduce the behavior:

  1. (maybe spawn many local docker networks?)
  2. ./start

Expected behavior: Docker should start up using the already fetched images/dependencies.

Desktop (please complete the following information):

Additional context: Starting from a cloned git repo. I already tried docker network prune and even docker container prune.

JWittmeyer commented 1 year ago

Hi rasdani,

this looks like the database container (PG_CONTAINER -> graphql-postgres) couldn't be found on startup.

When you run ./start after the error shows, could you enter docker ps in a console and provide the results so we can try to determine if there might be a naming issue?
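For illustration, the IndexError in the traceback comes from indexing `[0]` into an empty result of `client.containers.list(...)`. Below is a minimal sketch of a guarded lookup; it is not refinery's actual code, and `find_container` is a hypothetical helper name. It assumes `client` is a Docker SDK for Python client (for example from `docker.from_env()`).

```python
from typing import Any, Optional


def find_container(client: Any, container_name: str) -> Optional[Any]:
    """Return the first running container matching the name filter, or None.

    `client` is assumed to behave like a Docker SDK for Python client, whose
    `containers.list(filters=...)` returns a (possibly empty) list.
    """
    matches = client.containers.list(filters={"name": container_name})
    # An empty list means the container was never created or is not running,
    # which is what happened here after the network creation failed.
    return matches[0] if matches else None


# Hypothetical usage (requires the `docker` package and a running daemon):
#   import docker
#   container = find_container(docker.from_env(), "graphql-postgres")
#   if container is None:
#       print("graphql-postgres is not running; check the earlier log lines")
```

With a check like this, the startup script could report that the postgres container is missing instead of failing with a bare IndexError.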

rasdani commented 1 year ago

docker ps shows no containers, even with sudo and -a (since I deleted them all while troubleshooting). I tried deleting and cloning the repo again, same issue on ./start. Will try pip install next.

EDIT: same error with pip install and refinery start

EDIT2: Docker is pulling images again after I started Docker Desktop beforehand, fingers crossed! But I didn't run Docker Desktop the first time I got refinery working. :man_shrugging:

JWittmeyer commented 1 year ago

Not having any container would explain the error :D It doesn't explain the missing containers though, so that's a bit harder to work with. Since alfred (our startup manager container) is pulled and does some things, I don't think there is an issue with containers in general.

One message in your provided logs shows: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network

Nothing I've seen before, but looking around a bit, it might be related to a VPN issue. Are you using a VPN? (Stack Overflow link, GitHub issue link)
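To make the overlap idea concrete, here is a rough sketch using Python's standard `ipaddress` module. It checks which candidate address pools are still free given the networks a VPN already routes. The pool and route values in the usage comment are assumptions for illustration only, not values read from this setup, and `free_pools` is a hypothetical helper name.

```python
import ipaddress
from typing import Iterable, List


def free_pools(candidate_pools: Iterable[str], routed_networks: Iterable[str]) -> List[str]:
    """Return the candidate pools that overlap none of the routed networks."""
    routed = [ipaddress.ip_network(r) for r in routed_networks]
    free = []
    for pool in candidate_pools:
        net = ipaddress.ip_network(pool)
        # A pool is unusable if any already-routed network overlaps it.
        if not any(net.overlaps(r) for r in routed):
            free.append(pool)
    return free


# Hypothetical usage: a VPN that routes all of 172.16.0.0/12 and
# 192.168.0.0/16 leaves no free pool among these example candidates.
#   free_pools(["172.18.0.0/16", "192.168.0.0/20"],
#              ["172.16.0.0/12", "192.168.0.0/16"])
```

If the result is empty for the pools Docker tries, creating a network such as "refinery_default" would fail with the message above until the VPN is disconnected or a non-overlapping pool is configured.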

rasdani commented 1 year ago

My VPN was indeed the culprit! Works like a charm now :partying_face: Somehow I only found suggestions to clear networks or delete containers. Anyway, thank you! :pray: :relieved:

JWittmeyer commented 1 year ago

Awesome & Happy to help 🙂