cloud-py-api / ai_image_generator_bot

Nextcloud AI image generator Talk bot
https://apps.nextcloud.com/apps/ai_image_generator_bot
MIT License
5 stars 0 forks

Deployment stuck at 0% #6

Open red3333 opened 6 months ago

red3333 commented 6 months ago

I followed the tutorial on my Nextcloud 28: installed exApp, created a daemon worker, added the AiImageGeneratorBot app. After several hours, the progress remained stuck at "0% deploying", and I eventually got a heartbeat failure. I checked my docker install:

Not sure if it is important, but I don't have any GPU on that server, only plenty of RAM and CPU cores. Does anyone have an idea?

bigcat88 commented 6 months ago

Can you describe your configuration? Where is Nextcloud installed, and where is Docker located? Also, can you ping the Nextcloud instance from the Docker container?

Do you use Docker Socket Proxy or not?

red3333 commented 6 months ago

All Machines are on a ESXi hypervisor:

bigcat88 commented 6 months ago
  • I can curl my Nextcloud main page from docker and from inside the nc_app_ai_image_generator_bot container

Strange, then it should work. I assume you can also curl the URL that is set in the Daemon config you created? Can you show part of that URL? Maybe it is not a valid one.

red3333 commented 6 months ago

I switched to wget for my tests, as the URL is https: `wget https://my.domain.name/index.php` gives me the main page from the docker machine, the Daemon container and the nc_app_ai_image_generator_bot container. The same URL is configured in the exApp Daemon configuration (and is put in the NEXTCLOUD_URL= env variable of the generator bot).

I also tried the https proxy daemon. Same results as before.
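For anyone repeating this check from inside a container that has Python but no wget or curl, here is a minimal sketch using only the standard library. The helper name `probe` is made up for illustration; the URL to pass in is whatever is set in NEXTCLOUD_URL, with `/index.php` appended as in the wget test.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Return (reachable, detail) for a single GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the status is an error.
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, TLS problem, timeout, and similar.
        return False, str(exc)


# Example call (placeholder domain taken from the comment above):
# print(probe("https://my.domain.name/index.php"))
```

Note that this only tests the direction container to Nextcloud. It says nothing about whether Nextcloud (through the daemon or proxy) can reach the app.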

red3333 commented 6 months ago

So I found a problem with cloud-py-api/docker-socket-proxy not forwarding requests to the app (e.g. the /heartbeat request was seen in the docker-socket-proxy logs, but not in the nc_app_ai_image_generator_bot container). I don't know the exact reason, but replacing localhost with 127.0.0.1 in haproxy_ex_apps.cfg solved the problem.
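The cause is not established in this thread. One possibility, stated here only as an assumption, is that `localhost` resolves to the IPv6 address `::1` first on that host, while the app listens on IPv4 only (the app log in this thread shows it bound to 127.0.0.1, port 23000). A small sketch to see what the system resolver returns for a name; `resolve_order` is a hypothetical helper, and HAProxy's own name resolution may behave differently from Python's:

```python
import socket


def resolve_order(host: str, port: int) -> list[tuple[str, str]]:
    """List (family, address) pairs in the order the resolver returns them."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    seen: list[tuple[str, str]] = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        entry = (names.get(family, str(family)), sockaddr[0])
        if entry not in seen:  # drop duplicates, keep resolver order
            seen.append(entry)
    return seen


# Example: compare the two spellings on the host that runs the proxy.
# print(resolve_order("localhost", 23000))
# print(resolve_order("127.0.0.1", 23000))
```

If the first entry for `localhost` is an IPv6 address, a client that tries only that address would never reach a server bound to 127.0.0.1, which would match the symptom described above.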

Now, my nc_app_ai_image_generator_bot has received the /heartbeat and the /init requests:

TRACE: 127.0.0.1:54674 - HTTP connection made
TRACE: 127.0.0.1:54674 - ASGI [2] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('127.0.0.1', 23000), 'client': ('127.0.0.1', 54674), 'scheme': 'http', 'root_path': '', 'headers': '<...>', 'state': {}, 'method': 'GET', 'path': '/heartbeat', 'raw_path': b'/heartbeat', 'query_string': b''}
TRACE: 127.0.0.1:54674 - ASGI [2] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
INFO: 127.0.0.1:54674 - "GET /heartbeat HTTP/1.1" 200 OK
TRACE: 127.0.0.1:54674 - ASGI [2] Send {'type': 'http.response.body', 'body': '<15 bytes>'}
TRACE: 127.0.0.1:54674 - ASGI [2] Completed
TRACE: 127.0.0.1:54674 - ASGI [3] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('127.0.0.1', 23000), 'client': ('127.0.0.1', 54674), 'scheme': 'http', 'root_path': '', 'headers': '<...>', 'state': {}, 'method': 'POST', 'path': '/init', 'raw_path': b'/init', 'query_string': b''}
TRACE: 127.0.0.1:54674 - ASGI [3] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
INFO: 127.0.0.1:54674 - "POST /init HTTP/1.1" 200 OK
TRACE: 127.0.0.1:54674 - ASGI [3] Send {'type': 'http.response.body', 'body': '<2 bytes>'}
TRACE: 127.0.0.1:54674 - HTTP connection lost

... but still no visible activity on nc_app_ai_image_generator_bot : no CPU usage, no RAM usage (actually, some RAM usage, but in OS cache, may be related to other containers), no other message. In Nextcloud, the app is now at "0% initialization", status: "initialization timed out".

red3333 commented 6 months ago

I tracked the error a bit further:

"/heartbeat" works fine.

The "/init" request returns a successful "OK". But then set_init_status raises an exception:

ERROR: Exception in ASGI application
Traceback (most recent call last):
[...]
nc_py_api._exceptions.NextcloudException: [401] Unauthorized <request: PUT /ocs/v1.php/apps/app_api/apps/status/ai_image_generator_bot>

As a result, models--stabilityai--sdxl-turbo is not downloaded. I then managed to download the model from outside the container and put the result in the container's persistent storage.
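For reference, a folder named models--stabilityai--sdxl-turbo follows the Hugging Face cache naming scheme, so one way to fetch the model outside the container is `snapshot_download` from the `huggingface_hub` package. This is only a sketch of that approach, not necessarily what was done here. The helper names are made up for illustration, and where the bot expects the cache inside its persistent storage is not stated in this thread.

```python
from pathlib import Path


def hf_cache_folder_name(repo_id: str, repo_type: str = "model") -> str:
    """Build the folder name the Hugging Face cache uses, e.g. models--org--name."""
    return f"{repo_type}s--" + repo_id.replace("/", "--")


def download_model(repo_id: str, cache_dir: Path) -> Path:
    """Download a full model snapshot into cache_dir and return its local path."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, cache_dir=str(cache_dir)))


# Example (needs network access and `pip install huggingface_hub`):
# path = download_model("stabilityai/sdxl-turbo", Path("./hf_cache"))
# The files land under ./hf_cache/models--stabilityai--sdxl-turbo/
```

The resulting `models--stabilityai--sdxl-turbo` directory can then be copied into whatever volume the container uses as its model cache.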

After a (long) time, the app status becomes "Initialization timeout 0%" and it keeps waiting. However, the Nextcloud app_api then sends an "/enabled?enabled=1" request, which returns a successful "OK". In Nextcloud, the app remains at "Initialization timeout", but the bot can be added to the Talk app, and requests (e.g. @image cinematic portrait of fluffy cat with black eyes) successfully generate images.