nextcloud / context_chat_backend

GNU Affero General Public License v3.0

ExApp context_chat_backend initialization timed out #36

Closed xavierg1909 closed 4 months ago

xavierg1909 commented 6 months ago

Describe the bug ExApp context_chat_backend initialization timed out by one click

To Reproduce Steps to reproduce the behavior:

  1. Go to External Apps
  2. Search Context Chat Backend
  3. Click on Install and Deploy
  4. See error

Expected behavior context_chat_backend enabled

Server logs (if applicable) _nextcloud-aio-nextcloud_logs (2).txt


Context Chat Backend logs (if applicable, from the docker container) _nc_app_context_chat_backend_logs (5).txt


Screenshots

image


kyteinsky commented 6 months ago

Hello, that is not a big issue, but definitely something to improve. What happens is that the context chat PHP app tries to index all the users' files while the backend isn't initialised yet, which is why you get the 503 response.
It would help to disable the PHP app (occ app:disable context_chat) until the backend is installed. It downloads more than 5 GB, so please be patient there :)

How do you arrive at the conclusion that it timed out? Did something happen after this which is not in the logs?

socialize-IT commented 5 months ago

How do you arrive at the conclusion that it timed out? Did something happen after this which is not in the logs?

You can see the timeout message in the posted screenshot.

xavierg1909 commented 5 months ago

Update: I just did a clean installation using AIO with the new version 29, without any user files. The first thing I did was install context_chat_backend from AppAPI (now at version 2.5.0).

Unfortunately the problem persists: it reaches 50% and stays there. I was able to confirm that it downloads the image and deploys the container, but it doesn't work.

backend

image

kyteinsky commented 5 months ago

You can see the timeout message in the posted screenshot.

oops :)

@xavierg1909 Thanks for the update. Would you mind posting the full log of the docker container to see what exactly is going wrong?

xavierg1909 commented 5 months ago

Hello, inexplicably the AI already works today. The strange thing is that the timeout message for context_chat_backend persists. I'm attaching the screenshot and the logs in case they help you fix that. I hope this helps.

image _nc_app_context_chat_backend_logs.txt _nextcloud-aio-nextcloud_logs (1).txt

kyteinsky commented 5 months ago

That is very helpful indeed. Thank you very much!

The models were downloading when you tried to launch a query with assistant, same with the loading of sources. The error messages definitely need improvement and the context chat php app will also need a check to see if the backend's initialisation has completed before spamming the backend with the documents.
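A readiness check of the kind described above could look roughly like the following. This is only a sketch, not the context chat app's actual code: `wait_until_ready`, `probe`, and the timing values are hypothetical names chosen for illustration. The probe is assumed to return an HTTP status code, with 503 meaning the backend is still downloading or loading models.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], int],
    timeout_s: float = 3600.0,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns HTTP 200 or `timeout_s` elapses.

    `probe` is a hypothetical callable returning the backend's HTTP status
    code. 503 (and any other non-200 status) is treated as "not ready yet"
    and retried until the deadline.
    """
    deadline = clock() + timeout_s
    while True:
        if probe() == 200:
            return True  # backend answered normally, safe to send documents
        if clock() >= deadline:
            return False  # give up and let the caller report a clear error
        sleep(interval_s)
```

The indexing job would call this once before queueing documents and skip the run if it returns False, instead of sending every file and collecting 503 responses.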

I'll look into AppAPI's reporting of the timeout.

Also, thanks for the logs again, looks like a regression with odfpy being removed 🙈 (it is required to parse ods files).

Will fix all the issues on Monday. Have a nice weekend.

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

architectonio commented 4 months ago

I got the context_chat_backend deployed, however it isn't working. The "ExApps" interface shows "Heartbeat check failed" and "Healthchecking", and this after more than 24 hours. After waiting 12 hours, I restarted Docker and then the whole server. Unfortunately, the status remains the same.

kyteinsky commented 4 months ago

Hello @architectonio, your issue seems similar. Can you provide me with the server logs and the docker container logs?

architectonio commented 4 months ago

Hello @kyteinsky, the same happens even with the ExApp "Test Deploy". Which log do you need, "docker_socket_proxy" or "context_chat_backend"? Please let me know.

PS. Since the "context_chat_backend" ExApp wasn't working, I removed it. I'll deploy it again if you need its log.

kyteinsky commented 4 months ago

The context_chat_backend's logs, please.

architectonio commented 4 months ago

I am trying to deploy it again, but the deployment process remains at 0%. The docker_socket_proxy container is up, running, and reachable. Now I am going to restart the web server, PHP, and the docker daemon, hoping I get the ExApp deployed.

kyteinsky commented 4 months ago

Do you see any server error logs regarding this?

architectonio commented 4 months ago

Here is the log:

Detecting hardware...
Detected hardware: cuda
Config file already exists in the persistent storage ("/nc_app_context_chat_backend_data/config.yaml").
App config:
{
  "debug": true,
  "disable_aaa": false,
  "httpx_verify_ssl": true,
  "use_colors": true,
  "uvicorn_workers": 1,
  "disable_custom_model_download": false,
  "model_download_uri": "https://download.nextcloud.com/server/apps/context_chat_backend",
  "vectordb": [ "chroma", { "is_persistent": true } ],
  "embedding": [ "instructor", { "model_name": "hkunlp/instructor-base", "model_kwargs": { "device": "cuda" } } ],
  "llm": [
    "llama",
    {
      "model_path": "dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
      "n_batch": 10,
      "n_ctx": 8192,
      "n_gpu_layers": -1,
      "template": "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant, good at finding relevant context from documents to answer questions provided by the user. <|im_end|>\n<|im_start|> user\nUse the following documents as context to answer the question at the end. REMEMBER to excersice source critisicm as the documents are returned by a search provider that can return unrelated documents.\n\nSTART OF CONTEXT: \n{context} \n\nEND OF CONTEXT!\n\nIf you don't know the answer or are unsure, just say that you don't know, don't try to make up an answer. Don't mention the context in your answer but rather just answer the question directly. \nQuestion: {question} Let's think this step-by-step. \n<|im_end|>\n<|im_start|> assistant\n",
      "no_ctx_template": "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant.<|im_end|>\n<|im_start|> user\n{question}<|im_end|>\n<|im_start|> assistant\n",
      "end_separator": "<|im_end|>",
      "model_kwargs": { "device": "cuda" }
    }
  ]
}

App disabled at startup
INFO: Started server process [1]
INFO: Waiting for application startup.
TRACE: ASGI [1] Started scope={'type': 'lifespan', 'asgi': {'version': '3.0', 'spec_version': '2.0'}, 'state': {}}
TRACE: ASGI [1] Receive {'type': 'lifespan.startup'}
TRACE: ASGI [1] Send {'type': 'lifespan.startup.complete'}
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:23001 (Press CTRL+C to quit)

kyteinsky commented 4 months ago

looks perfect. At this point app_api should've called the /init endpoint for model download. I'll check with AppAPI's devs about the issue.
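Since the log shows Uvicorn listening on port 23001 and the ExApps interface reports a heartbeat check, one quick way to see whether the container answers at all is to request it directly. The following is a hedged sketch: the `/heartbeat` path and the `localhost` host in the usage note are assumptions not confirmed in this thread, and `http_status` with its `opener` parameter is a hypothetical helper, not part of AppAPI.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(
    url: str,
    timeout_s: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[int]:
    """Return the HTTP status code for a GET on `url`, or None if unreachable.

    `opener` defaults to urllib's urlopen and is only a parameter so the
    function can be exercised without a running container.
    """
    try:
        with opener(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None
```

For example, `http_status("http://localhost:23001/heartbeat")` returning None would point at a networking problem between the caller and the container, while a numeric status means the server process itself is reachable.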

architectonio commented 4 months ago

OK. Is there anything I still can do?

kyteinsky commented 4 months ago

You may manually download the gguf and tar.gz files from here: https://download.nextcloud.com/server/apps/context_chat_backend/

Extract the tar.gz file and place both the extracted folder and the gguf file in the docker volume attached to the context chat's container (mounted at /nc_ something). docker cp should work for this purpose.

architectonio commented 4 months ago

OK, I'll try to figure it out and give you feedback. Thanks a lot.

architectonio commented 4 months ago

Is this the Volume you are referring to?

docker volume inspect nc_app_context_chat_backend_data

[
  {
    "CreatedAt": "2024-05-29T07:19:08+02:00",
    "Driver": "local",
    "Labels": null,
    "Mountpoint": "/var/lib/docker/volumes/nc_app_context_chat_backend_data/_data",
    "Name": "nc_app_context_chat_backend_data",
    "Options": null,
    "Scope": "local"
  }
]

If yes, I just need to proceed as you wrote by copying the contents of the downloaded files into the volume, and then, I guess, restart the container?

kyteinsky commented 4 months ago

yes and yes. Remember to extract the tar.gz file before moving it to the volume.
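The host directory to copy into is the "Mountpoint" field of the `docker volume inspect` output. A small sketch for reading it programmatically follows; `volume_mountpoint` is a hypothetical helper name, and it assumes the JSON has the array-of-objects shape shown in this thread.

```python
import json


def volume_mountpoint(inspect_output: str, name: str) -> str:
    """Return the host Mountpoint for volume `name`.

    `inspect_output` is the JSON text printed by `docker volume inspect`,
    which is a list with one object per inspected volume.
    """
    for entry in json.loads(inspect_output):
        if entry.get("Name") == name:
            return entry["Mountpoint"]
    raise LookupError(f"volume {name!r} not found in inspect output")
```

With the output posted above, `volume_mountpoint(text, "nc_app_context_chat_backend_data")` yields the `_data` directory under `/var/lib/docker/volumes`. Writing there usually requires root on the host.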

kyteinsky commented 4 months ago

By the way, since you said the Test Deploy doesn't work for you, would you mind creating an issue with the relevant details (server logs, socket proxy logs, and maybe the console logs) and screenshots here: https://github.com/cloud-py-api/app_api/issues ?

architectonio commented 4 months ago

I did it: https://github.com/cloud-py-api/app_api/issues/300. Please let me know if the created issue has all the information you need.

kyteinsky commented 4 months ago

Thank you, replying there and closing this issue.

architectonio commented 4 months ago

@kyteinsky, me again. Even after copying the files into the Docker volume and restarting the Docker daemon, the issue unfortunately remains the same. Here is the content of the volume:

ls -l /var/lib/docker/volumes/nc_app_context_chat_backend_data/_data
total 5011180
-rw-r--r-- 1 root root 3207 May 29 07:19 config.yaml
-rwxr--r-- 1 root root 5131421440 Jun 7 13:14 dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
drwxr-xr-x 4 root root 4096 Dec 21 12:41 hkunlp_instructor-base
drwxr-x--- 2 root root 4096 May 29 07:19 model_files
-rw-r--r-- 1 root root 32 Jun 7 15:13 repair.info
drwxr-x--- 2 root root 4096 May 29 07:19 vector_db_data

Since even the Test Deploy ExApp isn't working well, now I believe that something is wrong with the docker socket proxy.

docker ps

3d02fb1d6b04 ghcr.io/cloud-py-api/nextcloud-appapi-dsp:release "/bin/bash start.sh" 9 days ago Up 9 minutes (healthy) 0.0.0.0:2375->2375/tcp, :::2375->2375/tcp nextcloud-appapi-dsp

docker inspect nextcloud-appapi-dsp

[ { "Id": "3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e", "Created": "2024-05-29T05:12:18.405325929Z", "Path": "/bin/bash", "Args": [ "start.sh" ], "State": { "Status": "running", "Running": true, "Paused": false, "Restarting": false, "OOMKilled": false, "Dead": false, "Pid": 1746048, "ExitCode": 0, "Error": "", "StartedAt": "2024-06-07T13:13:09.681232788Z", "FinishedAt": "2024-06-07T13:13:00.795691168Z", "Health": { "Status": "healthy", "FailingStreak": 0, "Log": [ { "Start": "2024-06-07T15:22:28.698304527+02:00", "End": "2024-06-07T15:22:28.773476851+02:00", "ExitCode": 0, "Output": "" }, { "Start": "2024-06-07T15:22:38.824927991+02:00", "End": "2024-06-07T15:22:38.888937822+02:00", "ExitCode": 0, "Output": "" }, { "Start": "2024-06-07T15:22:48.942928435+02:00", "End": "2024-06-07T15:22:49.014493303+02:00", "ExitCode": 0, "Output": "" }, { "Start": "2024-06-07T15:22:59.177293415+02:00", "End": "2024-06-07T15:22:59.244118844+02:00", "ExitCode": 0, "Output": "" }, { "Start": "2024-06-07T15:23:09.296206704+02:00", "End": "2024-06-07T15:23:09.368839036+02:00", "ExitCode": 0, "Output": "" } ] } }, "Image": "sha256:b18e1d6b96402528193c61812287c10464db01d47e6f29cdecd91f9745b9453f", "ResolvConfPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/resolv.conf", "HostnamePath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/hostname", "HostsPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/hosts", "LogPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e-json.log", "Name": "/nextcloud-appapi-dsp", "RestartCount": 0, "Driver": "overlay2", "Platform": "linux", "MountLabel": "", "ProcessLabel": "", "AppArmorProfile": "unconfined", "ExecIDs": null, "HostConfig": { "Binds": [ 
"/var/run/docker.sock:/var/run/docker.sock" ], "ContainerIDFile": "", "LogConfig": { "Type": "json-file", "Config": {} }, "NetworkMode": "bridge", "PortBindings": { "2375/tcp": [ { "HostIp": "", "HostPort": "2375" } ] }, "RestartPolicy": { "Name": "always", "MaximumRetryCount": 0 }, "AutoRemove": false, "VolumeDriver": "", "VolumesFrom": null, "CapAdd": null, "CapDrop": null, "CgroupnsMode": "private", "Dns": [], "DnsOptions": [], "DnsSearch": [], "ExtraHosts": null, "GroupAdd": null, "IpcMode": "private", "Cgroup": "", "Links": null, "OomScoreAdj": 0, "PidMode": "", "Privileged": true, "PublishAllPorts": false, "ReadonlyRootfs": false, "SecurityOpt": [ "label=disable" ], "UTSMode": "", "UsernsMode": "", "ShmSize": 67108864, "Runtime": "runc", "ConsoleSize": [ 0, 0 ], "Isolation": "", "CpuShares": 0, "Memory": 0, "NanoCpus": 0, "CgroupParent": "", "BlkioWeight": 0, "BlkioWeightDevice": [], "BlkioDeviceReadBps": null, "BlkioDeviceWriteBps": null, "BlkioDeviceReadIOps": null, "BlkioDeviceWriteIOps": null, "CpuPeriod": 0, "CpuQuota": 0, "CpuRealtimePeriod": 0, "CpuRealtimeRuntime": 0, "CpusetCpus": "", "CpusetMems": "", "Devices": [], "DeviceCgroupRules": null, "DeviceRequests": null, "KernelMemory": 0, "KernelMemoryTCP": 0, "MemoryReservation": 0, "MemorySwap": 0, "MemorySwappiness": null, "OomKillDisable": null, "PidsLimit": null, "Ulimits": null, "CpuCount": 0, "CpuPercent": 0, "IOMaximumIOps": 0, "IOMaximumBandwidth": 0, "MaskedPaths": null, "ReadonlyPaths": null }, "GraphDriver": { "Data": { "LowerDir": 
"/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347-init/diff:/var/lib/docker/overlay2/ac50879b0255f066f84a0b47b3206e028c8af2e8e6503fdeb753b5b237cd68ef/diff:/var/lib/docker/overlay2/591d1b230c4b8d5f2ddd4172ebc91315460942fc4ad9da127af178a577f8c186/diff:/var/lib/docker/overlay2/59df22bf62d3de6be5d4c8130855dfb480783789b20ce8fbc2dff9589a8d6d50/diff:/var/lib/docker/overlay2/c7bf4c9593cb13d483dd3fba00f4ae3315b08624005f0e12229280c262abc020/diff:/var/lib/docker/overlay2/e5f5509fbdfe27b76beee06ad3525dcb41f2d57a4d11d81ff1132f601f2d09da/diff:/var/lib/docker/overlay2/06e8da6f1c2cd34e2abac9cc5da702709fbe93367b12a600b426322ea4f68f44/diff:/var/lib/docker/overlay2/a6fcf681693b8cc6e1945e064ee0ea150f417ed19eb2435bfc66e5e81e7475ad/diff:/var/lib/docker/overlay2/1740c94b57f2ac262c2da7227b98fa3c5d7e23ffb33ca22468aa5c46bdbb931e/diff:/var/lib/docker/overlay2/1f8575e8282f19d381d1e4e88d1de7bbf510047c61b3de2c20040a56b3d458ad/diff:/var/lib/docker/overlay2/fa767248d51a8f064e893d5731c64382eed07123a94e69e05a114c6fb70bf33f/diff", "MergedDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/merged", "UpperDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/diff", "WorkDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/work" }, "Name": "overlay2" }, "Mounts": [ { "Type": "bind", "Source": "/var/run/docker.sock", "Destination": "/var/run/docker.sock", "Mode": "", "RW": true, "Propagation": "rprivate" } ], "Config": { "Hostname": "nextcloud-appapi-dsp", "Domainname": "", "User": "root", "AttachStdin": false, "AttachStdout": false, "AttachStderr": false, "ExposedPorts": { "2375/tcp": {} }, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "HAPROXY_VERSION=2.9.7", "HAPROXY_URL=https://www.haproxy.org/download/2.9/src/haproxy-2.9.7.tar.gz", 
"HAPROXY_SHA256=d1a0a56f008a8d2f007bc0c37df6b2952520d1f4dde33b8d3802710e5158c131", "HAPROXY_PORT=2375", "BIND_ADDRESS=*", "EX_APPS_NET=localhost", "EX_APPS_COUNT=30", "TIMEOUT_CONNECT=10s", "TIMEOUT_CLIENT=30s", "TIMEOUT_SERVER=1800s" ], "Cmd": null, "Healthcheck": { "Test": [ "CMD-SHELL", "/healthcheck.sh" ], "Interval": 10000000000, "Timeout": 10000000000, "Retries": 9 }, "Image": "ghcr.io/cloud-py-api/nextcloud-appapi-dsp:release", "Volumes": null, "WorkingDir": "/", "Entrypoint": [ "/bin/bash", "start.sh" ], "OnBuild": null, "Labels": { "com.centurylinklabs.watchtower.enable": "false" }, "StopSignal": "SIGUSR1" }, "NetworkSettings": { "Bridge": "", "SandboxID": "98f89015eaaad8d3b92a8a9ce141f27c28595c734dfbe60327423e7bfc23c9ac", "HairpinMode": false, "LinkLocalIPv6Address": "", "LinkLocalIPv6PrefixLen": 0, "Ports": { "2375/tcp": [ { "HostIp": "0.0.0.0", "HostPort": "2375" }, { "HostIp": "::", "HostPort": "2375" } ] }, "SandboxKey": "/var/run/docker/netns/98f89015eaaa", "SecondaryIPAddresses": null, "SecondaryIPv6Addresses": null, "EndpointID": "234dba6a71296e5c51a88d66de5bfacc1d545a09508fddf37bc9b1af36ca1f11", "Gateway": "172.17.0.1", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "IPAddress": "172.17.0.3", "IPPrefixLen": 16, "IPv6Gateway": "", "MacAddress": "02:42:ac:11:00:03", "Networks": { "bridge": { "IPAMConfig": null, "Links": null, "Aliases": null, "NetworkID": "18400b33df6b221a3abcff93e6bbb499aedc186f4ad5dc8c44b90d798464e69f", "EndpointID": "234dba6a71296e5c51a88d66de5bfacc1d545a09508fddf37bc9b1af36ca1f11", "Gateway": "172.17.0.1", "IPAddress": "172.17.0.3", "IPPrefixLen": 16, "IPv6Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "MacAddress": "02:42:ac:11:00:03", "DriverOpts": null } } } } ]

My Nextcloud instance talks with nextcloud-appapi-dsp through docker.sock (/var/run/docker.sock)

kyteinsky commented 4 months ago

Ah, my bad, you need to move dolphin-2.2.1-mistral-7b.Q5_K_M.gguf and hkunlp_instructor-base inside the model_files folder. Restarting the container should then enable it by default.
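The manual staging step can be sketched as a short script. This is an illustration under assumptions rather than an official procedure: `stage_models` is a hypothetical helper, the paths passed to it are whatever you downloaded and wherever the volume is mounted on your host, and writing into a Docker volume directory normally requires root.

```python
import shutil
import tarfile
from pathlib import Path


def stage_models(gguf: Path, tarball: Path, volume_root: Path) -> Path:
    """Place the model files inside <volume_root>/model_files.

    Copies the gguf file as-is and extracts the tar.gz archive, so the
    extracted folder (not the archive itself) ends up next to the gguf file.
    """
    model_dir = volume_root / "model_files"
    model_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(gguf, model_dir / gguf.name)
    with tarfile.open(tarball, "r:gz") as archive:
        # Only extract archives from a source you trust.
        archive.extractall(model_dir)
    return model_dir
```

After running it against the volume's mountpoint, `ls` on `model_files` should show both the gguf file and the extracted folder before the container is restarted.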

architectonio commented 3 months ago

OK, I did it. I am going to run some tests and then give you feedback. Thanks again.

architectonio commented 3 months ago

After a lot of tests, including removing and reinstalling images and containers, and of course copying the files above into the container, unfortunately nothing has changed. I am going to wait until a working version of the app is released and then give it another try.

architectonio commented 3 months ago

Any news on this?