pathwaycom / llm-app

Dynamic RAG for enterprise. Ready to run with Docker, ⚡ in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
https://pathway.com/developers/templates/
MIT License
3.36k stars 192 forks

Local App Variant not Responding to post request #40

Closed Pipboyguy closed 10 months ago

Pipboyguy commented 11 months ago

Logs don't show any activity. The contextful variant works as expected.

embe-pw commented 11 months ago

Thanks for reporting the issue. Could you provide some more details, including:

embe-pw commented 11 months ago

Note: we are currently investigating an issue with the local variant when running with Docker on ARM Macs (it seems to be very slow) – if that is your configuration, you may want to try running directly with Poetry.

Pipboyguy commented 11 months ago

I ran this directly with Poetry (for GPU support). That is the setup that raised this issue in the first place.

I should add that I was getting segfaults before it started working, and after that it stopped responding to POST requests.

embe-pw commented 11 months ago

Could you share some details about the segfaults? Also, could you share the log messages?

Pipboyguy commented 11 months ago

The segfaults only showed "core dumped" from the pathway process, with no further logs or indication of what went wrong.
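
If it helps, one thing I could try the next time it segfaults is enabling Python's standard faulthandler, so the crash at least prints a stack instead of just "core dumped" (sketch; the entry-point name is my guess):

```python
# Minimal sketch: dump a Python traceback when the process receives a
# fatal signal (SIGSEGV, SIGABRT, ...) instead of just "core dumped".
import faulthandler
faulthandler.enable()

# ... then start the app as usual; equivalently, without code changes:
#   PYTHONFAULTHANDLER=1 python main.py
# (main.py stands in for whatever entry point the app uses)
```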

For some reason it started working after a couple more tries, but when I run `curl -i --data '{"user": "user", "query": "How to use LLMs in Pathway?"}' http://localhost:8080/` all I get is:

                                                              LOGS
  [08/07/23 14:04:21] INFO     Preparing Pathway computation
                      INFO     FilesystemReader-0: 0 entries (1 minibatch(es)) have been sent to the engine
                      INFO     PythonReader-1: 0 entries (1 minibatch(es)) have been sent to the engine

and then nothing else. So I'd label it a timeout, because I've left it running for quite a while, and my system is not small at 12 cores and 64 GB of RAM. It would be better if the logs were more verbose.
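
A quick way to distinguish a hang from a slow first response is to give the request an explicit timeout (same payload and port as the curl call above; the 120-second cutoff is arbitrary):

```python
import requests

# Same request as the curl call above, but with a hard timeout so a hung
# pipeline fails fast instead of blocking the terminal indefinitely.
try:
    resp = requests.post(
        "http://localhost:8080/",
        json={"user": "user", "query": "How to use LLMs in Pathway?"},
        timeout=120,  # seconds; generous enough to cover a first model load
    )
    print(resp.status_code, resp.text)
except requests.exceptions.Timeout:
    print("No response within 120 s, so the pipeline is stuck rather than slow.")
```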

embe-pw commented 11 months ago

Could you share what the monitoring display above the logs shows? Do the latencies keep increasing?

embe-pw commented 11 months ago

Could you try running it on the kuba/async-cap branch?

Pipboyguy commented 11 months ago
                         PATHWAY PROGRESS DASHBOARD

                  no. messages
                   in the last     in the last
    connector        minibatch          minute   since start
   ────────────────────────────────────────────────────────────
    FilesystemRe…            0               0             6
    PythonReader…            0               0             1

    operator   latency to wall clock [ms]   lag to input [ms]
   ───────────────────────────────────────────────────────────
    input                              30
    output                             30                   0

   Above you can see the latency of input and output operators.
   The latency is measured as the difference between the time
   when the operator processed the data and the time when
   pathway acquired the data.

                                                              LOGS
  [08/07/23 15:49:40] INFO     Preparing Pathway computation
                      INFO     FilesystemReader-0: 0 entries (1 minibatch(es)) have been sent to the engine
                      INFO     PythonReader-1: 0 entries (1 minibatch(es)) have been sent to the engine
  [08/07/23 15:49:45] INFO     FilesystemReader-0: 6 entries (102 minibatch(es)) have been sent to the engine
  [08/07/23 15:49:46] INFO     PythonReader-1: 1 entries (126 minibatch(es)) have been sent to the engine
  [08/07/23 15:49:50] INFO     FilesystemReader-0: 0 entries (100 minibatch(es)) have been sent to the engine
  [08/07/23 15:49:51] INFO     PythonReader-1: 0 entries (101 minibatch(es)) have been sent to the engine

At first I suspected the delay was the Hugging Face encoders being downloaded in the background, but it's been way too long for that to be the issue. It would be nice if the logs showed any such artefacts being downloaded, for visibility.
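
One way to rule that out is to fetch the embedder once outside of Pathway, so any download progress is visible and the weights land in the local cache (sketch; all-MiniLM-L6-v2 is just a stand-in for whatever model the app's .env points at):

```python
from sentence_transformers import SentenceTransformer

# Downloading the model here shows the progress bars explicitly; on the
# next app start the cached weights are reused, so a silent background
# download can be ruled out as the cause of the delay.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
print(model.encode(["warm-up sentence"]).shape)
```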

I ran the above on the suggested branch. Same result on the face of it.

mdmalhou commented 10 months ago

I see 6 entries read by the engine, which is what's expected (the pathway-docs-small data). I couldn't reproduce this non-response issue on WSL or Linux. I suspect, though, that the problem is not having enough memory to run the embedding or inference models. Download bars are shown only on the first run, before pathway is started, since the models are cached:

[screenshot: model download progress bars]
mdmalhou commented 10 months ago

I reproduced the segfault error on Colab. It happens when running one model on GPU (the embedder runs on GPU by default if one is available) and the other on CPU (HFTextGenerationTask has device='cpu' by default). I'll add the device argument to SentenceTransformerTask right away. Remember to clear the cache directory specified in .env. Could you check out PR50, @Pipboyguy?
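
Until that change lands, the same effect can be had by pinning both models to one device explicitly; a rough sketch using the underlying libraries directly rather than the llm-app wrappers (model names are placeholders):

```python
from sentence_transformers import SentenceTransformer
from transformers import pipeline
import torch

# Pick a single device for both models; mixing a GPU embedder with a CPU
# generator is exactly the combination that crashed above.
device = "cuda" if torch.cuda.is_available() else "cpu"

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device=device)
generator = pipeline(
    "text-generation",
    model="gpt2",
    device=0 if device == "cuda" else -1,  # transformers expects a device index here
)

print(embedder.encode(["hello"]).shape)
print(generator("Pathway is", max_new_tokens=8)[0]["generated_text"])
```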

Pipboyguy commented 10 months ago

@mdmalhou That resolves the issue on my side. Thanks @mdmalhou!