exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

About #130, regarding subsequential requests #158

Closed: psj900918-r5 closed this issue 3 weeks ago

psj900918-r5 commented 3 weeks ago

Hi, I'm testing with 2 MacBooks. I found discussion #130, and it works fine with 1 or 2 devices. One thing I'm hoping for is support for subsequent requests across multiple devices.

For instance, if I pass the same request_id for every request from chatgpt_api, it works on a single device. However, it does not work with more than one device.

#130 fixes this by changing the request ID on every request.

I'm looking into the impact of subsequent requests, where the KVs accumulate continuously. In a single-device test, queries that all share the same request ID get slower with each HTTP request (due to the memory limit; I'm using 8GB MacBooks!). I'd like to check this with two devices in use. With the current code (which changes the request ID), there is no speed degradation, but the answers also show no context from earlier requests.
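To make the trade-off above concrete, here is a minimal toy sketch of a cache keyed by request ID. It is not exo's actual implementation: the `ToyKVCache` class, the `run` helper, and the whitespace "tokenizer" are all hypothetical names made up for illustration. It only shows why reusing one request ID makes the cached context grow with every request, while a fresh ID per request keeps it flat but loses earlier context.

```python
import uuid
from collections import defaultdict


class ToyKVCache:
    """Hypothetical stand-in for a per-request KV cache (not exo's real code)."""

    def __init__(self):
        # request_id -> list of cached entries (stand-ins for key/value tensors)
        self._store = defaultdict(list)

    def process(self, request_id, prompt_tokens):
        """Append this prompt's tokens under request_id; return the context length now cached."""
        self._store[request_id].extend(prompt_tokens)
        return len(self._store[request_id])


def run(prompts, reuse_id):
    """Send each prompt through a fresh cache, either reusing one ID or generating a new one each time."""
    cache = ToyKVCache()
    fixed_id = "fixed-request-id"  # assumed placeholder value
    context_lengths = []
    for prompt in prompts:
        request_id = fixed_id if reuse_id else str(uuid.uuid4())
        # Whitespace splitting is a crude stand-in for real tokenization.
        context_lengths.append(cache.process(request_id, prompt.split()))
    return context_lengths


prompts = ["my name is Sam", "what is my name", "say it again please"]
same_id_lengths = run(prompts, reuse_id=True)    # context grows: [4, 8, 12]
fresh_id_lengths = run(prompts, reuse_id=False)  # context stays flat: [4, 4, 4]
print(same_id_lengths, fresh_id_lengths)
```

With the same ID, each request sees everything cached before it, so the work and memory per request keep growing. With a new ID every time, each request starts from an empty context, which matches the "no degradation, but no context" behavior I described.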

At first, I tried sending the requests through the web UI. However, with multiple devices, the web UI stops after 3 to 4 subsequent requests.

Could you share any ideas or thoughts on how to make subsequent requests work?

AlexCheema commented 3 weeks ago

Hey @psj900918-r5 thanks a lot for the detailed issue.

There was some information lost in translation, but I think I understood that the issue was a memory leak, i.e. memory would keep increasing after each request. I was able to reproduce this and have now pushed a fix. Please make sure you run pip install . before trying again, since I updated a dependency.

Please let me know if this fixes the issue for you. If it did not fix the issue, reopen the GitHub issue.