leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.
https://search.lepton.run
Apache License 2.0
7.54k stars, 949 forks

Please help me see how I can solve this problem by deploying in lepton #33

Open 1378056564 opened 5 months ago

1378056564 commented 5 months ago

The error seems to be caused by an exhausted resource quota. How should I handle it? A local deployment reports the same error. What is the solution? Thanks.

LOG:
2024-01-30 18:21:23.240 | INFO | search_with_lepton:init:267 - Creating KV. May take a while for the first time.
Failed to launch photon: <class 'RuntimeError'>: Failed to create KV jukesearch. Error: 400 b'{"code":"ResourceExhausted","message":"quota exhausted"}'.
Traceback: Traceback (most recent call last):
  File "/opt/lepton/venv/lib/python3.10/site-packages/leptonai/cli/photon.py", line 785, in run
    photon.launch(port=port)
  File "/opt/lepton/venv/lib/python3.10/site-packages/leptonai/photon/photon.py", line 869, in launch
    self._call_init_once()
  File "/opt/lepton/venv/lib/python3.10/site-packages/leptonai/photon/photon.py", line 598, in _call_init_once
    self._init_res = self.init()
  File "/Users/jiayq/Documents/code/search_with_lepton/search_with_lepton.py", line 268, in init
  File "/opt/lepton/venv/lib/python3.10/site-packages/leptonai/kv.py", line 182, in init
    raise RuntimeError(
RuntimeError: Failed to create KV jukesearch. Error: 400 b'{"code":"ResourceExhausted","message":"quota exhausted"}'.

Yangqing commented 5 months ago

Thanks - could you check under Settings - KV and see whether you have multiple KVs created? One example:

[Screenshot 2024-01-30 at 11:01:06 AM]

Basic plan users have a quota of one KV only, so if you remove the old KV it should work. We'll create an internal issue to expose the quota in our interface. cc @vthinkxie FYI

1378056564 commented 5 months ago

Excuse me, will you support Chinese later? I've now deployed it locally with Bing as the search backend, and it doesn't seem to support Chinese, so I'm wondering how I can make it smarter and more Chinese-friendly. Also, my deployment does not seem to respond as fast as yours.

Yangqing commented 5 months ago

Ah, actually supporting Chinese is relatively easy. Our demo page supports it right now. I haven't had a chance to clean up the code, but here is the gist: you use language detection to detect the input language:

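from pycld2 import detect
# detect() returns (is_reliable, text_bytes_found, details); each entry in
# details is (language_name, language_code, percent, score).
_, _, details = detect(query)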

And then you can attach a poor man's version of language instruction to the prompt:

language_affix = {
    "de": "\nWichtig: Bitte antworten Sie auf Deutsch.",
    "en": "",
    "es": "\nImportante: responda en español.",
    "fr": "\nImportant : merci de répondre en français.",
    "hi": "\nमहत्वपूर्ण: कृपया हिंदी में जवाब दें।",
    "ja": "\n重要: 日本語で回答してください。",
    "zh": "\n重要: 请用简体中文作答。",
    "zh-Hant": "\n重要: 請用繁體中文作答。",
}

And it turns out that mixtral is pretty good at following these instructions and answering in the requested language. It's not perfect but works reasonably well.
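
For readers wiring the two pieces together, here is a minimal sketch of one way the detection result could select an affix. It is not the demo's actual code: the helper names `pick_language_code`, `add_language_instruction`, and `detect_details` are hypothetical, and falling back to English when no detected language has an affix is an assumption. It relies on pycld2's `details` entries having the form (language name, language code, percent, score).

```python
# Affixes copied from the comment above; "en" needs no extra instruction.
LANGUAGE_AFFIX = {
    "de": "\nWichtig: Bitte antworten Sie auf Deutsch.",
    "en": "",
    "es": "\nImportante: responda en español.",
    "fr": "\nImportant : merci de répondre en français.",
    "hi": "\nमहत्वपूर्ण: कृपया हिंदी में जवाब दें।",
    "ja": "\n重要: 日本語で回答してください。",
    "zh": "\n重要: 请用简体中文作答。",
    "zh-Hant": "\n重要: 請用繁體中文作答。",
}


def pick_language_code(details, default="en"):
    """Return the first detected code that has an affix, else the default.

    `details` is the third value returned by pycld2.detect(): a sequence of
    (language_name, language_code, percent, score) tuples, best guess first.
    Falling back to English is an assumption of this sketch.
    """
    for _name, code, _percent, _score in details:
        if code in LANGUAGE_AFFIX:
            return code
    return default


def add_language_instruction(prompt, details):
    """Append the language instruction for the detected language to a prompt."""
    return prompt + LANGUAGE_AFFIX[pick_language_code(details)]


def detect_details(query):
    """Run pycld2 on the query and return only the per-language details."""
    # Imported lazily so the helpers above work without pycld2 installed.
    from pycld2 import detect

    _, _, details = detect(query)
    return details
```

With pycld2 installed, `add_language_instruction(system_prompt, detect_details(query))` would produce the prompt to send to the model, where `system_prompt` stands for whatever prompt the app already uses.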