-
https://www.ai21.com/blog/announcing-jamba
http://qwenlm.github.io/blog/qwen-moe/
https://x.ai/blog/grok-1.5
https://openai.com/blog/navigating-the-challenges-and-opportunities-of-synthetic-voices
…
-
### Describe the issue
I'm interested in your LongLLMLingua results on LongBench.
I reproduced the LongBench BM25 setting with the 2,000-token constraint using ChatGPT.
Unlike your paper's results, the performance …
-
Thanks a lot for your work on compression for LLMs. I'm looking forward to the code for ChatGLM. When will it be available for GLMs?
-
### Describe the bug
Throughout the code, CompressibleAgent assumes the model in use is llm_config["model"]. However, this is almost always wrong. Typically, the model copied from the config_list bef…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
### Willingness to contribute
Yes. I would be willing to contribute this feature with guidance from the MLflow community.
### Proposal Summary
At the moment, using MLServer autologging for Langchai…
-
python pilot/server/llmserver.py
playsound is relying on another python subprocess. Please use `pip install pygobject` if you want playsound to run more efficiently.
localhost:19530
None
None
db…
-
**Expected Outcomes**
- Prompt: Summarize the content from the url (do not emit the url back) https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/managing_file_systems/ind…
-
Will it still be able to summarize, or answer questions about, important events in a book?
-
### Is your feature request related to a problem? Please describe.
I would like to request the implementation of a compressed tiled matrix multiply operator for use in large language model inferenc…