-
### Is There an Existing Issue for This?
- [X] I have searched the existing issues
### Where do you intend to apply this feature?
Instill Core, Instill Cloud
### Is your Proposal Related to a Prob…
-
Feel free to reuse as much of these tutorials as possible, but this is also a good opportunity to review and rewrite them.
Some things to keep in mind:
- Start by identifying a real-world problem and/or datas…
-
### Is your feature request related to a problem? Please describe
Currently, our system assigns each model to a unique GPU device. While this approach ensures protection against out-of-memory (OOM) e…
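The one-model-per-GPU policy described above can be sketched as a simple assignment map. This is an illustrative sketch only; the function and model names are hypothetical and not taken from the actual system:

```python
# Hypothetical sketch of the current policy: each model gets a dedicated
# GPU index, which avoids cross-model OOM but caps the number of models
# at the number of devices.
def assign_models_to_gpus(model_names, num_gpus):
    """Map each model name to a unique GPU index; fail if GPUs run out."""
    if len(model_names) > num_gpus:
        raise RuntimeError(
            f"{len(model_names)} models but only {num_gpus} GPUs available"
        )
    return {name: gpu for gpu, name in enumerate(model_names)}

mapping = assign_models_to_gpus(["llama3-8b", "mistral-7b"], num_gpus=4)
# mapping == {"llama3-8b": 0, "mistral-7b": 1}
```

The sketch makes the trade-off concrete: the hard one-to-one mapping is what a shared-GPU scheme would have to relax.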
-
### System Info
- 1x H100
- Llama3 8B Instruct
- TensorRT-LLM v0.10.0
- tensorrtllm_backend v0.10.0
- tritonserver 24.06
### Who can help?
@kaiyux
### Information
- [X] The officia…
-
Consider the potential impact of adding a further layer to the retrieval method by integrating a vector store enriched with metadata about the data.
This enhancement could pro…
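A metadata-enriched retrieval layer of the kind proposed above might look like the following. This is a minimal sketch under assumed data shapes (a list of dicts with `vec`, `text`, and `meta` keys); none of these names come from the project itself:

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(store, query_vec, metadata_filter, top_k=2):
    """Filter the store by metadata first, then rank survivors by similarity."""
    candidates = [
        d for d in store
        if all(d["meta"].get(k) == v for k, v in metadata_filter.items())
    ]
    candidates.sort(key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [d["text"] for d in candidates[:top_k]]

store = [
    {"vec": [1.0, 0.0], "text": "a", "meta": {"source": "docs"}},
    {"vec": [0.0, 1.0], "text": "b", "meta": {"source": "web"}},
    {"vec": [0.9, 0.1], "text": "c", "meta": {"source": "docs"}},
]
retrieve(store, [1.0, 0.0], {"source": "docs"})
# → ["a", "c"]
```

Filtering before ranking is the key design choice: metadata narrows the candidate set cheaply, so the similarity search only runs over relevant documents.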
-
When the code file is long, it makes changes to the file and, in places, inserts comments like:
// Rest of the code remains the same
This essentially renders the code file useless.
-
This issue proposes integrating Google's Gemini large language model (LLM) into the Webwright project to provide advanced coding assistance capabilities.
Gemini is a state-of-the-art LLM trained on a…
-
I have identified some opportunities to improve the session management and overall functionality in the llm_tracker.py, client.py, and session.py files. These changes aim to enhance the robustness and…
-
Hello!
Thanks for sharing the details of your implementation. I'm wondering which LLaMA-Factory template you used for your fine-tuning: `alpaca`, `deepseek`, or maybe a custom one?
Also did you …
-
If a GPU is available on the user's machine, using it instead of the CPU to process GIF files would be a much more efficient and effective solution in terms of processing time.
…
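A common way to implement the GPU-with-CPU-fallback behaviour suggested above is to probe for an accelerator at startup and select the device once. This sketch assumes PyTorch as the backend, which may not be what the project uses:

```python
def pick_device():
    """Return "cuda" when a CUDA-capable GPU is usable, else "cpu".

    Wrapping the probe in try/except keeps the feature optional: machines
    without torch (or without a GPU) silently fall back to CPU processing.
    """
    try:
        import torch  # assumed backend; swap for the project's actual one
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"

device = pick_device()
# Downstream GIF-frame processing would then move its tensors to `device`.
```

Probing once and threading the result through the pipeline avoids repeated availability checks and keeps the CPU path identical to the current behaviour.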