-
### Context
We see that the basic chatglm output contains repetitive generated words.
![image](https://github.com/openvinotoolkit/openvino.genai/assets/98947310/f237a08b-a4db-4c6d-80f8-83abb762ad89)
###…
-
Some parts of the llama model divide by input values, so providing empty inputs can crash the program or produce NaNs. We should generate real input values from tokenized prompts for testing and bench…
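
The failure mode described above can be sketched in a few lines. This is a toy illustration, not the actual llama code: `rms_normalize` is a hypothetical RMS-style normalization written without an epsilon term, and `toy_tokenize` is a hypothetical stand-in for a real tokenizer. In plain Python the zero division raises an exception; in array libraries that follow IEEE floating-point rules, the same `0.0 / 0.0` typically yields NaN instead.

```python
import math


def rms_normalize(values):
    """Toy RMS-style normalization with no epsilon term (assumption for illustration)."""
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return [v / rms for v in values]


def toy_tokenize(prompt):
    """Hypothetical stand-in tokenizer: one nonzero float id per whitespace-separated word."""
    return [float(len(word) + 1) for word in prompt.split()]


def is_safe_input(values):
    """Return True if the normalization runs without dividing by zero."""
    try:
        rms_normalize(values)
        return True
    except ZeroDivisionError:
        return False


# Empty or all-zero inputs hit a division by zero.
print(is_safe_input([]))            # False
print(is_safe_input([0.0, 0.0]))    # False

# Inputs derived from a tokenized prompt are nonzero, so the division is well defined.
print(is_safe_input(toy_tokenize("hello world")))  # True
```

Under these assumptions, inputs built from a tokenized prompt exercise the same code path without triggering the degenerate case.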
-
**Is your feature request related to a problem? Please describe.**
When generating a complex schema, sometimes I want to interleave generation, templating, and tool use. One way to support this would…
-
Firstly, thanks for the awesome work in putting this all together in record time! One burning question for me is how the current structure of the agents, which look to be stateless, stacks up against a …
-
### 🚀 The feature, motivation and pitch
Starting from iOS 18, Core ML has state, which is the counterpart of a mutable buffer. As a result, ExecuTorch can now let Core ML handle buffer mutation.
##…
-
### Ticket Contents
iGOT Karmayogi has a course or micro-learning content that beautifully explains the concept. However, the user is unaware of the course content or how to search for a course or a c…
-
[High-Level Architecture Overview](https://docs.google.com/drawings/d/1TxRxJ7Amy449dvnFgOrnqofX3SfkLD3qqqKXEAfRHsE/edit?usp=sharing)
![LLM-VM Architecture](https://github.com/anarchy-ai/LLM-VM/asse…
-
### Describe the bug
Crashes when using `-m ollama/LLaVA -lsv`; works fine without the `-lsv` parameter.
### Reproduce
Run `interpreter -m ollama/llava -lsv`
Ask for a visual description
### Expected beha…
-
### Proposal to improve performance
Currently, `LLMEngine` (the driver) lives in the same process as the tensor parallel rank 0 process, which has caused a lot of trouble for us, e.g. we cannot easily create two in…
-
The current code execution logic has an issue when code is generated in several blocks and later blocks depend on earlier ones.
For example:
> User_Proxy (to Assistant_Agent):
>
> Write a pyth…
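
The dependency problem between code blocks can be illustrated with a minimal sketch. It is not the project's actual executor: `run_isolated` and `run_blocks` are hypothetical helpers, and the two block strings are made up for illustration. The sketch contrasts executing each block in a fresh namespace, where a later block cannot see earlier definitions, with executing all blocks in one shared namespace.

```python
def run_isolated(blocks):
    """Execute each block in its own fresh namespace (later blocks see nothing earlier)."""
    for block in blocks:
        exec(block, {})


def run_blocks(blocks):
    """Execute blocks in order, sharing a single namespace across all of them."""
    namespace = {}
    for block in blocks:
        exec(block, namespace)
    return namespace


# Hypothetical example: the second block depends on a function defined in the first.
blocks = [
    "def square(x):\n    return x * x",
    "result = square(7)",
]

try:
    run_isolated(blocks)
except NameError as err:
    print("isolated execution failed:", err)

shared = run_blocks(blocks)
print(shared["result"])  # 49
```

Sharing one namespace is only one possible design; concatenating the blocks into a single script before execution would be another way to preserve the dependency.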