-
Running inference on the same image with multiple questions from the command line is not very convenient; a web UI would make this easier. I made a simple Gradio web UI demo for this, as follows:
```
impor…
```
-
I have an ollama container running the stable-code:3b-code-q4_0 model. I'm able to interact with the model via curl:
`curl -d '{"model":"stable-code:3b-code-q4_0", "prompt": "c++"}' https://notarea…
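For reference, the same request can be made from Python with only the standard library. The sketch below mirrors the JSON body passed to curl's `-d` flag and posts it to Ollama's standard `/api/generate` route; the `localhost:11434` host is a placeholder for wherever the container is reachable.

```python
import json
import urllib.request

def build_generate_payload(model, prompt, stream=False):
    # Same JSON body as the curl -d argument; stream=False asks Ollama
    # for a single JSON response instead of streamed chunks.
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode("utf-8")

def generate(host, model, prompt):
    # POST the payload to Ollama's /api/generate endpoint.
    req = urllib.request.Request(
        url=f"{host}/api/generate",
        data=build_generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(generate("http://localhost:11434", "stable-code:3b-code-q4_0", "c++"))
```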
-
### Describe the issue
I'm encountering a "memory access out of bounds" error when attempting to run inference using onnxruntime-web within a custom npm package. The inference process works flawles…
-
This is a placeholder for the task that will enable use of Intel GPUs in Triton via OpenVINO.
-
### Problem
The application currently relies on a server endpoint for spoken-language to SignWriting and SignWriting to spoken-language text-to-text translation.
This prevents us from performing…
-
@LukeForeverYoung Hey! Thanks for sharing this amazing work!
Are the model weights and inference code available?
I would be happy to test them locally.
-
When I try to load and run the ONNX model, I get the following error message. I ran the code from https://github.com/urchade/GLiNER/blob/main/examples/convert_to_onnx.ipynb to save as onnx mod…
-
Looking at MinerU's underlying code, each PDF page appears to be an independent processing unit, handled sequentially in a simple for-loop; there is no step that stitches blocks together across pages.
Would you consider adding a parallel-processing mechanism in the future: after splitting the document into pages, process different page objects concurrently according to available resources, then reassemble the results by page_index?
In theory this is feasible, but looking at the model loading and invocation logic, coroutines, multithreading, and multiprocessing all run into problems when calling the paddle models.
…
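The scheduling-and-reassembly structure proposed above can be sketched as follows. `process_page` is a hypothetical stand-in for MinerU's per-page pipeline; as noted above, the real paddle model calls are problematic under threads or processes, so this only illustrates the dispatch and page_index reassembly, not a working integration.

```python
from concurrent.futures import ThreadPoolExecutor

def process_page(page):
    # Hypothetical stand-in for MinerU's per-page pipeline; the real code
    # would run layout analysis and OCR models on the page image.
    page_index, text = page
    return page_index, text.upper()

def process_document(pages, max_workers=4):
    # Dispatch pages to a worker pool, then reassemble the results by
    # page_index so the output order matches the original document.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(process_page, pages))
    return [text for _, text in sorted(results)]

if __name__ == "__main__":
    pages = [(1, "page one"), (0, "page zero"), (2, "page two")]
    print(process_document(pages))
```

A process pool (or one model instance per worker) could be swapped in the same way, which is where the model-loading issues mentioned above would have to be solved.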
-
### Voice Changer Version
MMVCServerSIO_win_onxxgpu-cuda_v.1.5.3.17b
### Operational System
Windows 11 Home 64-bit (10.0, Build 22621)
### GPU
NVIDIA GeForce RTX 2050
### Read carefully and chec…
-
- [x] Use `llama_decode` instead of deprecated `llama_eval` in `Llama` class
- [ ] Implement batched inference support for `generate` and `create_completion` methods in `Llama` class
- [ ] Add suppo…