-
I'd like to load data from a CSV file, rather than a safetensors file, and use it as the initial values.
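As a starting point, the CSV side can be handled with plain std I/O before copying the values into a tensor (e.g. with tch's `Tensor::from_slice` in recent versions, then a reshape). The sketch below is only the parsing half and assumes one row per line of comma-separated f32 values; the `parse_csv` helper is hypothetical, not part of tch.

```rust
use std::io::{BufRead, Cursor};

/// Parse comma-separated f32 values, one row per line.
/// Returns the flat values plus the (rows, cols) shape, ready to be
/// copied into a tensor (integration with VarStore not shown here).
fn parse_csv<R: BufRead>(reader: R) -> Result<(Vec<f32>, (usize, usize)), Box<dyn std::error::Error>> {
    let mut values = Vec::new();
    let (mut rows, mut cols) = (0, 0);
    for line in reader.lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue; // skip blank lines
        }
        let row: Vec<f32> = line
            .split(',')
            .map(|s| s.trim().parse::<f32>())
            .collect::<Result<_, _>>()?;
        cols = row.len();
        rows += 1;
        values.extend(row);
    }
    Ok((values, (rows, cols)))
}

fn main() {
    let data = "1.0,2.0\n3.0,4.0\n";
    let (vals, shape) = parse_csv(Cursor::new(data)).unwrap();
    assert_eq!(shape, (2, 2));
    assert_eq!(vals, vec![1.0, 2.0, 3.0, 4.0]);
    println!("ok");
}
```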
VarStore
```rust
pub struct VarStore {
    pub variables_: Arc<Mutex<Variables>>,
    // …
```
-
### Check for existing issues
- [X] Completed
### Describe the bug / provide steps to reproduce it
Summary: Attempting to override the `default_model` does not apply when using the `openai` p…
-
gen_vulkan_shaders failed at
```
Command::new(vulkan_shaders_gen_bin)
.args([
"--glslc".as_ref(), "glslc".as_ref(),
"--input-dir".as_ref(), vulkan_shaders_src…
```
-
### System Info
TGI version: 2.2.0 (but I tested 2.3.0 too)
Machine: 8x H100 (640 GB GPU RAM)
```
2024-09-25T14:29:44.260160Z INFO text_generation_launcher: Runtime environment:
Target: x86_64-unkn…
```
-
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Dora start daemon: `dora up`
2. Start a new dataflow: `dora start …
-
Can't utilize GPU on Mac with
```
llama_cpp_rs = { git = "https://github.com/mdrokz/rust-llama.cpp", version = "0.3.0", features = [
"metal",
] }
```
Code
```
use llama_cpp_rs::{
opti…
```
-
Hi there,
First thank you for unsloth, it's great!
I've finetuned a llama-3-8b-Instruct-bnb-4bit and pushed it to hf hub. When I try to deploy it using [hf Inference Endpoints](https://huggingfa…
-
### Summary
There are various LLM inference libraries. WasmEdge already integrated llama.cpp, but we want to bring more to the community.
### Details
Already supported:
1. PyTorch
2. TFLi…
hydai updated 1 month ago
-
With the recent advent of large models (take Llama 3.1 405b, for example!), distributed inference support is a must! We currently support naive device mapping, which works by allowing a combination of…
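Naive device mapping of the kind described can be sketched as splitting a model's layers into contiguous chunks, one per device; this is an illustrative sketch under that assumption, not the project's actual API, and `map_layers` is a hypothetical name.

```rust
/// Assign each of `n_layers` transformer layers a device index in
/// 0..n_devices, as contiguous chunks; earlier devices absorb the
/// remainder when the layer count doesn't divide evenly.
fn map_layers(n_layers: usize, n_devices: usize) -> Vec<usize> {
    let base = n_layers / n_devices;
    let rem = n_layers % n_devices;
    let mut mapping = Vec::with_capacity(n_layers);
    for device in 0..n_devices {
        // first `rem` devices get one extra layer
        let count = base + if device < rem { 1 } else { 0 };
        for _ in 0..count {
            mapping.push(device);
        }
    }
    mapping
}

fn main() {
    // 10 layers over 4 devices: chunks of 3, 3, 2, 2
    assert_eq!(map_layers(10, 4), vec![0, 0, 0, 1, 1, 1, 2, 2, 3, 3]);
    println!("ok");
}
```

A real implementation would also have to weigh per-device memory and keep the embedding and head layers co-located where required, which is why purely even splits are called "naive".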
-
I'm trying to deploy Llama3 8b on GKE using optimum but am running into some trouble.
Following instructions here: https://github.com/huggingface/optimum-tpu/tree/main/text-generation-inference. I bu…