EricLBuehler / mistral.rs
Blazingly fast LLM inference.
MIT License · 1.69k stars · 143 forks
Issues
Sorted by: Newest
#369 Prepare to accept multiple model types · EricLBuehler · closed 1 day ago · 1 comment
#368 Bump version to 0.1.12 · EricLBuehler · closed 1 day ago · 1 comment
#367 Clamp n device layers to n model layers · EricLBuehler · closed 1 day ago · 1 comment
#366 Store and load prefix cache on disk · EricLBuehler · opened 2 days ago · 1 comment
#365 Speed in --interactive mode · fohvok · opened 2 days ago · 0 comments
#364 refactor: DRY `varbuilder_utils.rs` · polarathene · opened 2 days ago · 3 comments
#363 Allow default unigram unk token for GGUF · EricLBuehler · closed 2 days ago · 1 comment
#362 Fix unauth check · EricLBuehler · closed 3 days ago · 1 comment
#361 fix: Ensure committed files are normalized to LF · polarathene · closed 3 days ago · 1 comment
#360 Fix no auth token for local loading · EricLBuehler · closed 3 days ago · 1 comment
#359 Disable cublaslt if using f16 kernels · EricLBuehler · closed 2 days ago · 1 comment
#358 remove tracing_subscriber.init() from Loader · Jeadie · opened 4 days ago · 1 comment
#357 Add an example · EricLBuehler · closed 4 days ago · 1 comment
#356 refactor: GGUF + GGML Loaders with `ModelKind` · polarathene · closed 2 days ago · 7 comments
#355 Support GGUF Mixtral format where experts are in one tensor · EricLBuehler · opened 5 days ago · 1 comment
#354 Refactor `deserialize_chat_template` · Jeadie · closed 5 days ago · 2 comments
#353 Add a verbose mode · EricLBuehler · closed 5 days ago · 1 comment
#352 dolphin-2.9-mixtral-8x22b.Q8_0.gguf "Error: cannot find tensor info for blk.0.ffn_gate.0.weight"? · psyv282j9d · opened 5 days ago · 7 comments
#351 Implement the Phi 3 vision model · EricLBuehler · opened 5 days ago · 1 comment
#350 Allow subsets of sequences in prefix cacher · EricLBuehler · opened 6 days ago · 3 comments
#349 Propogating Regex init error · gregszumel · closed 5 days ago · 2 comments
#348 Expose some APIs on the Rust side · EricLBuehler · closed 6 days ago · 1 comment
#347 Enabling prefix cache for llama3 gguf · joshpopelka20gmail · opened 6 days ago · 4 comments
#346 Set device to cpu if loading isq · EricLBuehler · closed 1 week ago · 1 comment
#345 Add support for using GGUF tokenizer · EricLBuehler · closed 4 days ago · 6 comments
#344 Insitu quantization OOM for large models · nidhoggr-nil · opened 1 week ago · 1 comment
#343 Update dependencies · EricLBuehler · closed 1 week ago · 1 comment
#342 Python mistralrs-cuda not running on GPU · joshpopelka20 · closed 1 week ago · 3 comments
#341 Implement cache shifting for Llama models · EricLBuehler · closed 1 week ago · 1 comment
#340 Fix mistral model repeat kv · EricLBuehler · closed 1 week ago · 1 comment
#339 Garbled output on very long prompts · LLukas22 · opened 1 week ago · 3 comments
#338 Refactor layers.rs · EricLBuehler · closed 1 week ago · 1 comment
#337 Add a C FFI · EricLBuehler · opened 1 week ago · 3 comments
#336 Use different RoPE impl for bs=1 · EricLBuehler · closed 1 week ago · 2 comments
#335 refactor: `ModelKind` with `strum` + `derive_more` · polarathene · closed 1 week ago · 12 comments
#334 chore: Use `strum` to simplify `GGUFArchitecture` maintenance · polarathene · closed 1 week ago · 8 comments
#333 Remove candle-layer-norm dep · EricLBuehler · closed 1 week ago · 2 comments
#332 Fixes and verbosity improvements for device mapping · EricLBuehler · closed 2 weeks ago · 1 comment
#331 chore: `SimpleModelPaths` should be renamed to `LocalModelPaths` · polarathene · closed 2 weeks ago · 2 comments
#330 Benching local GGUF model layers allocated to vRAM but no GPU activity · polarathene · closed 2 days ago · 3 comments
#329 bug: If device layers requested exceed model layers, host layers overflow · polarathene · opened 2 weeks ago · 11 comments
#328 chore: Simplify `utils/token.rs:get_token()` · polarathene · closed 1 week ago · 2 comments
#327 Improve chat templates docs · EricLBuehler · closed 2 weeks ago · 1 comment
#326 Running model from a GGUF file, only · MoonRide303 · opened 2 weeks ago · 31 comments
#325 Use cuBLASlt in attention · EricLBuehler · closed 2 weeks ago · 2 comments
#324 Implement Nomic Text Embed · EricLBuehler · opened 2 weeks ago · 1 comment
#323 Mistral rs python binding error · shresht8 · opened 2 weeks ago · 6 comments
#322 Use PromptTemplate for custom HuggingFace model · joshpopelka20 · closed 1 week ago · 3 comments
#320 Don't force QLlama to have >2 input dims @Jeadie · Jeadie · closed 2 weeks ago · 4 comments
#319 New `Unexpected rank, expected 3, got: 2` · Jeadie · closed 2 weeks ago · 3 comments