-
### OS
Linux
### GPU Library
CUDA 12.x
### Python version
3.11
### Describe the bug
When running exllamav2's inference_speculative.py example with Llama 3.1 8B 2.25bpw as the draft model and 70B 4.5bpw a…
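The report above concerns speculative decoding, where a small draft model proposes tokens and the large model verifies them. As a toy illustration of the accept/reject loop (a sketch only — `draft_next` and `target_accepts` are hypothetical stand-ins for the real model calls in inference_speculative.py):

```python
def speculative_step(draft_next, target_accepts, k=4):
    """Draft model proposes k tokens; the target model verifies them
    and keeps the longest agreeing prefix (simplified acceptance rule)."""
    proposed = [draft_next() for _ in range(k)]
    accepted = []
    for tok in proposed:
        if target_accepts(tok):
            accepted.append(tok)
        else:
            break  # first rejection ends the speculated run
    return accepted

# Deterministic demo: the "target" accepts even tokens only.
toks = iter([2, 4, 7, 8])
out = speculative_step(lambda: next(toks), lambda t: t % 2 == 0)
# out -> [2, 4]
```

Real implementations resample at the first rejected position rather than simply stopping, but the prefix-acceptance shape is the same.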
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…
```
-
UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device("cpu"),
Models: ['llavamed']
-
If you could test the following on ZebraLogic, that would be great (from the Reddit LocalLLaMA community):
1) Wizard 8x22b
2) Mixtral 8x22b
3) Mixtral 8x7b
4) Command-r-plus
5) Mistral Nemo 12B
6) Lla…
jd-3d updated 3 months ago
-
**Describe the bug**
Every time I try to open a URL it fails to do so; I have copied the code exactly from the regular expressions into the regex.
**Expected behavior**
I am assuming that it's suppose…
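Since the report doesn't show the pattern itself, here is a minimal sketch of matching a URL with Python's `re` module. The pattern is an assumption, not the reporter's actual regex; a common failure mode when copying patterns between tools is losing backslashes because the string is not written as a raw string:

```python
import re

# Illustrative pattern only: scheme, then any run of non-whitespace.
# Using a raw string (r"...") preserves the \s escape verbatim.
URL_RE = re.compile(r"https?://[^\s]+")

text = "see https://example.com/docs for details"
match = URL_RE.search(text)
# match.group(0) -> "https://example.com/docs"
```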
-
I see people are trying to extract the Mistral-22b ancestor from the MoE model by averaging the MLP layers and wondered if the 'model stock' method in Mergekit could be inverted:
- Use the averaged…
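The MLP-averaging step mentioned above can be sketched in a few lines. This is a toy NumPy version under stated assumptions — the expert count and weight shapes are illustrative, not Mixtral's real dimensions, and a real merge would iterate over every MoE layer's per-expert projection matrices:

```python
import numpy as np

def average_experts(expert_weights):
    """Average the corresponding MLP weight matrices across experts.

    expert_weights: list of (d_ff, d_model) arrays, one per expert.
    Returns a single (d_ff, d_model) array approximating an 'ancestor' MLP.
    """
    return np.mean(np.stack(expert_weights), axis=0)

# 8 toy experts whose weights are constant arrays 0.0 .. 7.0
experts = [np.full((4, 3), float(i)) for i in range(8)]
merged = average_experts(experts)
# every entry of merged is mean(0..7) = 3.5
```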
-
### Feature request
Removing the line `logits = logits.float()` in most `ModelForCausalLM` classes. This would save a lot of memory for models with a large vocabulary size. This allows dividing the…
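A back-of-envelope calculation shows why the upcast matters. Assuming a hypothetical 128k-token vocabulary and a 4k-token sequence at batch size 1 (illustrative numbers, not any specific model's):

```python
# Logits tensor has shape (seq, vocab); each element costs 4 bytes in
# float32 (after .float()) versus 2 bytes in float16/bfloat16.
vocab, seq = 128_000, 4096
fp32 = vocab * seq * 4  # bytes with the upcast
fp16 = vocab * seq * 2  # bytes without it
print(fp32 / 2**30, fp16 / 2**30)  # roughly 1.95 GiB vs 0.98 GiB
```

Keeping logits in half precision halves this allocation, which is exactly the saving the request targets.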
-
rem should index all text via an embedding store.
We could use something like https://github.com/asg017/sqlite-vss
If we go this route we should fork / open a PR to add the extension https://github…
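Before wiring up the sqlite-vss extension, the retrieval idea can be validated with a brute-force stand-in. This sketch assumes embeddings are already computed; the linear cosine-similarity scan below is what an indexed virtual table would replace:

```python
import numpy as np

def top_k(query, store, k=2):
    """Return indices of the k store rows most cosine-similar to query."""
    store_n = store / np.linalg.norm(store, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = store_n @ q          # cosine similarity per stored embedding
    return np.argsort(-sims)[:k]

# Toy 2-d "embedding store" with three documents.
store = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.7, 0.7]])
ids = top_k(np.array([1.0, 0.1]), store)
# ids -> [0, 2]: nearest doc first, orthogonal doc excluded
```

Swapping this for sqlite-vss changes the storage and lookup mechanics, not the ranking semantics.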
-
Hi, I am running some tests with DPOTrainer to see how it works, but I have encountered some problems during the inference phase of the generated model. In detail, this is the pipeline of operations I…
-
Code that worked yesterday quit working today. This could be an inference endpoint change for mistral-large-latest, but I figured I would list it here as well, as it is probably easier (as in…