-
I am trying to prune with:
```
python main.py \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --prune_method wanda \
  --sparsity_ratio 0.5 \
  --sparsity_type unstructured \
  --save o…
```
-
## Description
Trying to use Cloudflare Workers AI models other than `cf/mistral/mistral-7b-instruct-v0.1`.
## Steps to reproduce
For example, use `cf/meta/llama-3-8b-instruct` as `model_name`…
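For reference, a minimal sketch of how `model_name` slots into a Workers AI REST call; the account ID, token, and helper name here are placeholders for illustration, not taken from the report:

```python
import json

# Hypothetical credentials for illustration only.
ACCOUNT_ID = "your-account-id"
API_TOKEN = "your-api-token"

def build_run_request(model_name: str, prompt: str):
    """Build the URL, JSON body, and headers for a Workers AI
    text-generation call. Assumes the standard REST endpoint shape
    /client/v4/accounts/{account}/ai/run/{model}; changing
    model_name should be the only edit needed to target a
    different model.
    """
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{ACCOUNT_ID}/ai/run/{model_name}"
    )
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    headers = {"Authorization": f"Bearer {API_TOKEN}"}
    return url, body, headers

url, body, headers = build_run_request("@cf/meta/llama-3-8b-instruct", "Hi")
```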
-
After running the default flow on Mistral in vLLM, there is a large (>100 MB) report JSON in the directory where I ran the commands. This seems quite heavyweight, especially for a JSON file.
Instead, I …
mgoin updated 3 weeks ago
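One lighter-weight option, sketched here with Python's standard library rather than vLLM's actual report writer (the filename is hypothetical), is to gzip the report on write; repetitive report JSON typically compresses very well:

```python
import gzip
import json
import os
import tempfile

def write_report(report: dict, path: str) -> None:
    """Serialize the report as JSON, gzip-compressed on the way out."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(report, f)

def read_report(path: str) -> dict:
    """Load a gzip-compressed JSON report back into a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)

# Round-trip demo with a hypothetical filename.
tmp = os.path.join(tempfile.mkdtemp(), "report.json.gz")
write_report({"metrics": [0.5] * 1000}, tmp)
```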
-
Specs: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format
As you can see, the specification is rigidly defined, eliminating the neces…
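Per that spec, an event stream is a series of `field: value` lines, with a blank line dispatching each accumulated event. A minimal parser sketch (illustrative, not from any particular library):

```python
def parse_sse(stream):
    """Yield (event, data) pairs from an iterable of text lines,
    following the event-stream format: `data:` lines accumulate
    (joined with newlines), `event:` sets the event type, a blank
    line dispatches the event, and lines starting with `:` are
    comments. Per the spec, one leading space after the colon is
    stripped from the field value.
    """
    event, data = "message", []
    for line in stream:
        line = line.rstrip("\n")
        if not line:  # blank line: dispatch the accumulated event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # comment line, ignored
            continue
        elif line.startswith("data:"):
            value = line[5:]
            data.append(value[1:] if value.startswith(" ") else value)
        elif line.startswith("event:"):
            value = line[6:]
            event = value[1:] if value.startswith(" ") else value

events = list(parse_sse(["event: ping\n", "data: {}\n", "\n",
                         "data: hello\n", "data: world\n", "\n"]))
```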
-
**Problem:** The following fails when using models & base_models in the safetensors format
```
model = "Mistral-7B-v0.3"
base_model = "Mistral-7B-Instruct-v0.3"
watcher = ww.WeightWatcher(mode…
```
-
I changed the model by modifying the GitHub code: Llama 3.1 was substituted in, written in the format the Cloudflare models and the project's models use. After redeploying, the new model cannot be used.
-
### Feature request
Support the recent, larger embedding models with 7B or more parameters (20x the size of BERT-large).
### Motivation
Embedding models have been getting much larger than before in the pas…
ai-jz updated
5 months ago
-
In reviewing the updated `docs` I noticed a few things that prompted some questions...
1) None of AWQ, Int-4, or `int32_float16` is mentioned in the "Quantize on model conversion" nor "Quantize…
-
**Describe the bug**
Hi, all. I'm working on a blog article, following a mix of the local documentation and the Intelligent app workshop, but instead of going with Falcon, I've gone with the Mistral 7B model, and at …
-
I tried #26, and the GGUF model type didn't get picked up by `llm` until I registered a model with `llm llama-cpp add-model`. I'm not sure whether this is working as intended: I expected that GGUF would appear …