-
Hi, I'm trying to fine-tune the Llama 3.1 8B model. After fine-tuning it and uploading it to HF, when I try to run it using vLLM I get this error: "KeyError: 'base_model.model.model.layers.0.mlp.dow…
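For context, a KeyError on a `base_model.model.`-prefixed key is what you typically see when the raw LoRA adapter checkpoint (whose keys carry PEFT's wrapper prefix) is served as if it were a full model. One common fix is to merge the adapter into the base weights before uploading. The merge itself is just adding the scaled low-rank update, W + (alpha/r)·B·A; a minimal sketch with made-up 2x2 numbers (shapes and values are illustrative, not from the reporter's model):

```python
# Hypothetical 2x2 base weight W and rank-1 LoRA factors A, B. Merging
# folds the low-rank update into W, so the saved checkpoint contains no
# PEFT-prefixed keys such as "base_model.model.…".
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[0.5, -0.5]]          # lora_A: shape (r=1, d_in=2)
B = [[2.0], [4.0]]         # lora_B: shape (d_out=2, r=1)
alpha, r = 2, 1

# W_merged[i][j] = W[i][j] + (alpha / r) * sum_k B[i][k] * A[k][j]
W_merged = [
    [W[i][j] + (alpha / r) * sum(B[i][k] * A[k][j] for k in range(r))
     for j in range(2)]
    for i in range(2)
]
print(W_merged)  # → [[3.0, -2.0], [4.0, -3.0]]
```

With PEFT this corresponds to calling `merge_and_unload()` on the loaded adapter and uploading the result with `save_pretrained`, instead of pushing the adapter-only checkpoint.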
-
### Environment
🐧 Linux
### System
N/A
### Version
SillyTavern 1.12.5 'staging' (38d24f4b)
### Desktop Information
_No response_
### Describe the problem
Mistral's tokenizer is weird and we p…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
### Exp…
-
https://www.youtube.com/@AIsuperdomain
-
### Cortex version
0.5.0-68 (actually 67)
### Describe the Bug
From #1239:
When pulling a model from HF, I expect it to display the model's variant options and allow the user to select one.
Pull directly from H…
-
Hi. Raising this issue as I am experiencing much slower inference times with Gemma-1 models.
> Environment:
> - xformers 0.0.26.post1 pypi_0 pypi
> - unsloth …
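To make "much slower" concrete, one way is to time the same call on both setups with a small wall-clock harness; a backend-agnostic sketch (the function and its arguments are whatever your inference call is, e.g. a `generate` call — assumptions, not part of the report):

```python
import time

def mean_latency(fn, *args, warmup=1, iters=5):
    # Average wall-clock time over `iters` calls, after `warmup` untimed
    # calls to absorb one-off costs (compilation, cache warm-up, etc.).
    for _ in range(warmup):
        fn(*args)
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - t0) / iters
```

Running this once per environment (e.g. with and without xformers/unsloth installed) gives comparable numbers to attach to the report.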
-
### Which Cloudflare product(s) does this pertain to?
Wrangler
### What version(s) of the tool(s) are you using?
Wrangler 3.72.2
### What version of Node are you using?
16.15.1
### W…
-
I have been fine-tuning Mistral-7B-Instruct-v0.2 recently and I noticed that when I don't use SWA and train with a sequence length of 32K, the initial loss is unusually high (6.0). However, when I tra…
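For context on why SWA interacts with a 32K sequence length: sliding-window attention restricts each token to the last `w` positions, while plain causal attention lets it see everything before it. A minimal mask sketch (the 4096 window is the value from the original Mistral-7B config, assumed here for illustration):

```python
def sliding_window_row(i, n, w):
    # Positions token i may attend to under SWA with window w:
    # j in (i - w, i]. Plain causal attention instead allows all j <= i.
    return [j for j in range(n) if j <= i and j > i - w]

# Small example: with window 3, token 5 sees only positions 3..5.
assert sliding_window_row(5, 6, 3) == [3, 4, 5]

# At 32K context with a 4096 window, a late token attends to the last
# 4096 positions rather than all 32768.
assert len(sliding_window_row(32767, 32768, 4096)) == 4096
```

So disabling SWA at 32K changes how much context every position attends to, which is one plausible place to look when the loss curves differ between the two setups.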
-
If you are submitting a bug report, please fill in the following details and use the tag [bug].
**Describe the bug**
Gemma-2-{size} is not loadable using from_pretrained. I checked OFFICIAL_MODEL_…
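For illustration of the failure mode: loaders that gate on a registry of supported names fail with "not loadable" when a new model family is missing from the allowlist, rather than at weight-loading time. A hypothetical sketch (the set contents and function name here are assumptions, not the library's actual registry):

```python
# Hypothetical allowlist mirroring an OFFICIAL_MODEL-style registry;
# the entries below are made up for illustration.
OFFICIAL_MODELS = {"gemma-2b", "gemma-7b"}

def check_supported(name):
    # A from_pretrained-style loader gated on such a registry raises on
    # any family that has not yet been added, e.g. Gemma-2 sizes.
    if name not in OFFICIAL_MODELS:
        raise KeyError(f"{name!r} is not in the official model list")
    return name

check_supported("gemma-7b")        # passes
# check_supported("gemma-2-9b")    # would raise KeyError
```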
-
In order to apply LLM2Vec to DictaLM we need:
- [ ] Identify base model - https://huggingface.co/collections/dicta-il/dicta-lm-20-collection-661bbda397df671e4a430c27
- [ ] Prepare dataset for MNTP…
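For the MNTP step above, the data preparation amounts to masking random positions so the model learns to predict each masked token from its preceding context. A minimal sketch (the mask id, masking rate, and label convention of -100 for ignored positions are assumptions in the style of HF-format datasets, not LLM2Vec's exact recipe):

```python
import random

def mask_for_mntp(token_ids, mask_id, p=0.2, seed=0):
    # Replace each token with `mask_id` with probability p; record the
    # original id as the label there, and -100 (ignored) elsewhere.
    rng = random.Random(seed)
    inputs, labels = [], []
    for t in token_ids:
        if rng.random() < p:
            inputs.append(mask_id)
            labels.append(t)
        else:
            inputs.append(t)
            labels.append(-100)
    return inputs, labels

inputs, labels = mask_for_mntp([11, 12, 13, 14, 15], mask_id=0, p=0.5)
assert len(inputs) == len(labels) == 5
# Every unmasked position carries the ignore label.
assert all(l == -100 for i, l in zip(inputs, labels) if i != 0)
```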