-
```
raise NotImplementedError("flash attention is not installed")
NotImplementedError: flash attention is not installed
2024-09-06T05:28:11.268655Z ERROR shard-manager: lorax_launch…
```
-
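Hello, I am trying to fine-tune Mixtral 8x7B, but after training for a while the loss stops decreasing, and the output also has some problems.
The config I am using is as follows: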
```python
# Copyright (c) OpenMMLab. All rights reserved.
import torch
from datasets import load_dataset
from mmengine.dataset im…
```
-
Thank you for the great repo.
Are there any plans to update the code for the LLaMA model? Or is there anything I can do to update the code to visualize a LLaMA model?
-
I use `train_with_template.py` with `mistralai/Mistral-7B-Instruct-v0.2`
```
torchrun --nproc_per_node=2 --master_port=20001 fastchat/train/train_with_template.py \
--model_name_or_path mistr…
```
-
I am trying to use ChatMistralAI for streaming, following the instructions at https://js.langchain.com/docs/integrations/chat/mistral, but I get this error:
```
TypeError: Failed to execute 'decode' on 'TextDecoder…
```
-
```python
from dotenv import load_dotenv
import torch
from mistralai.client import MistralClient
print(torch.tensor([1, 2, 3]))
load_dotenv()
mistral = MistralClient()
print("hello world"…
```
-
Hi,
Great work on this! Is Mistral supported? Right now I only see GPT-J and Llama 2.
Thank you!
-
Claude is a strong language model that many users would like to use in their applications. If the SK team thinks this would be valuable for semantic-kernel, I'd like to help…
-
Error:
```
[1707364943] update_slots : failed to find free space in the KV cache, retrying with smaller n_batch = 256
[1707364943] update_slots : failed to find free space in the KV cache, retryi…
```
-
[LongLora](https://arxiv.org/abs/2309.12307) is "an efficient fine-tuning approach that extends the context sizes of pre-trained large language models". They propose to fine-tune a model with a sparse…