-
### Is your feature request related to a problem?
https://replicate.com/nateraw/mixtral-8x7b-32kseqlen
### Describe the solution you'd like.
https://replicate.com/na…
-
### Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find an [open issue](ht…
-
### System Info
Using Docker server
```
model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data
docker run --gpus '"device=3"' --shm-size 1g -p 8080:80 -v $volume:/data \
-e HUGGING_…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…
-
The `quantized` example was compiled with `cargo build --example quantized -r --features metal`.
Unsure of how many layers are accelerated and how many threads are used; the sample stages are clearly different
..yet I pres…
-
Hello, I ran into this kind of issue while reproducing the results. Does it affect the results? Mistral-7B-Instruct-v0.2 and are newly initialized: ['model.layers.0.self_attn.ultragist_k_proj.weight', 'model.layers.0.self_attn.ultragist_o_proj.weight', 'model.layers.0.sel…
-
The model artifact from a tuning run is named `run_name-model`, which means that if you want to load a model for an eval (e.g. lm-harness) where the run will be updated with eval results, you have to write the …
-
The source code of the `Accelerate` library shows that `weights` in hooks is empty when the training task is launched via DeepSpeed.
https://github.com/huggingface/accelerate/blob/b8c85839531ded28efb77c32e0ad85…
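To make the report easier to check, here is a minimal sketch of a hook body that only inspects what it receives. It assumes the `(models, weights, output_dir)` argument order that Accelerate's save-state hooks are called with. The function name `summarize_save_hook_inputs` and the returned dictionary are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def summarize_save_hook_inputs(
    models: List[Any],
    weights: List[Dict[str, Any]],
    output_dir: str,
) -> Dict[str, Any]:
    """Hypothetical hook body: report what a save-state hook received."""
    summary = {
        "num_models": len(models),
        "num_weight_dicts": len(weights),
        "weights_empty": len(weights) == 0,
        "output_dir": output_dir,
    }
    if summary["weights_empty"]:
        # This is the situation described above: the hook gets models
        # but no state dicts, so there is nothing to filter or rename.
        print(f"[save hook] no state dicts passed for {len(models)} model(s)")
    return summary
```

In a real run this could be registered with `accelerator.register_save_state_pre_hook(summarize_save_hook_inputs)`. The return value is only there so the sketch can be inspected on its own.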
-
### Python Version
```shell
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
```
### Pip Freeze
```shell
absl-py==2.1.0
annotated-types==0.7.0
anyio==4.0.0
argon2-cffi==23.1.…