-
Following [the tutorial](https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md#step-3---what-sd-model-do-you-want-to-control) I can successfully download SD, add ControlNet, and train it.
…
-
**Describe the bug**
When the order of the computation parameters (FP16/BF16) in the buffer differs from the forward execution order of the model: As a result, when the `--overlap-param-gather…
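The failure mode can be sketched with a toy model (pure Python, illustrative only — not Megatron-LM code; the module and bucket names are made up):

```python
# Illustrative sketch: why a mismatch between the buffer's parameter order
# and the model's forward order breaks overlapped parameter gathering.

forward_order = ["embed", "layer0", "layer1", "head"]   # order modules execute
buffer_order = ["layer1", "layer0", "head", "embed"]    # order params sit in buffer

gathered = set()

def dispatch_next_gather(it):
    """All-gather the next bucket in *buffer* order (if any remain)."""
    try:
        gathered.add(next(it))
    except StopIteration:
        pass

pending = iter(buffer_order)
dispatch_next_gather(pending)  # prefetch the first bucket before forward starts

stale_reads = []
for module in forward_order:
    if module not in gathered:       # forward reached a module whose params
        stale_reads.append(module)   # have not been gathered yet -> wrong output
    dispatch_next_gather(pending)    # overlap: launch the next bucket

print(stale_reads)  # ['embed'] — the first module to run sits last in the buffer
```

Because the overlapped gather issues buckets in buffer order, any module that runs earlier in forward than its position in the buffer reads un-gathered parameters.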
-
### Expected Behavior
Normally, when I use CUDA via ZLUDA, the prompt should execute. I am using an AMD Radeon Vega 8 Graphics GPU with an AMD Ryzen 5 3500U CPU. It should happen normally... if i…
-
```python
try:
    import transformers
except ImportError:
    pass
from ctranslate2.specs import (
    transformer_spec,
)
from ctranslate2.converters.transformers import TransformersConver…
```
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports…
-
### 🐛 Describe the bug
I'm trying to finetune Llama2-7B (to reproduce the experiments in a paper) using PEFT LoRA (0.124% of trainable params). However, this results in an out-of-memory (OOM) error o…
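The 0.124% figure is consistent with a rank-8 LoRA on the attention projections; a quick back-of-the-envelope check (all numbers below are assumptions for illustration, not taken from the actual run):

```python
# Sanity check of the trainable-parameter fraction for a hypothetical LoRA
# setup: ~6.74B base params (approximate Llama2-7B size), rank r=8 adapters
# on the four attention projections (q, k, v, o) of 32 layers, hidden 4096.

base_params = 6_738_000_000          # assumed Llama2-7B parameter count
hidden, layers, r = 4096, 32, 8
proj_per_layer = 4                   # q, k, v, o projections

# each LoRA adapter adds two low-rank matrices: (hidden x r) + (r x hidden)
lora_params = layers * proj_per_layer * 2 * hidden * r

fraction = 100 * lora_params / base_params
print(f"{lora_params:,} trainable params = {fraction:.3f}% of the base model")
# -> 8,388,608 trainable params = 0.124% of the base model
```

Note that a small trainable fraction only shrinks the optimizer state; the frozen base weights and the activations still occupy GPU memory, which is why OOM can occur even with 0.124% trainable parameters.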
-
### System Info
- `transformers` version: 4.44.2
- Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.6
- Safetensors version: 0.…
-
**Notes**
- Location: `pipegoose.nn.parallel_mapping.ParallelMapping`
- `module` is an instance in `model.named_modules()`
- model is `AutoModelForCausalLM.from_pretrained()`, `torch.nn.Transformer…
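One way such a mapping is commonly implemented is by suffix-matching the dotted names yielded by `model.named_modules()`; a minimal sketch under that assumption (the mapping table and matching rule here are hypothetical, not the actual `pipegoose` implementation):

```python
# Hypothetical name-based parallel-style lookup for modules from
# model.named_modules(). Both the table and the suffix rule are assumptions.

MAPPING = {
    "attention.query": "column",
    "attention.dense": "row",
    "mlp.dense_h_to_4h": "column",
    "mlp.dense_4h_to_h": "row",
}

def lookup(module_name: str):
    """Return the parallel style for a dotted module name, or None."""
    for suffix, style in MAPPING.items():
        if module_name.endswith(suffix):
            return style
    return None

print(lookup("transformer.h.0.attention.query"))  # column
print(lookup("transformer.h.0.ln_1"))             # None
```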
-
Does the framework support multi-GPU training?
I want to use the framework to train a 70B model, but I could not find any parameter settings or methods for multi-GPU training.
-
By saving the model and reloading it, I managed to get the model working, both quantized and at full precision (it still uses at most 10 GB of GPU RAM).
However, the model generates random characters. He…