-
### Feature request
I would like to add L1/L2 regularization to transformer training.
### Motivation
Adding L1/L2 regularization can promote sparser models, which can accelerate inference and reduce storage requirements.
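For illustration, a minimal sketch of the kind of change this would involve, assuming a plain PyTorch training loop; the helper name and the `l1_lambda`/`l2_lambda` weights below are hypothetical placeholders, not an existing API:
```python
import torch

def add_l1_l2_penalty(loss, model, l1_lambda=1e-5, l2_lambda=1e-4):
    # Hypothetical helper: adds L1 and L2 penalties over all trainable
    # parameters to the task loss before backprop.
    l1 = sum(p.abs().sum() for p in model.parameters() if p.requires_grad)
    l2 = sum(p.pow(2).sum() for p in model.parameters() if p.requires_grad)
    return loss + l1_lambda * l1 + l2_lambda * l2

# Usage inside the training step (names are placeholders):
# loss = add_l1_l2_penalty(criterion(logits, labels), model)
```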
###…
-
Hi! When I try to run your demo in the PiA part, I get an error at the 'instruction tuning' step:
```
root@0de6f5c3da0f:/workspace/zt/code/Sequence-Scheduling# bash train.sh
[2024-10-02 22:24:40,711] …
-
I have a large dataset that was encoded with CLIP 32, but I cannot deploy it in ES because it uses the ClipTokenizer.
Is there a way to add support for this?
-
### Describe the bug
When using 8-bit quantization with the LLM pipeline on a multi-GPU setup, it mostly runs fine.
After some random number of requests, however, the pipeline starts failing an…
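For context, a rough sketch of the kind of setup described above, assuming the Hugging Face `transformers` pipeline with `bitsandbytes` 8-bit loading sharded across GPUs via `device_map="auto"`; the model name and prompt are placeholders:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder model

# Load the weights in 8-bit and shard them across all visible GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("Hello", max_new_tokens=20)[0]["generated_text"])
```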
-
### System Info
When using DeepSpeed, the RLOOTrainer reports an error: "ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any function…
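For reference, the initialization the error message refers to is just instantiating an `Accelerator` before any of its utilities are called; a minimal sketch (this does not reflect the RLOOTrainer/DeepSpeed internals):
```python
from accelerate import Accelerator

# The error asks for an Accelerator instance to exist before any of its
# functions (e.g. prepare, gather) are used.
accelerator = Accelerator()
```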
-
Here's the error:
```
python functioncall.py --query "I need the current stock price of Tesla (TSLA)"
…
-
I started an `inf2.48xlarge` EC2 instance, then pulled and entered the [TGI-Neuron DLC with optimum-neuron 0.0.17 installed](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-hf-tgi-0.0.17-pt-1.13.1-inf-n…
-
Hello developers,
I'm trying to use xDiT (version 3.3) comfyui-xdit on 2 servers with 4 NVIDIA 3090 GPUs. I use the command below to start the service:
```
torchrun --nproc_per_node=2 --nnodes=…
-
## **First good issue**
There have been quite a few issues/questions about how to use the Encoder-Decoder model, e.g.: https://github.com/huggingface/transformers/issues/4483 and https://github.com/h…
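For anyone landing here first, a minimal sketch of assembling a warm-started encoder-decoder from two pretrained checkpoints (the checkpoint names and input text are just examples):
```python
from transformers import BertTokenizer, EncoderDecoderModel

# Warm-start a seq2seq model from two pretrained BERT checkpoints (example checkpoints).
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# generate() needs these ids set explicitly for EncoderDecoderModel.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer("This is a test input.", return_tensors="pt")
output_ids = model.generate(inputs.input_ids, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```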
-
I'm running on Windows 10; my FLUX and many other AI repos work flawlessly, even the most error-prone ones like Tortoise TTS, but I can't fix an error while running FLUXGYM. The AI captions generate…