-
### Describe the bug
I was trying to run the LoRA training script of PixArt-alpha: https://github.com/PixArt-alpha/PixArt-alpha/blob/master/train_scripts/train_pixart_lora_hf.py but got a RuntimeError…
-
I use custom data to train DINO, and the model seems to collapse after a few steps: the features become nearly uniform. I used a larger teacher temperature to enhance "sharpening", but the model still collapsed.…
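For context, the DINO objective fights collapse with two opposing mechanisms: *sharpening* (a low teacher temperature; in the reference setup the teacher temperature, around 0.04, is *lower* than the student's 0.1, so raising it actually makes targets softer, not sharper) and *centering* (subtracting an EMA of teacher outputs). A minimal NumPy sketch of that loss, with shapes and function names assumed for illustration rather than taken from the official code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dino_loss(student_out, teacher_out, center, tau_s=0.1, tau_t=0.04):
    """Cross-entropy between sharpened teacher targets and student predictions.

    student_out, teacher_out: (batch, dim) projection-head outputs.
    center: (dim,) running mean of teacher outputs.
    tau_t < tau_s: the LOW teacher temperature is what sharpens the targets.
    """
    # Teacher: center, then sharpen (no gradient flows through the teacher).
    t = softmax((teacher_out - center) / tau_t)
    # Student: log-probabilities at the higher student temperature.
    log_s = np.log(softmax(student_out / tau_s) + 1e-12)
    return float(-(t * log_s).sum(axis=-1).mean())

def update_center(center, teacher_out, momentum=0.9):
    # EMA of the teacher batch mean; subtracting it counteracts
    # collapse onto a single dominant dimension.
    return momentum * center + (1 - momentum) * teacher_out.mean(axis=0)
```

If the features are becoming uniform, it is worth checking that the center EMA is actually being updated each step and that `tau_t` has not been increased past `tau_s`, since sharpening only works with a small `tau_t`.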
-
Impressive and insightful work, hooray to the authors! I recently read your paper, but I'm confused about the following parts.
1. In the abstract, you discuss how memory-reduction approaches like LoR…
-
## Description
Relatively minor, but explicitly omitting `allow-same-origin` from the help widget iframe `sandbox` attribute in packages/help-extension breaks search pages on many reference documen…
-
Hi Lin, I managed to write a fine-tuning script; could you help me check it? I'm also confused about some details, listed below (and marked with NOTE in the code comments). Could you clarify them? …
-
### System Info
When I used P-Tuning v2 to fine-tune GLM, the loss decreased noticeably, but the actual inference output was very noisy. I then ran inference on the training data itself.
Exa…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
Following https://github.com/hiyouga/LLaMA-Factory/wiki/Performance-comparison, I used [llama 8B](https://hf-mirror.c…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
bin C:\Users\luoxiaojie\.conda\envs\pytorch212-lxj\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121…
-
I am experimenting with the MMA-hard model to replicate the WMT15 DE-EN experiments reported in the paper, and my question is about preprocessing and postprocessing the data. The paper says that:
> For each data…