-
https://github.com/ModelTC/Outlier_Suppression_Plus/blob/3ba97ae2dab0e6e5ead5da1795f50fd47025a49d/quant_transformer/model/quant_llama.py#L242 Hi, I noticed that there are two types of LN, i.e. pre-LN & pos…
-
I tried to reproduce the results of data2vec using the open source configuration, but the performance was rather poor. So I compared the parameters in the public model [audio_base_ls.pt](https://dl.fb…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
CUDA_VISIBLE_DEVICES=0,1 accelerate launch \
--config_file config.yaml \
src/train_bash.py \
…
-
I'm trying to train my own model on Windows (since kohya_ss wouldn't launch on Linux). It ended up launching on Windows, but every time I try to start training it gets stuck on "Command executed", be…
-
# Proposal: gain maps for PNG
**This proposal has no official standing in PNG WG and is presented for discussion only. Do not implement.**
## [3 Terms, definitions, and abbreviated terms](https:…
-
Hello, very excellent work! I have a small question. In your paper, under the section "EXPERIMENTS D Application to Video Compression," you mention, "At the second stage, all the components in Fig. …
-
## Typology of Efficient Training
- Data & Model Parallel
- Data Parallel
- Tensor Parallel
- Pipeline Parallel
- Zero Redundancy Optimizer (ZeRO) (DeepSpeed, often works with CPU offloadi…
-
In this issue you can either:
- **Add papers** that you think are interesting to read and discuss (please stick to the format).
- **Vote**: use :+1: on comments.
-
## Keyword: sgd
### Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability
- **Authors:** Haoyi Xiong, Xuhong Li, Boyang Yu, Zhanxing Zhu, Dongrui Wu, Dejin…
-
I am training with PPO Ray. Training proceeds normally for a number of steps, and then the following error occurs:
```
File "/tmp/ray/session_2024-05-24_17-35-31_318483_337945/runtime_resources/working_dir_files/_ray_pkg_d887115d5fd5f465/openrlhf/trainer/ray/ppo_actor.py", line …