Hi, thanks for the great StackLLaMA work! I ran the RL experiment but hit the following UserWarnings. Do they matter for RL model training?
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
0%| | 1/2615 [01:23<60:49:43, 83.77s/it]/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.18 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.07 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
0%| | 2/2615 [02:49<61:32:03, 84.78s/it]/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.51 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
0%| | 3/2615 [04:07<59:17:45, 81.72s/it]/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.50 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.15 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
0%| | 6/2615 [08:27<62:31:58, 86.29s/it]/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.08 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.09 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
0%| | 8/2615 [11:17<62:01:16, 85.64s/it]/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.11 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters.
warnings.warn(
0%| | 10/2615 [14:05<61:11:22, 84.56s/it]/root/code/transformers/src/transformers/pipelines/base.py:1080: UserWarning: You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
That should be fine. It only becomes an issue if the KL keeps getting more negative as training continues. @younesbelkada maybe we should only warn when it's e.g. < -1?
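For context, the "generation kwargs" that the warning refers to are the sampling settings passed to the trainer's generate step. A minimal sketch combining the sampling kwargs used in the trl example scripts with the stricter warning threshold suggested above (the `kl_is_alarming` helper is hypothetical, not trl API):

```python
# Sampling settings from the trl PPO example scripts: pure multinomial
# sampling, so the log-probs recorded during generation stay consistent
# with the policy and the KL estimate behaves as expected.
generation_kwargs = {
    "min_length": -1,   # allow generation to stop at EOS
    "top_k": 0.0,       # disable top-k filtering
    "top_p": 1.0,       # disable nucleus filtering
    "do_sample": True,  # sample instead of greedy decoding
}

def kl_is_alarming(kl: float, threshold: float = -1.0) -> bool:
    """Flag only strongly negative KL values, per the suggested < -1 cutoff."""
    return kl < threshold

# The values in the log above (-0.07 to -0.51) would not trigger
# the stricter warning, while e.g. -1.5 would.
print(kl_is_alarming(-0.18))
print(kl_is_alarming(-1.5))
```

With a threshold like this, the mildly negative KL values seen early in training would no longer emit a warning on every step.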
The environment I used is shown in https://github.com/lvwerra/trl/issues/343#issuecomment-1537144381, with `layer_norm_names` in the model definition function.
Looking forward to your reply. Thanks.