Open 4daJKong opened 1 year ago
NVIDIA GPU: T4
NVIDIA Driver Version: 515.105.01
CUDA Version: 11.7
CUDNN Version: 8.9.4.25_cuda11
Operating System: CentOS Linux release 7.6.1810
Python Version (if applicable): 3.8.17
PyTorch Version (if applicable): 2.0.1
Describe the bug
I use facebook opt-350m as the actor model and opt-125m as the critic model, and I successfully finished step 1 and step 2. When I evaluate them with the `eval.py` scripts provided in evaluation_scripts, the actor model returns an SFT result for step 1. In step 2 I trained for 5 epochs; neither the training log nor the output of `eval.py` shows a problem, so I guess there is no fatal error up to this point.
After that, I continued with step 3, but when I add `--enable_hybrid_engine`, the run fails. I tried two workarounds: adding a statement in the `train_rlhf` function of `ppo_trainer.py`, just before the return value, or setting `ACTOR_ZERO_STAGE` and `CRITIC_ZERO_STAGE` to 0. Neither worked, so I had to remove `--enable_hybrid_engine`.
Without that flag the training runs, but training.log (attached below together with the actor model config: training.log, config.txt) contains messages that I did not see in other people's training results on Hugging Face.
I was wondering whether the `--enable_hybrid_engine` parameter is what causes this problem. If so, how can I fix it? If not, why do I get a weird result when I use my trained actor model? It outputs:
[{'generated_text': 'Do you know Microsoft?ekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrekrek...'}]
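To check many generations for this kind of collapse without reading each one by eye, a small helper can look for a short substring repeated back-to-back. This is only a minimal sketch: `longest_repeat_run` and `looks_degenerate` are hypothetical names, not part of DeepSpeed-Chat or transformers, and the thresholds (`max_unit=8`, `min_repeats=10`) are arbitrary assumptions.

```python
import re
from typing import Tuple


def longest_repeat_run(text: str, max_unit: int = 8) -> Tuple[str, int]:
    """Return (unit, count) for the short substring with the most
    consecutive repetitions in `text`. Hypothetical helper."""
    best_unit, best_count = "", 0
    for size in range(1, max_unit + 1):
        # Capture a unit of exactly `size` characters, then require at
        # least one immediate repetition of that same unit.
        pattern = re.compile(r"(.{%d})\1+" % size, re.DOTALL)
        for match in pattern.finditer(text):
            count = len(match.group(0)) // size
            if count > best_count:
                best_unit, best_count = match.group(1), count
    return best_unit, best_count


def looks_degenerate(text: str, min_repeats: int = 10) -> bool:
    """Flag text in which some short unit repeats at least `min_repeats`
    times in a row. The threshold is an assumption, not a standard."""
    _, count = longest_repeat_run(text)
    return count >= min_repeats
```

Applied to the `generated_text` field of each result, `looks_degenerate` would flag output like the one above, while a normal sentence passes.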