yd-kwon / POMO

codes for the paper "POMO: Policy Optimization with Multiple Optima for Reinforcement Learning"

I have a question #7

Closed by WYF99111 8 months ago

WYF99111 commented 12 months ago

Hello, I ran the train_n100.py file directly. Why is the loss negative?