Open csingh27 opened 3 years ago
Tried: changed the actor loss from actor_loss = log_probs - critic_value to actor_loss = critic_value - log_probs.
Does not work!
Actor loss, critic loss, and value loss are all increasing.
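For reference, here is a minimal sketch of the two sign conventions, assuming this is a SAC-style actor update (the issue does not say which algorithm is used). The function name `sac_actor_loss`, the `alpha` entropy weight, and the sample numbers are all hypothetical and not taken from the project's code. In the usual convention the loss is the batch mean of `alpha * log_probs - critic_value`, so minimizing it pushes the policy toward actions with higher critic values.

```python
def sac_actor_loss(log_probs, critic_values, alpha=1.0):
    """Batch mean of alpha * log_prob - Q (hypothetical helper, SAC-style convention)."""
    if len(log_probs) != len(critic_values):
        raise ValueError("log_probs and critic_values must have the same length")
    terms = [alpha * lp - q for lp, q in zip(log_probs, critic_values)]
    return sum(terms) / len(terms)


# Made-up numbers, only to show the direction of the gradient signal.
log_probs = [-1.0, -2.0]
q_low = [0.0, 0.0]    # critic rates these actions poorly
q_high = [1.0, 1.0]   # critic rates these actions well

loss_low = sac_actor_loss(log_probs, q_low)
loss_high = sac_actor_loss(log_probs, q_high)

# With log_probs - critic_value, better actions give a lower loss.
print(loss_high < loss_low)

# Flipping the sign (critic_value - log_probs) reverses that ordering,
# so a minimizer would be pushed toward lower critic values.
flipped_low = -loss_low
flipped_high = -loss_high
print(flipped_high > flipped_low)
```

If this assumption holds, the flipped sign would be expected to make training worse, which fits the result above. The rising losses with the original sign would then need a different explanation.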
cloth_corner: changed the action to []7 from []3, so that it also takes corner info into account.
Actions might not be defined correctly. NaN might not be enough.
Value loss, critic loss, and actor loss are all still increasing rather than decreasing.