Open stianteien opened 1 year ago
Monitor loss for all agents to see if it learns something new.
Loss not symmetric for the same actions..?
Print y_target as well, to see the loss distance from right answer.
Same loss all the time due to random learning patterns from the memory.
Monitor loss for all agents to see if it learns something new.