Policy Gradient Issue: ValueError: Shapes (20, 1) and (20, 2) are incompatible

adventuresinML / adventures-in-ml-code

This repository holds all the code for the site http://www.adventuresinmachinelearning.com


Policy Gradient Issue: ValueError: Shapes (20, 1) and (20, 2) are incompatible #27

Open danisch-khurshid-creator opened 4 years ago

danisch-khurshid-creator commented 4 years ago

Hi. The code is not working. It fails on this line: loss = network.train_on_batch(states, discounted_rewards)

asokraju commented 4 years ago

Try this, it should work:

    target_actions = np.array([[1 if a == i else 0 for i in range(2)] for a in actions])
    loss = network.train_on_batch(states, target_actions, sample_weight=discounted_rewards)
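For anyone wondering why the suggestion above changes the shapes, here is a minimal NumPy-only sketch. It assumes the network ends in a 2-unit softmax and is compiled with categorical cross-entropy, which the thread does not confirm but which would explain the (20, 2) side of the error. Under that assumption the targets must be one-hot rows of shape (N, 2), and the discounted rewards belong in the per-sample weights rather than in the targets. The helper names and the sample data below are made up for illustration, and the loss function only approximates what Keras would compute with its default batch-mean reduction.

```python
import numpy as np

NUM_ACTIONS = 2  # assumption: two discrete actions, matching the (20, 2) shape in the error


def one_hot_actions(actions, num_actions=NUM_ACTIONS):
    """Map integer action indices of shape (N,) to one-hot rows of shape (N, num_actions)."""
    actions = np.asarray(actions, dtype=int)
    return np.eye(num_actions, dtype=np.float32)[actions]


def weighted_cross_entropy(probs, targets, weights):
    """Batch mean of weight * categorical cross-entropy (hypothetical helper, not a Keras API)."""
    probs = np.clip(probs, 1e-7, 1.0)  # avoid log(0)
    per_sample = -np.sum(targets * np.log(probs), axis=1)
    return float(np.mean(weights * per_sample))


# Made-up batch of 4 steps, for illustration only.
actions = [0, 1, 1, 0]
discounted_rewards = np.array([1.0, 0.5, -0.5, 2.0], dtype=np.float32)
probs = np.array([[0.7, 0.3],
                  [0.4, 0.6],
                  [0.2, 0.8],
                  [0.5, 0.5]], dtype=np.float32)  # stand-in for the softmax output

target_actions = one_hot_actions(actions)
print(target_actions.shape)      # (4, 2): same shape as the network output
print(discounted_rewards.shape)  # (4,): one weight per sample

loss = weighted_cross_entropy(probs, target_actions, discounted_rewards)
print(loss)
```

With the weights applied this way, each step contributes its discounted reward times the negative log-probability of the action taken, which is the usual policy gradient surrogate loss. That appears to be what passing sample_weight=discounted_rewards achieves in the fix above.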