llSourcell / Reinforcement_Learning_for_Stock_Prediction

This is the code for "Reinforcement Learning for Stock Prediction" By Siraj Raval on Youtube

Can someone walk me through these 3 lines of code? #18

Open talvasconcelos opened 6 years ago

talvasconcelos commented 6 years ago

I'm trying to implement this in TensorFlow.js but I'm running into some issues with these lines in agent.py:

target_f = self.model.predict(state)
target_f[0][action] = target
self.model.fit(state, target_f, epochs=1, verbose=0)

What does target_f mean? Final target?

What is target[action]? Why are we setting target_f to the prediction and then assigning it something else?

Thanks, Tiago

laox1ao commented 5 years ago

It means we only want to update the value of the action that was chosen in that experience. target is the estimated value of that action (the reward plus the estimated future value), and target_f starts out as the model's current prediction for state. Note that the loss function is 'mse'. Because target_f is the model's own output for state, the error in the loss calculation is zero at every index except the chosen action's, whose value was overwritten by target in target_f[0][action] = target. So the fit step only pushes that one action's value toward target. Hope this helps.
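To make that concrete, here is a minimal sketch in plain Python (no Keras). The three Q-values, the action index, and the target value are made-up numbers for illustration, not values from the repo, and `predicted` only stands in for what self.model.predict(state) would return.

```python
# Hypothetical Q-values the model currently predicts for one state
# (a stand-in for self.model.predict(state), shape 1 x n_actions).
predicted = [[0.2, 0.5, -0.1]]

action = 1    # assumed: the action taken in the replayed experience
target = 0.9  # assumed: reward plus estimated future value for that action

# Copy the prediction, then overwrite only the chosen action's entry.
target_f = [row[:] for row in predicted]
target_f[0][action] = target

# Per-output error that a mean-squared-error loss would be built from.
errors = [t - p for t, p in zip(target_f[0], predicted[0])]
mse = sum(e * e for e in errors) / len(errors)

print("target_f:", target_f[0])
print("errors:  ", errors)
print("mse:     ", mse)
```

The errors for the two actions that were not taken are exactly zero, since their targets are the model's own predictions, so training on (state, target_f) only moves the output for the chosen action.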