Can someone walk me through these 3 lines of code?

llSourcell / Reinforcement_Learning_for_Stock_Prediction

This is the code for "Reinforcement Learning for Stock Prediction" by Siraj Raval on YouTube

I'm trying to implement this in TensorFlow.js, but I'm running into an issue with these lines in agent.py:
target_f = self.model.predict(state)
target_f[0][action] = target
self.model.fit(state, target_f, epochs=1, verbose=0)
What does target_f mean? Final target?

What is target_f[0][action]? Why are we setting target_f to the prediction and then assigning something else to it?

thanks, Tiago

It means we only want to update the value of the action that was chosen in the experience. target is the estimated value (reward plus future value) of that action, and target_f is the model's current prediction for state. Note that the loss function is 'mse' and that target_f starts out as the model's own output for state. So in the loss calculation, only the entry at the index of the chosen action has a nonzero error, because its value was overwritten by target in target_f[0][action] = target; every other entry still equals the prediction and contributes zero. Hope this helps.
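
To make this concrete, here is a minimal sketch in plain Python. The numbers, the three-action output, and the names predicted_q and squared_error are made up for illustration; a plain nested list stands in for the array that self.model.predict(state) would return.

```python
# Hypothetical stand-in for self.model.predict(state):
# one row of predicted values, one entry per action (3 actions assumed).
predicted_q = [[0.5, 1.2, -0.3]]

action = 1    # index of the action taken in the stored experience (assumed)
target = 2.0  # assumed estimate: reward plus future value for that action

# Same two steps as in agent.py: start from the prediction,
# then overwrite only the entry for the chosen action.
target_f = [row[:] for row in predicted_q]
target_f[0][action] = target

# Per-output squared error, i.e. what an MSE loss averages over.
squared_error = [(t - p) ** 2 for t, p in zip(target_f[0], predicted_q[0])]

for index, err in enumerate(squared_error):
    print(f"action {index}: squared error = {err:.2f}")
```

Only the entry for the chosen action differs from the prediction, so fitting on target_f pushes just that action's value toward target and leaves the others where they are. The same idea should carry over to TensorFlow.js: take the predicted values, overwrite one entry, and fit on the result.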

Can someone walk me through these 3 lines of code? #18