Closed fanyuzeng closed 6 years ago
https://github.com/tgangwani/GA3C-DeepNavigation/blob/master/NetworkVP.py In that project, self.enc_out is the encoder's output, which follows the network described in the original paper:
self.aux2 = tf.concat((self.enc_out, lstm_outputs_1), axis=1)
self.aux2 = tf.concat((self.aux2, self.aux_inp), axis=1)
Sorry to interrupt. I started training, but it never stopped until I shut it down manually. I saw your last issue. Do you know how to set the number of training episodes? Thanks :)
@fanyuzeng
You are right. It should use the encoder's output, not flat.
@harryzheng93
Sorry, I forgot to add a parameter that controls the maximum number of training episodes.
If you want to do that, take a look at https://github.com/zeus7777777/nav_a3c/blob/master/agent.py#L111 which uses self.global_episode to determine the total number of episodes.
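As a rough illustration of that idea (not the code in agent.py), here is a minimal sketch of capping training with a shared episode counter. The names EpisodeCounter, try_start_episode, worker, and train are hypothetical, and run_episode stands in for whatever one worker does in a single episode:

```python
import threading


class EpisodeCounter:
    """Thread-safe episode counter shared by all workers (hypothetical helper)."""

    def __init__(self, max_episodes):
        self.max_episodes = max_episodes
        self._count = 0
        self._lock = threading.Lock()

    def try_start_episode(self):
        """Reserve one episode; return False once the budget is used up."""
        with self._lock:
            if self._count >= self.max_episodes:
                return False
            self._count += 1
            return True

    @property
    def count(self):
        with self._lock:
            return self._count


def worker(counter, run_episode):
    # Each worker keeps running episodes until the shared budget is exhausted.
    while counter.try_start_episode():
        run_episode()


def train(num_workers, max_episodes, run_episode):
    """Run workers in parallel and stop after max_episodes episodes in total."""
    counter = EpisodeCounter(max_episodes)
    threads = [
        threading.Thread(target=worker, args=(counter, run_episode))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.count
```

Reserving the episode under the lock before running it means the workers cannot overshoot the limit, even when several of them finish at the same time.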
And thanks for commenting on my code, which is not easy to read and contains several mistakes. I'll rewrite this repo in the future when I have some free time.
@zeus7777777 Thank you so much.
https://github.com/zeus7777777/nav_a3c/blob/52d58830a8dd9c8e17522a90305990ef10adde63/network.py#L51
In the paper 'Learning to navigate in complex environments', the input of the second LSTM is made up of the encoder's output together with the previous velocity and the previous action. The encoder's output is fc1 in your code, so why do you use flat instead of fc1? Best regards.
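To make the question concrete, here is a small NumPy sketch of how such an input could be assembled. It mirrors the tf.concat pattern from GA3C-DeepNavigation, which also feeds in the first LSTM's output. The function name second_lstm_input and all sizes (256, 64, 6, 8) are assumptions for illustration, not values taken from either repository:

```python
import numpy as np


def second_lstm_input(enc_out, lstm1_out, prev_velocity, prev_action_onehot):
    """Concatenate per-sample features along axis 1 (hypothetical helper).

    Every argument is a 2-D array of shape (batch, features).
    """
    parts = [enc_out, lstm1_out, prev_velocity, prev_action_onehot]
    batch = enc_out.shape[0]
    for p in parts:
        if p.ndim != 2 or p.shape[0] != batch:
            raise ValueError("all inputs must be 2-D with the same batch size")
    return np.concatenate(parts, axis=1)


# Assumed sizes, for illustration only.
batch = 2
enc_out = np.ones((batch, 256))          # encoder output (fc1), not the flattened conv map
lstm1_out = np.zeros((batch, 64))        # output of the first LSTM
prev_velocity = np.zeros((batch, 6))     # previous velocity
prev_action = np.eye(8)[[3, 5]]          # one-hot previous actions for the two samples

x = second_lstm_input(enc_out, lstm1_out, prev_velocity, prev_action)
print(x.shape)
```

The point is that the encoder's fully connected output goes into the concatenation, rather than the much larger flattened convolutional feature map.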