v-drone / Single_Agent_minigrid


test result #1

Open seventheli opened 3 years ago

seventheli commented 3 years ago

model structure

SimpleStack(
  (map_decode1): Sequential(
    (0): Dense(None -> 64, Activation(tanh))
    (1): Dense(None -> 64, Activation(tanh))
    (2): Dense(None -> 64, Activation(tanh))
    (3): Dense(None -> 32, Activation(tanh))
    (4): Dense(None -> 16, Activation(tanh))
  )
  (decision_making): Sequential(
    (0): Dense(None -> 3, Activation(sigmoid))
  )
)
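To make the layer shapes above concrete, here is a minimal framework-free sketch of a forward pass through the same stack: five tanh layers (64, 64, 64, 32, 16 units) followed by a 3-unit sigmoid head. It is not the code from this repository. The input size is left unspecified in the printout (`None`), so `input_dim=49` is an assumption (a hypothetical flattened 7x7 view), and the helper names and weight initialization are illustrative only.

```python
import math
import random

# Layer widths copied from the printed structure.
MAP_DECODE_SIZES = [64, 64, 64, 32, 16]
NUM_ACTIONS = 3


def init_dense(in_dim, out_dim, rng):
    """Return (weights, bias) for one dense layer (assumed initialization)."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def dense(x, layer, activation):
    """Apply activation(W x + b) to the input vector x."""
    weights, bias = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_simple_stack(input_dim, seed=0):
    """Build the map_decode1 layers and the decision_making head."""
    rng = random.Random(seed)
    layers = []
    in_dim = input_dim
    for out_dim in MAP_DECODE_SIZES:
        layers.append(init_dense(in_dim, out_dim, rng))
        in_dim = out_dim
    head = init_dense(in_dim, NUM_ACTIONS, rng)
    return layers, head


def forward(model, x):
    """tanh through map_decode1, then sigmoid in decision_making."""
    layers, head = model
    for layer in layers:
        x = dense(x, layer, math.tanh)
    return dense(x, head, sigmoid)


# input_dim=49 is a hypothetical observation size, not taken from the issue.
model = build_simple_stack(input_dim=49)
scores = forward(model, [0.0] * 49)
```

`scores` holds three values in (0, 1), one per output unit. Because the head uses a sigmoid rather than a softmax, the three values are independent and need not sum to 1.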

seventheli commented 3 years ago

reward function

If the agent reaches the goal, the reward is 1; otherwise it is 0.
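A minimal sketch of this sparse reward, assuming positions are represented as `(x, y)` tuples; the function name and the position representation are hypothetical and not taken from the repository.

```python
def sparse_goal_reward(agent_pos, goal_pos):
    """Return 1 when the agent stands on the goal cell, otherwise 0."""
    return 1 if agent_pos == goal_pos else 0


# Hypothetical positions for illustration.
at_goal = sparse_goal_reward((3, 3), (3, 3))
elsewhere = sparse_goal_reward((1, 2), (3, 3))
```

Here `at_goal` is 1 and `elsewhere` is 0, so every step before the goal carries no learning signal on its own.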


Average steps used over 50000 rounds: 13.22246 / 10.54966

seventheli commented 3 years ago

reward function


Average steps used: 8.87086 / 8.8623

seventheli commented 3 years ago

reward function


Average steps used: 8.274410253235093 / 8.246965178308901

seventheli commented 3 years ago

reward function


Average steps used: 9.19134 / 46.77606