tkn-tub / ns3-gym

ns3-gym - The Playground for Reinforcement Learning in Networking Research
GNU General Public License v2.0

Time-based Interface Between Environment and Agent #20

Closed shahrukhkasi closed 4 years ago

shahrukhkasi commented 4 years ago

Hi,

I have been experimenting with test_tcp.py using a time-based interface between the environment and the agent. I set tcpEnvTimeStep = 0.1 s in sim.cc so that the agent's action is executed and the state is sent back to the agent every 0.1 seconds. The resulting output looks out of sync: the agent and the simulation environment seem to update values at different times. For example, this is the output of the first 10 step calls, printing the episode, step, state (16 elements), reward, done, and info:

Episode:  0  step:  1 [1, 1, 363241, 2, 4294967295, 340, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  2 [1, 1, 463241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 182571, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  3 [1, 1, 563241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  4 [1, 1, 663241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  5 [1, 1, 763241, 2, 4294967295, 342, 340, 340, 340, 1, 1, 205578, 180000, 382528, 382528, 3400] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  6 [1, 1, 863241, 2, 4294967295, 342, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  7 [1, 1, 963241, 2, 4294967295, 342, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  8 [1, 1, 1063241, 2, 4294967295, 342, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  9 [1, 1, 1163241, 2, 4294967295, 343, 340, 340, 340, 1, 1, 227755, 180000, 382528, 382528, 3400] 0.0 False {}
----------------------------------------------------------------------------------------
Episode:  0  step:  10 [1, 1, 1263241, 2, 4294967295, 343, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}

At each time step I increase the 6th element of the state by 1, but the output does not reflect this: the value goes up only every few steps instead of on every step. Can someone please explain the likely cause of this desynchronization? When I set tcpEnvTimeStep = 1 or higher, it works as expected.
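
To quantify the mismatch, here is a minimal sketch that parses the printed lines and reports how much a given state element changes between consecutive step calls. It only post-processes the log text and does not call any ns3-gym API. The helper names (`parse_states`, `deltas`) are hypothetical, and the sample contains just the first five step lines from the output above.

```python
import ast
import re

# First five step lines copied verbatim from the output above.
LOG = """\
Episode:  0  step:  1 [1, 1, 363241, 2, 4294967295, 340, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
Episode:  0  step:  2 [1, 1, 463241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 182571, 0, 0] 0.0 False {}
Episode:  0  step:  3 [1, 1, 563241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
Episode:  0  step:  4 [1, 1, 663241, 2, 4294967295, 341, 340, 0, 0, 0, 0, 0, 180000, 0, 0, 0] 0.0 False {}
Episode:  0  step:  5 [1, 1, 763241, 2, 4294967295, 342, 340, 340, 340, 1, 1, 205578, 180000, 382528, 382528, 3400] 0.0 False {}
"""

# Matches the step number and the bracketed state list on each printed line.
LINE_RE = re.compile(r"step:\s+(\d+)\s+(\[[^\]]*\])")


def parse_states(log):
    """Return a list of (step, state) pairs parsed from the printed log."""
    pairs = []
    for line in log.splitlines():
        match = LINE_RE.search(line)
        if match:
            state = ast.literal_eval(match.group(2))  # safe parse of the list literal
            pairs.append((int(match.group(1)), state))
    return pairs


def deltas(pairs, index):
    """Change of state[index] between consecutive step calls, keyed by the later step."""
    return [
        (step_b, state_b[index] - state_a[index])
        for (_, state_a), (step_b, state_b) in zip(pairs, pairs[1:])
    ]


pairs = parse_states(LOG)
print("3rd element deltas:", deltas(pairs, 2))
print("6th element deltas:", deltas(pairs, 5))
```

On this sample the 3rd element advances by exactly 100000 on every step, which would match the 0.1 s step if that field is the simulation time in microseconds (an assumption on my part). The 6th element increases by 1 at steps 2 and 5 and stays unchanged at steps 3 and 4. So the step calls themselves appear evenly spaced, and it is the counter update that lags behind them.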