dcgym / iroko

A platform to test reinforcement learning policies in the datacenter setting.
Apache License 2.0

The Output files are empty? #35

Closed iBacklight closed 1 year ago

iBacklight commented 3 years ago

Hi,

Your work is very clever! However, I ran into some problems when inspecting the results folder. Each host's server.out, server.err, client.out, client.err, and ctrl.out files are all empty, and each host's ctrl.err says "no such file or directory". This happens with the PPO, PG, DDPG, and DCTCP agents (the ones I have used so far in my current work). The program seems to be running smoothly. [image]

Any ideas on how to solve this? I even tried commenting out the _start_controller function, but those files still show nothing. The .csv files, however, contain plenty of data, which I guess is generated by goben?

fruffy commented 3 years ago

Thank you! What is the command that you are running the tests with?

iBacklight commented 3 years ago

Thanks for the quick reply! It's: sudo -E python3 run_ray.py -a PPO

fruffy commented 3 years ago

Oh I see. Yes, it is expected that the .out and .err files are empty. These are debug files, and the hosts are configured to be silent during normal operation. Otherwise, the files would quickly fill up and exhaust disk space. All the relevant information is collected in statistics.npy and the CSV files.

If you run benchmark.py you should be able to run a sequence of algorithms and plot.py will plot the results for you. You can check these files to understand how the data is processed.
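
A quick way to confirm this split between silent debug files and populated data files is to group the contents of the results folder by size. The sketch below is a minimal helper and not part of iroko. The function name `summarize_results` is hypothetical, and nothing is assumed about the folder layout beyond it being a directory of files.

```python
from pathlib import Path


def summarize_results(results_dir):
    """Group files under results_dir into empty and non-empty, keyed by suffix.

    Hypothetical helper: it knows nothing about iroko's actual output
    layout and simply walks whatever directory it is given.
    """
    summary = {"empty": {}, "non_empty": {}}
    for path in sorted(Path(results_dir).rglob("*")):
        if not path.is_file():
            continue
        # A size of zero bytes marks a file that was created but never written.
        bucket = "empty" if path.stat().st_size == 0 else "non_empty"
        summary[bucket].setdefault(path.suffix, []).append(path.name)
    return summary
```

If the run behaved as described above, the .out and .err suffixes should appear only under "empty", while .csv and .npy should appear under "non_empty".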

iBacklight commented 3 years ago

Thanks a lot! I'm running it right now. However, I still wonder why the hosts' ctrl.err files contain a "no such file or directory" message, like this: [image]

Thanks again!

fruffy commented 3 years ago

This should be fine. It has been a while since I last looked at this project but this message should be caused by https://github.com/dcgym/iroko/blob/master/contrib/go_ctrl/go_ctrl.go#L129 which removes any previous qdiscs attached to the interface. At startup there is no qdisc attached. However, when we restart over multiple episodes, preceding qdiscs need to be cleared.
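
The pattern described here, clearing any leftover qdisc before configuring an interface and tolerating the failure on the first run, can be sketched in Python as follows. This is an illustration only: the linked controller is written in Go and may use a different mechanism. The helper name `clear_root_qdisc` and its injectable `run` parameter are assumptions. The sketch shells out to the standard `tc qdisc del dev <iface> root` command.

```python
import subprocess


def clear_root_qdisc(iface, run=subprocess.run):
    """Try to delete the root qdisc on iface; return True if one was removed.

    Any non-zero exit status is treated as "nothing to remove", which is
    the expected outcome the first time an interface is configured. This
    sketch does not distinguish other failures such as missing privileges.
    """
    result = run(
        ["tc", "qdisc", "del", "dev", iface, "root"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Harmless at startup: report the message instead of raising.
        print(f"ignoring tc error on {iface}: {result.stderr.strip()}")
        return False
    return True
```

Passing a stand-in for `run` makes the helper testable without root privileges or a real network interface.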

I just did a quick test run and the control does work despite the error message.

[image: dumbbell_udp_5e]

iBacklight commented 3 years ago

Thanks for your brilliant contribution. I have run a few algorithms over the past few days. It looks like, for some simple topologies and simple traffic patterns, DCTCP performs better than most of the RL algorithms?

I think this will be helpful for my understanding of SDN and RL applications in network development. Thanks again.

fruffy commented 3 years ago

Yes, that would not be surprising. All the algorithms we use are out of the box and unoptimized. DCTCP, on the other hand, has received significant improvements in recent kernel versions (4.15+). However, in specific congestion scenarios, PPO may still be able to outperform DCTCP.