RealVNF / distributed-drl-coordination

Distributed Online Service Coordination Using Deep Reinforcement Learning

Why is 'flow_size_shape' so small? Is it practical? #1

Closed bradley-code-again closed 2 years ago

bradley-code-again commented 2 years ago

Hi, thanks for your open-source code. I am trying to understand it and may cite your paper in my future work. I am wondering why you set 'flow_size_shape' so small in the deterministic setting. With the default 'flow_dr_mean' of 1, a flow lasts only 1 time unit, while the link propagation delay and the VNF processing delay are several time units. Doesn't that mean flows are released very quickly after they are created? Can I set 'flow_size_shape' much larger than the propagation delay? Also, what is 'run_duration' in your code and how do I use it? Its default value is 100; why? Thanks.

stefanbschneider commented 2 years ago

Hi @bradley-code-again, thanks for your interest and sorry for the late response. These are good questions!

To be honest, I do not know exactly what is realistic, and I believe there are different scenarios in practice, ranging from few, long flows (large backups, long video streams, semantically grouped traffic) to many, short flows (many short, low-level requests). In this work, we focused on the latter, i.e., many, short flows, which is the bigger challenge in terms of scalability and is also the motivation for such a distributed approach. Since we were not sure which values are realistic, we kept unit flow size and unit data rate (both 1) as the defaults.
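To make the trade-off concrete, here is a minimal sketch, assuming (as the question does) that a flow's duration is simply its size divided by its data rate. The function name and the delay value are hypothetical and are not taken from the repository; they only illustrate how a unit-size flow compares with a larger one.

```python
def flow_duration(flow_size: float, flow_dr: float) -> float:
    """Assumed model: a flow lasts flow_size / flow_dr time units."""
    if flow_dr <= 0:
        raise ValueError("flow_dr must be positive")
    return flow_size / flow_dr


# Hypothetical link propagation delay, in the same time units.
link_delay = 5.0

# Default-like setting: unit size and unit data rate give a very short flow.
short_flow = flow_duration(flow_size=1.0, flow_dr=1.0)

# A larger flow size (hypothetical value) gives a flow that outlasts the delay.
long_flow = flow_duration(flow_size=50.0, flow_dr=1.0)

print("short flow:", short_flow, "shorter than link delay:", short_flow < link_delay)
print("long flow:", long_flow, "longer than link delay:", long_flow > link_delay)
```

Under this assumption, raising the flow size moves the scenario from many, short flows towards few, long flows.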

I believe run_duration is a leftover from our previous, centralized approach DeepCoord. It is not used in this distributed DRL approach and you can ignore it (@qarawlus Is that correct? Do you remember?). In the DeepCoord approach, the centralized DRL agent configures coordination and scheduling rules that are deployed and applied locally in the network. To coordinate network and services, the agent periodically updates these rules at the interval set by run_duration. Here, 100 is just an example (e.g., corresponding to 100 ms), which seems like a plausible value.
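As a rough illustration of the periodic updates described above, the sketch below lists the points in time at which a centralized agent would refresh its rules for a given run_duration. The function and the simulation horizon are hypothetical and are not part of DeepCoord or this repository.

```python
from typing import List


def rule_update_times(run_duration: int, horizon: int) -> List[int]:
    """Return the times (up to and including horizon) at which rules are updated,
    assuming one update after every run_duration time units."""
    if run_duration <= 0:
        raise ValueError("run_duration must be positive")
    return list(range(run_duration, horizon + 1, run_duration))


# With run_duration = 100 and a hypothetical horizon of 350 time units,
# the rules would be updated three times.
updates = rule_update_times(run_duration=100, horizon=350)
print(updates)
```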

Does this help?

bradley-code-again commented 2 years ago

Hi Stefan, many thanks for your answer, that's very helpful. I'll have another read of your paper and code. Thanks.