RealVNF / DeepCoord

DeepCoord: Self-Learning Network and Service Coordination Using Deep Reinforcement Learning
MIT License

How to implement the DeepCoord with SAC agent #2

Closed huanghub6224 closed 3 years ago

huanghub6224 commented 3 years ago

Hi, thank you for your response to the last issue, it works.
I wonder whether I could use other DRL algorithms (e.g., soft actor-critic, SAC) to realize it.

I saw the option "# Agent type: SAC or DDPG" in the agent_config YAML files. (1) Can I just change the agent type by setting it to SAC? (2) Or should I write another Python file that defines the SAC agent, like rlsp_ddpg.py? I noticed that src/rlsp/agents contains rlsp_ddpg.py but no rlsp_sac.py. (3) If I want to implement DeepCoord with a SAC agent, what other changes need to be made?

Looking forward to your reply!

stefanbschneider commented 3 years ago

Hi, thanks for your interest in our work.

Indeed, it's possible to use other DRL algorithms that support continuous actions (like SAC) to realize our approach.

The code published here does not support any other algorithm than DDPG though, so simply changing agent type to SAC does not work here.

We ran some preliminary experiments with SAC as well, so the configuration option is a leftover from these earlier experiments. However, we found it a bit harder to make SAC learn successfully and therefore chose DDPG for the publication.

If you do want to try SAC or another algorithm, you'd have to

I made a public copy of our rlsp_sac.py, which depends on stable_baselines, so you have a starting point: https://gist.github.com/stefanbschneider/ebbc1f9f4da273aa8886cf0ab92ec081 But, as I said, we never got SAC to the same performance as our DDPG agent, so no guarantees that it'll work :)

huanghub6224 commented 3 years ago

What a pleasure to receive your reply and to know that someone on the other side of the earth is doing research similar to ours.

Now we have rlsp_sac.py, but the agent_config YAML files for SAC still seem to be missing. SAC and DDPG have different parameter settings, such as learning_starts, train_freq, ent_coef, target_entropy, etc., so we can't use the YAML files in res/config/agent/ddpg directly. It would be great if these files could also be shared. Thank you again!

BTW, I'm currently pursuing a Ph.D. degree in artificial intelligence at the Macau University of Science and Technology, Macau, China. If you are interested, here is some of my related research: [1] Cai J, Huang Z, Liao L, et al. APPM: Adaptive Parallel Processing Mechanism for Service Function Chains[J]. IEEE Transactions on Network and Service Management, 2021, 18(2): 1540-1555. [2] Cai J, Huang Z, Luo J, et al. Composing and deploying parallelized service function chains[J]. Journal of Network and Computer Applications, 2020: 102637. https://doi.org/10.1016/j.jnca.2020.102637.

stefanbschneider commented 3 years ago

Nice! Great to hear from a colleague in the same field - on the other side of the world :)

Here's an example configuration for SAC: https://gist.github.com/stefanbschneider/ac7b6e5a3f1c7454623dec900b6fa12f

I have not tested it; I just copied it over, so you may still need to make some adjustments if you get errors. And even if you don't, you might need to check the hyperparameters etc. to get the agent to learn successfully.
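A quick sanity check along these lines might help catch missing SAC settings before starting a training run. This is only a sketch: the key names are the ones mentioned earlier in this thread, while `SAC_KEYS`, `missing_sac_keys`, and the example dict are hypothetical and not part of DeepCoord or stable_baselines. It assumes the YAML file has already been loaded into a plain Python dict.

```python
# Hypothetical helper for illustration; not part of DeepCoord or stable_baselines.
# SAC-specific settings named earlier in this thread.
SAC_KEYS = ("learning_starts", "train_freq", "ent_coef", "target_entropy")


def missing_sac_keys(agent_config):
    """Return the SAC-specific keys that are absent from a loaded config dict."""
    return [key for key in SAC_KEYS if key not in agent_config]


# A config dict as it might look after loading a DDPG-style YAML file.
# The keys and values here are made up for the example.
ddpg_style_config = {"gamma": 0.99, "batch_size": 64}

missing = missing_sac_keys(ddpg_style_config)
if missing:
    print("Config is missing SAC settings:", ", ".join(missing))
```

An empty result only means the keys are present; whether their values let the agent learn is a separate question.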

Either way, I hope this helps!

stefanbschneider commented 3 years ago

I'm assuming this issue is resolved?

huanghub6224 commented 3 years ago

Yeah, problem solved. We're training the model and comparing the results. I really appreciate your help.

stefanbschneider commented 3 years ago

Great to hear. I would really appreciate a citation if/when you publish your work :) https://github.com/RealVNF/DeepCoord#citation

huanghub6224 commented 3 years ago

Of course! It's worth letting more people know about this research.