Haoran-Peng / UAV-RIS_EnergyHarvesting

Thank you for sharing the code; I would like to ask about an error when running it. #1

Closed: geke520 closed this issue 1 year ago

geke520 commented 1 year ago

Thank you very much for taking the time to answer my question. When I run the code, the Stable-Baselines3 source code (DDPG.py) always reports an error saying that "info" is set without a "get" method. How can I solve this problem? I look forward to hearing from you! [image]

Haoran-Peng commented 1 year ago

Thank you for your interest in my work. In my opinion, this issue is caused by the environment and the version of Stable-Baselines3. My project was built on an old version of Stable-Baselines3. DDPG and TD3 were used as baselines in my research, so I didn't modify any part of the DDPG and TD3 implementations in the Stable-Baselines3 library. Your problem, however, is caused by a function inside Stable-Baselines3. I will try to adapt this project to the latest version of Stable-Baselines3.
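
Since the reply points to a version mismatch, it can help to state the installed versions when reporting or debugging the error. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical, and it assumes the packages were installed under their PyPI names `stable-baselines3` and `gym`.

```python
from importlib import metadata


def installed_versions(packages=("stable-baselines3", "gym")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

The printed versions can then be compared with the ones the project was built on.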

geke520 commented 1 year ago

Thank you very much for your answer. I look forward to the new version of the code, and in the meantime I will try to make the changes myself.

Haoran-Peng commented 1 year ago

Dear Geke,

For DDPG and TD3, please make sure the training mode in "Foo_env.py" is set correctly.

class FooEnv(gym.Env):
    metadata = {'render.modes': ['human']}
    def __init__(self, LoadData = True, Train = False, multiUT = True, Trajectory_mode = 'Fermat', MaxStep = 41):

Use "Train=False" for the testing phase and "Train=True" for the training phase.
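
As an illustration of the flag described above, here is a minimal sketch. The `FooEnv` class below is only a hypothetical stand-in that mirrors the constructor signature quoted in this comment (it is not the project's environment), and `make_env` is an assumed helper name.

```python
class FooEnv:
    """Hypothetical stand-in: mirrors only the quoted constructor signature."""

    def __init__(self, LoadData=True, Train=False, multiUT=True,
                 Trajectory_mode='Fermat', MaxStep=41):
        self.Train = Train
        self.MaxStep = MaxStep


def make_env(phase):
    """Build an environment with Train=True for 'train' and Train=False for 'test'."""
    if phase not in ("train", "test"):
        raise ValueError(f"unknown phase: {phase!r}")
    return FooEnv(Train=(phase == "train"))


train_env = make_env("train")  # training phase: Train=True
test_env = make_env("test")    # testing phase: Train=False
```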

For SD3 and exhaustive search, please use Gym version 0.15.3.

Thank you for your interest in my study.

Best, Haoran

geke520 commented 1 year ago

Dear Haoran,

Thank you for your patience. This is the first complete reinforcement learning codebase I have studied. It is very well written, and I will study it carefully.

Best wishes, Geke

geke520 commented 1 year ago

Dear Haoran,

Sorry to bother you again. I would like to ask you some questions about the DDPG, TD3, and SD3 algorithms.

1. TD3 is also "deterministic", so why is its actor network updated in a way similar to the PG update, with Gaussian noise? DDPG: [image] TD3: [image]

2. From an algorithmic point of view, SD3 adds the softmax method to TD3. Why is the delayed update not used in the SD3 algorithm? That is, the SD3 algorithm below doesn't have "if t mod d then". [image]

I would be grateful if you could answer these questions.

Best wishes, Geke

Haoran-Peng commented 1 year ago

1. TD3 is still an RL algorithm, so the Gaussian noise is the usual noise added to explore the action space in reinforcement learning. Details can be found at https://spinningup.openai.com/en/latest/algorithms/td3.html and https://arxiv.org/pdf/1802.09477.pdf
2. From my personal perspective, the authors of the original SD3 paper used the softmax operation to reduce the bias in each step and cause less underestimation. The principle of SD3 can be found in its original paper: https://dl.acm.org/doi/pdf/10.5555/3495724.3496711
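
The two ideas in this answer can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the project's or the papers' implementation: `explore` and `softmax_value` are hypothetical names, the policy is any function returning a list of action components, and the softmax operator is shown over a finite list of Q-values (the SD3 paper works with continuous actions and sampling, which is omitted here).

```python
import math
import random


def explore(policy, state, sigma, low, high, rng=random):
    """Deterministic action plus zero-mean Gaussian exploration noise, clipped to [low, high]."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in policy(state)]


def softmax_value(q_values, beta):
    """Softmax-weighted average of Q-values with inverse temperature beta.

    beta = 0 gives the plain mean; a large beta approaches max(q_values).
    """
    m = max(q_values)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    return sum(w * q for w, q in zip(weights, q_values)) / sum(weights)
```

The policy itself stays deterministic; the noise only perturbs the action that is executed during data collection. The softmax-weighted value always lies between the mean and the maximum of the given Q-values.
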
geke520 commented 1 year ago

Thank you for your patient answer.

geke520 commented 1 year ago

Dear Haoran,

My field of study is related to yours. Could I have your contact information for further academic discussion? I would be very grateful if you could reply.

Best wishes, Geke

Haoran-Peng commented 1 year ago

Sure. WeChat or WhatsApp?

geke520 commented 1 year ago

WeChat is okay, thanks.

Haoran-Peng commented 1 year ago

Please send me an email.
