Shadab442 / dqn-leo-handover-python

DQN based Handover Optimization for LEO Satellites in NTN

question about the code #4

Closed: hui42376 closed this issue 6 months ago

hui42376 commented 10 months ago

I have a question about the compute_reward function of the LeoEnv class. What does "elif torch.all(torch.eq(action, self.previous_action)): reward = 25.0" mean? As I read it, if the current action equals the previous action, the reward is 25.0. Does this mean that when there is no handover, i.e. the currently selected satellite is the same as the previous one, the reward is 25.0? But the current action being equal to the previous action doesn't guarantee that the currently selected satellite is the previous satellite: the satellite index changes with the timestamp, so the same action index may correspond to different satellites. I hope you can take the time to answer my question. Thank you very much.

Shadab442 commented 10 months ago

Great question. In this setup, the current action being equal to the previous action does mean that the currently selected satellite is the previous satellite, because the satellite index does not change with the timestamp. I select 10 satellites at the beginning of the program (init function of the LeoEnv class), which equals the number of candidate satellites defined later (compute_state function of the LeoEnv class). However, if you use more than 10 satellites, the problem you describe will arise. This was a very simple demo problem for my course project, which is why it does not cover all the possibilities. If you want to choose a different number of satellites, you should build a mapping between satellite ID and action ID before computing the states, and use it to interpret the actions later.
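A minimal sketch of the mapping idea described above, in case it helps. The helper names (build_action_map, is_same_satellite) and the satellite IDs are hypothetical and do not come from the repository; the point is only that "no handover" is decided by comparing satellite IDs rather than raw action indices.

```python
# Hypothetical helpers; none of these names exist in the repository.

def build_action_map(candidate_ids):
    """Return a dict mapping action index -> satellite ID for one timestep."""
    return {action: sat_id for action, sat_id in enumerate(candidate_ids)}


def is_same_satellite(action, prev_satellite_id, action_map):
    """True if the chosen action points at the previously serving satellite."""
    return prev_satellite_id is not None and action_map[action] == prev_satellite_id


# Timestep t0: three made-up candidate satellites, the agent picks action 1.
map_t0 = build_action_map([101, 205, 317])
prev_sat = map_t0[1]  # satellite 205 is now serving

# Timestep t1: the candidate list has shifted, so indices mean different satellites.
map_t1 = build_action_map([205, 317, 422])

# Same action index (1) now selects satellite 317, so this is a handover.
same_index = is_same_satellite(1, prev_sat, map_t1)

# Action 0 selects satellite 205 again, so this is "no handover".
stay = is_same_satellite(0, prev_sat, map_t1)
```

With a check like this, the 25.0 reward branch could test the satellite ID instead of torch.eq(action, self.previous_action); whether that fits the rest of compute_reward depends on code not shown in this thread.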

Shadab442 commented 10 months ago

On second thought, I see that I sort the values in the same function (l = np.sort(l)), which may create the issue you mentioned, since sorting changes which satellite each position refers to. I think you can comment out that sorting step; that should resolve the issue.
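To illustrate why the sort matters, here is a small sketch. It assumes that l holds one value per satellite, with position i belonging to satellite i; the numbers are made up, and what l actually contains in compute_state is not shown in this thread.

```python
import numpy as np

# Assumed per-satellite values at two timesteps (position i = satellite i).
values_t0 = np.array([3.0, 1.0, 2.0])
values_t1 = np.array([1.5, 2.5, 0.5])

# np.sort returns only the ordered values; the satellite identity is lost.
sorted_t0 = np.sort(values_t0)
sorted_t1 = np.sort(values_t1)

# np.argsort returns the original positions in sorted order, which shows
# which satellite ends up in each slot after sorting.
order_t0 = np.argsort(values_t0)
order_t1 = np.argsort(values_t1)

# Slot 0 (what action 0 would select) refers to a different satellite
# at each timestep once the values are sorted.
sat_for_action0_t0 = int(order_t0[0])
sat_for_action0_t1 = int(order_t1[0])
```

If sorted ordering is still wanted, keeping the np.argsort result alongside the sorted values is one way to translate an action index back to a satellite; removing the sort, as suggested above, avoids the problem altogether.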

hui42376 commented 10 months ago

Thank you for your answer. I seem to understand what the problem is. Thank you.

Shadab442 commented 6 months ago

Glad to be able to help you!!