facebookresearch / mvfst-rl

An asynchronous RL platform for congestion control in QUIC transport protocol. https://arxiv.org/abs/1910.04054.

Questions about input #22

Closed: chengcheng8632 closed this issue 4 years ago

chengcheng8632 commented 4 years ago

Dear author, I'm sorry to disturb you again. In the paper, the input is a 21-dimensional state space, and the reward is not used as an input. But in polybeast.py in the train folder, the line "core_input = torch.cat([x, clipped_reward], dim=-1)" seems to use the reward as an input. Is that right? What is the purpose of this? Looking forward to your answer. Thank you very much.

odelalleau commented 4 years ago

Hi @chengcheng8632, this may indeed not be entirely clear from Appendix A.1 in the paper, but it is mentioned in the "Model" paragraph of Section 4: "The extracted features and the reward are fed into a single-layer LSTM"

Edit: regarding the purpose, it is an extra signal that a recurrent model may take advantage of. I am not sure whether experiments have been run to check if this improves performance in this particular application.
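
For readers who want to see the shapes involved, here is a minimal sketch of the pattern discussed above: the clipped reward is appended as one extra feature to the extracted features before they reach a single-layer LSTM. It is not the actual mvfst-rl model. The sizes `T`, `B`, `FEATURE_DIM`, and `HIDDEN_DIM`, the random stand-in tensors, and the `[-1, 1]` clipping range are all assumptions made for illustration; only the `torch.cat([x, clipped_reward], dim=-1)` line comes from the thread.

```python
import torch
from torch import nn

# Illustrative sizes only (assumptions, not the values used in mvfst-rl).
T, B = 5, 2           # unroll length and batch size
FEATURE_DIM = 32      # hypothetical size of the extracted features
HIDDEN_DIM = 64       # hypothetical LSTM hidden size

# Random stand-ins for the feature extractor output and the raw rewards.
x = torch.randn(T, B, FEATURE_DIM)
reward = torch.randn(T, B) * 10.0

# Clip the reward (range assumed here) and add a trailing dimension
# so that it can be concatenated with the features.
clipped_reward = torch.clamp(reward, -1, 1).unsqueeze(-1)  # (T, B, 1)

# Same concatenation pattern as the line quoted from polybeast.py.
core_input = torch.cat([x, clipped_reward], dim=-1)  # (T, B, FEATURE_DIM + 1)

# The LSTM input size has to account for the extra reward feature.
core = nn.LSTM(input_size=FEATURE_DIM + 1, hidden_size=HIDDEN_DIM, num_layers=1)
core_output, _ = core(core_input)  # (T, B, HIDDEN_DIM)

print(core_input.shape, core_output.shape)
```

The point of the sketch is that the reward only widens the LSTM input by one dimension; the observation features themselves are left unchanged, so the state space described in the paper is not altered.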