a question about mask rule

mengchengTang / DRL-Energy-optimal-Routing-for-Electric-Vehicles

Energy-optimal Routing for Electric Vehicles using Deep Reinforcement Learning with Transformer

a question about mask rule #3

Closed mumuorMUMU closed 3 months ago

mumuorMUMU commented 3 months ago

Excuse me, I'd like to ask whether a situation can arise in which every action is infeasible under the mask rule described in the paper. Looking forward to your answer. Thanks!

mengchengTang commented 3 months ago

When all actions are infeasible, the depot node is opened (made selectable). When every instance in a batch can only choose the depot, the round ends.
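
For readers following along, the rule above might be sketched roughly as follows. This is only an illustration and is not taken from the repository: the names `DEPOT`, `apply_depot_fallback`, and `round_finished` are hypothetical, the depot is assumed to sit at node index 0, and plain Python lists stand in for the batched tensors a real implementation would use.

```python
DEPOT = 0  # assumption: node index 0 is the depot


def apply_depot_fallback(feasible):
    """feasible[b][n] is True when instance b may move to node n.

    Returns a copy in which the depot is opened for every instance
    that has no feasible action at all.
    """
    result = []
    for row in feasible:
        row = list(row)
        if not any(row):
            row[DEPOT] = True  # open the depot as the only choice
        result.append(row)
    return result


def round_finished(feasible):
    """True when every instance in the batch can only choose the depot."""
    return all(row[DEPOT] and not any(row[1:]) for row in feasible)


batch = [
    [False, False, False],  # nothing feasible: the depot gets opened
    [False, True, False],   # one customer is still reachable
]
opened = apply_depot_fallback(batch)
print(opened)
print(round_finished(opened))
```

In this toy batch the second instance still has a reachable customer, so the round would continue.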

mumuorMUMU commented 3 months ago

Thank you for your reply. Based on your answer, should I understand that this round of exploration has failed? And can this also happen during the model's inference phase? Looking forward to your reply!

mengchengTang commented 3 months ago

The EVRP instances are carefully designed to prevent exploration failures, so a situation in which all nodes are masked does not occur. In the case I described in my previous response, a vehicle chooses only the depot because all of its customer nodes have been served while other instances in the same batch have not yet finished exploring. That instance therefore waits for the others to finish by repeatedly selecting the depot node.
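
The waiting behaviour described here could look something like the toy rollout below. It is a sketch under stated assumptions, not the repository's code: `rollout` is a hypothetical helper, customers are simply visited in index order in place of the learned policy, energy and charging constraints are ignored, and the depot is assumed to be node 0.

```python
DEPOT = 0  # assumption: node index 0 is the depot


def rollout(customers_per_instance):
    """Step all instances of a batch together.

    Each instance visits its customers in index order (a stand-in for
    the policy). Once an instance has served all of its customers, it
    keeps selecting the depot until the whole batch is done.
    """
    remaining = [list(range(1, n + 1)) for n in customers_per_instance]
    routes = [[] for _ in customers_per_instance]
    # Continue while at least one instance still has unserved customers.
    while any(remaining):
        for todo, route in zip(remaining, routes):
            route.append(todo.pop(0) if todo else DEPOT)
    return routes


routes = rollout([1, 3])
print(routes)  # [[1, 0, 0], [1, 2, 3]]
```

The first instance finishes after one step and then pads its route with depot selections while the second instance serves its remaining customers.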

mumuorMUMU commented 3 months ago

That resolves my doubts. Thank you very much for your reply!