zcaicaros / manager-worker-mtsptwr

Official implementation of paper "Learning to Solve Multiple-TSP with Time Window and Rejections via Deep Reinforcement Learning"

about 'rejection rate' #4

Open gaofusiji opened 5 months ago

gaofusiji commented 5 months ago

Hello, sorry to bother you. While reading the code, I noticed that the rejection rate is obtained through the function test_learned_model in policy.py. However, it seems that test_learned_model directly returns the rejection rate without performing the calculation mentioned in the paper, which involves dividing the number of rejected points by the total number of points. Could you please kindly point me in the right direction and let me know where this calculation process is addressed in the code?

zcaicaros commented 5 months ago
cost, _, rej, length, rej_count = model(data, beta=beta)

Referring to the above code (line 122 of policy.py), we directly call the worker model, which returns the rejection rate. That is, given a sub-route (`data` here), the rejection rate for this sub-route is the number of rejected nodes divided by the total number of nodes in the sub-route. For example, if there are 10 nodes in `data` and five of them are rejected, then the rejection rate for this sub-route is 5/10 = 0.5.
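To make the definition concrete, here is a minimal sketch of the per-subroute calculation. The function name `rejection_rate` and the boolean-flag representation are illustrative assumptions, not part of the repository's (lost) worker code:

```python
# Illustrative sketch (not the actual worker model code):
# a sub-route's rejection rate is (# rejected nodes) / (# nodes).
def rejection_rate(rejected_flags):
    """rejected_flags: one boolean per node in the sub-route,
    True if that node was rejected by the worker."""
    return sum(rejected_flags) / len(rejected_flags)

# 10 nodes, 5 of which are rejected:
print(rejection_rate([True] * 5 + [False] * 5))  # -> 0.5
```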

Unfortunately, the code for our worker model is lost (the first author did the research on the worker model; I asked him to publish the code, but he was unable to do so). The logic, however, is as described above: the manager agent divides the total set of nodes into clusters, and for each cluster we call the worker model to solve it and return the rejection rate.
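Since the worker code itself is lost, the following is only a hedged sketch of the manager-worker loop described above; `worker` and `overall_rejection_rate` are hypothetical names, and the worker is stubbed out to return a rejected-node count per cluster:

```python
# Hypothetical sketch of the manager-worker decomposition:
# the manager partitions the nodes into clusters, the worker solves
# each cluster and reports how many of its nodes were rejected, and
# the overall rejection rate is total rejected / total nodes.
def overall_rejection_rate(clusters, worker):
    """clusters: list of node lists produced by the manager agent.
    worker: callable returning the rejected-node count for one cluster."""
    total_rejected = 0
    total_nodes = 0
    for cluster in clusters:
        total_rejected += worker(cluster)
        total_nodes += len(cluster)
    return total_rejected / total_nodes

# Toy stand-in worker that rejects one node per cluster:
clusters = [[0, 1, 2, 3], [4, 5]]
print(overall_rejection_rate(clusters, lambda c: 1))  # 2 rejected of 6
```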

Below are the references for the worker model, which is essentially based on "Attention, Learn to Solve Routing Problems!" with the reward modified to accommodate the rejection rate.

[ref1, worker agent paper] https://ieeexplore.ieee.org/abstract/document/9207026/
[ref2, AM for solving routing problems] https://arxiv.org/abs/1803.08475