Learning-based approaches have the potential to yield more effective empirical algorithms for NP-hard problems by learning from large datasets.
[ ] He et al. [19] learned a node selection policy for branch-and-bound algorithms with imitation learning.
[ ] Silver et al. [34, 35] used reinforcement learning to learn strategies for the game Go that achieved unprecedented results.
[ ] Vinyals et al. [37] developed a new neural network architecture called a pointer network, and applied it to small-scale planar Travelling Salesman Problem (TSP) instances with up to 50 nodes.
[ ] Bello et al. [6] used reinforcement learning to train pointer networks to generate solutions for synthetic planar TSP instances with up to 100 nodes.
This paper instead trains a Graph Convolutional Network with supervised learning rather than reinforcement learning, which is known to be a particularly challenging optimization problem.
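As a reminder of what the network computes, one propagation layer of a graph convolutional network can be sketched as follows. This is a generic GCN layer in NumPy, not the paper's code; the function and variable names are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One generic GCN propagation step (illustrative, not the paper's code):
    aggregate neighbour features with symmetric normalisation, then apply
    a learned linear map and a ReLU nonlinearity.
    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) learned weight matrix."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1)                 # degrees of the self-looped graph
    D_inv_sqrt = np.diag(deg ** -0.5)       # D^{-1/2} for normalisation
    H_next = D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W
    return np.maximum(H_next, 0.0)          # ReLU
```

Stacking several such layers lets each vertex's output depend on a multi-hop neighbourhood, which is what allows the final layer to emit a per-vertex probability of belonging to the optimal solution.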
Workflow
First, the input graph is reduced to an equivalent smaller graph. Then it is fed into the graph convolutional network f, which generates multiple probability maps that encode the likelihood of each vertex being in the optimal solution. The probability maps are used to
iteratively label the vertices until all vertices are labelled. A complete labelling corresponds to a leaf in the search tree. Internal nodes in the search tree represent incomplete labellings that are generated along the way. The complete labellings generated by the tree search are refined by rapid local search.
The best result is used as the final output.
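The labelling loop above can be illustrated with a simplified sketch: greedy decoding of a single probability map for an independent-set-style problem, committing one vertex at a time. This is a stand-in for the paper's full tree search (which branches over multiple probability maps and keeps internal nodes as partial labellings); all names here are hypothetical.

```python
def greedy_label(adj, prob):
    """Greedy decoding of one probability map (simplified stand-in for
    the paper's guided tree search) on an independent-set problem:
    repeatedly commit the most likely unlabelled vertex to 1, and its
    neighbours to 0, until every vertex is labelled.
    adj: dict vertex -> set of neighbours; prob: dict vertex -> float."""
    label = {}
    while len(label) < len(adj):
        # pick the unlabelled vertex the network is most confident about
        v = max((u for u in adj if u not in label), key=lambda u: prob[u])
        label[v] = 1
        for w in adj[v]:          # neighbours of a selected vertex are excluded
            label.setdefault(w, 0)
    return label
```

In the actual method, each probability map seeds a different branch of the search tree, so many complete labellings are produced; local search then refines each one before the best is returned.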
2018 NIPS