AnugyaSahu / Combinatorial-problems-with-Transformers

Solving complex real-world COPs with limited data / information and deep learning
Apache License 2.0
0 stars 0 forks source link

Investigate TSP and VRP #2

Closed riridi closed 1 year ago

riridi commented 1 year ago

Currently, the TSP and VRP (Vehicle Routing) problems seem interesting. Please take a look at the existing datasets to clarify the following points: are examples generated randomly or is there a fixed number of points (i.e. table with cities), are solutions generated or fixed, how complex are the problems (input/output size), how easy can they be used?

AnugyaSahu commented 1 year ago

1) TSPLIB, OR LIBRARY TSP, SOLOMONS VRP BENCHMARK – TSP datasets - http://mistic.heig-vd.ch/taillard/problemes.dir/tsp/tsp.html
http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/ VRP datasets - http://www.raucq.be/ http://people.brunel.ac.uk/~mastjjb/jeb/orlib/areainfo.html

TSP -

-- Real world datasets or collected through a process not random, in form of coordinates -- Some datasets are with different number of cities , the biggest dataset with consistent number of cities and optimally ordered instances is TSP50 and TSP100 datasets split up into training and test datasets. -- The above datasets have already an optimal solution because they are ordered in the dataset , when we use the dataset we have to shuffle the various instances -- Heuristic methods can also be used to compare to the given optimal solutions -- Dynamic programming method takes too much time but other heuristic methods and deep learning methods tackle the complexity of the dataset , very easy to use since the given dataset is (50, 2) or (100, 2) as an instance in form of coordinates and input becomes easy to manage, output is a sequence of optimally ordered cities in size (50, 1) and optimal cost as a floating number which uses Euclidean distance

VRP -

-- VRP datasets are much more complex with atleast 5 different tables or data frames for depots, vehicles, customer constraints , costs and time. No coordinates are given, even the costs are not in form of matrices -- Datasets need a lot of preprocessing to be ready, for e.g. one hot encoding, joining and merging of tables -- Not easy to use and output size is sequential for each depot or vehicle -- No optimal solutions are given