pdrozdowski / TSPLib.Net

C# .Net wrapper for TSPLib (Travelling Salesman Problems Library collected by Heidelberg university)
Other
62 stars 35 forks source link

Hi i'd be grateful if you could help me #18

Closed sandwizard closed 4 years ago

sandwizard commented 4 years ago

I got an algorithm for tsp which uses a new approach i'd like to test and want to use your project to get the test data.I need help understanding how to extract a 2d array from a tsp instance.Am just noob pls help. if you like to checkout the algorithm its link is below https://github.com/sandwizard/traveling-salesman-console-ver

pdrozdowski commented 4 years ago

Hi, Have no idea what you mean by "how to extract a 2*2 matrix from a tsp instance" ?? Can you give more broad description of what you are trying to do? Regards, Pawel

sandwizard commented 4 years ago

Thanks for the reply ,opps I had a typo,I meant a 2d array. I understand that .tsp instances have graph data but not In 2d array format my question is how i would go about getting the data in that format

On Sat, 8 Aug 2020, 9:18 pm Pawel Drozdowski, notifications@github.com wrote:

Hi, Have no idea what you mean by "how to extract a 2*2 matrix from a tsp instance" ?? Can you give more broad description of what you are trying to do? Regards, Pawel

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-670940626, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YAB4JSIIUJMS5JGEZ33R7VUEJANCNFSM4PXYX7JA .

pdrozdowski commented 4 years ago

Can you give an example of what is has vs what you need?

sandwizard commented 4 years ago

the .tsp files i got from http://www.math.uwaterloo.ca/tsp/world/countries.html with a .tps extension have the following format

NAME : ar9152 COMMENT : 9152 locations in Argentina COMMENT : Derived from National Imagery and Mapping Agency data TYPE : TSP DIMENSION : 9152 EDGE_WEIGHT_TYPE : EUC_2D NODE_COORD_SECTION 1 36266.6667 62550.0000 2 34600.0000 58633.3333 3 51650.0000 72300.0000 4 37800.0000 67683.3333 5 35428.0556 60174.1667 6 34583.3333 68550.0000 .. etc

What i need is a 2d array where the entries are the edge weights between the nodes of their corresponding position in the 2d array., i.e entry at 3,4 is the edge weight between node 3 and node 4 static public long[,] DistanceMatrix = { {0, 2451, 713, 1018, 1631, 1374, 2408, 213, 2571, 875, 1420, 2145, 1972}, {2451, 0, 1745, 1524, 831, 1240, 959, 2596, 403, 1589, 1374, 357, 579}, {713, 1745, 0, 355, 920, 803, 1737, 851, 1858, 262, 940, 1453, 1260}, {1018, 1524, 355, 0, 700, 862, 1395, 1123, 1584, 466, 1056, 1280, 987}, {1631, 831, 920, 700, 0, 663, 1021, 1769, 949, 796, 879, 586, 371}, {1374, 1240, 803, 862, 663, 0, 1681, 1551, 1765, 547, 225, 887, 999}, {2408, 959, 1737, 1395, 1021, 1681, 0, 8000, 678, 1724, 1891, 1114, 701}, {213, 2596, 851, 1123, 1769, 1551, 8000, 0, 2699, 1038, 1605, 2300, 2099}, {2571, 403, 1858, 1584, 949, 1765, 678, 2699, 0, 1744, 1645, 653, 600}, {875, 1589, 262, 466, 796, 547, 1724, 1038, 1744, 0, 679 ,1272, 1162}, {1420, 1374, 940, 1056, 879, 225, 1891, 1605, 1645, 679, 0, 1017, 1200}, {2145, 357, 1453, 1280, 586, 887, 1114, 2300, 653, 1272, 1017, 0, 504}, {1972, 579, 1260, 987, 371, 999, 701, 2099, 600, 1162, 1200, 504, 0}, };

pdrozdowski commented 4 years ago

Hi, The problem with your representation is that it doesn't scale, as the size of the matrix is n^2 of the size of the problem. So for instance in the example you gave problem has : DIMENSION : 9152 EDGE_WEIGHT_TYPE : EUC_2D So your matrix will be like 9152x9152 in size. Problem can run easily into hundreds of thousands of nodes so you will not have even enough memory to store it, not to mention processing. That's why TSP format is storing rather nodes and provides info what type of measure function should be used to calculate the distance. Also in terms of algorithms the trick is to find best path while now knowing all the distances of everything to everything. Think about the world outside, you don't need to know what is closer to Delhi - London or Tokyo to take best route from your home to supermarket and back ;) hoping that helps.

sandwizard commented 4 years ago

i see . i Read here about a 1 dimensional array with the use of smart pointers on this site .

https://acrogenesis.com/or-tools/documentation/user_manual/manual/tsp/tsp.html .

########## 9.3.3. The TSPData class

The TSPData class basically encapsulates a 2-dimensional matrix containing the distances between all nodes. For efficiency reasons, we use a 1-dimensional matrix with a smart pointer defined in the header base/scoped_ptr.h:

private: scopedarray matrix;

To mimic the behaviour of a 2-dimensional matrix, we use:

int64 MatrixIndex(RoutingModel::NodeIndex from, RoutingModel::NodeIndex to) const { return (from * size_ + to).value();}

Notice how we cast the RoutingModel::NodeIndex into an int64 by calling its value() method.

The 1-dimensional matrix is made of the columns of the virtual 2-dimensional matrix placed one after the other.

########### But it seems to be only in C++.and my algorithm Is in c# ,is there any way to do this in c#.Also in my algorithm i use the 2 dimensional matrix only once to create new sorted Dictionaries for each Node .and store them in a static 1D array .would this also cause issues with space.If so how should i approach this in my algorithm.Some Context,In my algorithm the main idea was to have a sorted information on each node and check whether the 2 minimum value edges for each node were the also the same minimum value edge for the corresponding node and if not increase the search range by 1 while preventing redundant checks .Just As you mentioned with this approach i wouldn't need to know all distances from every node to node node but only the first few Distances after the initial sort.for more information about the algorithm check checkout the readme attached or the github link provided before.

Once again thanks for taking the time to help me .

On Sun, Aug 9, 2020 at 2:37 AM Pawel Drozdowski notifications@github.com wrote:

Hi, The problem with your representation is that it doesn't scale, as the size of the matrix is n^2 of the size of the problem. So for instance in the example you gave problem has : DIMENSION : 9152 EDGE_WEIGHT_TYPE : EUC_2D So your matrix will be like 9152x9152 in size. Problem can run easily into hundreds of thousands of nodes so you will not have even enough memory to store it, not to mention processing. That's why TSP format is storing rather nodes and provides info what type of measure function should be used to calculate the distance. Also in terms of algorithms the trick is to find best path while now knowing all the distances of everything to everything. Think about the world outside, you don't need to know what is closer to Delhi - London or Tokyo to take best route from your home to supermarket and back ;) hoping that helps.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-670971398, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YAHYE6S6QVIJ4S2Z4J3R7WZQNANCNFSM4PXYX7JA .

pdrozdowski commented 4 years ago

Hello, so it sounds that in your approach you don't need to store much data at all. Pick node, then compute distance to all other nodes, then take minimum. Next get the corresponding node and do the same. (That's the approach you described). Not sure what to do later... Although your computation complexity will be huge as for each of the n nodes in N space you need to calc distance with every other node twice. On big problem you will get old before it gets calculated. On the other hand caching the sub results of distances you have already calculated gets you back to memory issues described before. To help you out i'd suggest to think of some kind of inference in your algorithm, or some way to split big problems to smaller ones and combine sub results. Also you'd need ask yourself a question if you are about to find the best match - or just do some heuristic, as having 99% accuracy quickly is often better than 100% but with days to have it calculated. Another approach you can use as a benchmark for your is just random approach - you start with a completely random path, then randomly swap 2-nodes in patch each time checking total distance of the path. Caching will work here well, and pretty quickly you will get some decent result. If you lucky you can even get the best path ;) Also is serves good point of comparison how much your solution outperforms it in terms of time, memory and accuracy. Hope that helps.

sandwizard commented 4 years ago

You mentioned calculating the distance between nodes twice .this is where If I had a 2d matrix I wouldn't need to even calculate the distance but just sort every row set and store it in the form of a dictionaries along with its column index signifying the other node and then just fetch the first 2 values for comparison.The problem being for testing large number of nodes i got no clue how to get that 2d matrix from .tsp instance

On Sun, 9 Aug 2020, 12:29 pm Pawel Drozdowski, notifications@github.com wrote:

Hello, so it sounds that in your approach you don't need to store much data at all. Pick node, then compute distance to all other nodes, then take minimum. Next get the corresponding node and do the same. (That's the approach you described). Not sure what to do later... Although your computation complexity will be huge as for each of the n nodes in N space you need to calc distance with every other node twice. On big problem you will get old before it gets calculated. On the other hand caching the sub results of distances you have already calculated gets you back to memory issues described before. To help you out i'd suggest to think of some kind of inference in your algorithm, or some way to split big problems to smaller ones and combine sub results. Also you'd need ask yourself a question if you are about to find the best match - or just do some heuristic, as having 99% accuracy quickly is often better than 100% but with days to have it calculated. Another approach you can use as a benchmark for your is just random approach - you start with a completely random path, then randomly swap 2-nodes in patch each time checking total distance of the path. Caching will work here well, and pretty quickly you will get some decent result. If you lucky you can even get the best path ;) Also is serves good point of comparison how much your solution outperforms it in terms of time, memory and accuracy. Hope that helps.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-671013354, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YABAP4ZFWX3DXNGQMXDR7Y647ANCNFSM4PXYX7JA .

sandwizard commented 4 years ago

Since I use linq to sort. I read that worst case is n^2 the complexity for setting things up before finding the path should be n^2×n since I only sort and store the row sets along with their indexes .In the algorithm in the file named complete graph.cs where I have a function called findhamiltonycle() which compares the first 2 min values edges and checks whether the corresponding node also has the same edge as a minimum value edge if so the edge is selected and count of no of edges found is increased by one this is run till 2 edges are found.Here the function is conditionaly recursive and checking the time complexity is beyond me thus I need to test it on large sets of data

On Sun, 9 Aug 2020, 12:47 pm tenzin jamtsho, 10zinjts@gmail.com wrote:

You mentioned calculating the distance between nodes twice .this is where If I had a 2d matrix I wouldn't need to even calculate the distance but just sort every row set and store it in the form of a dictionaries along with its column index signifying the other node and then just fetch the first 2 values for comparison.The problem being for testing large number of nodes i got no clue how to get that 2d matrix from .tsp instance

On Sun, 9 Aug 2020, 12:29 pm Pawel Drozdowski, notifications@github.com wrote:

Hello, so it sounds that in your approach you don't need to store much data at all. Pick node, then compute distance to all other nodes, then take minimum. Next get the corresponding node and do the same. (That's the approach you described). Not sure what to do later... Although your computation complexity will be huge as for each of the n nodes in N space you need to calc distance with every other node twice. On big problem you will get old before it gets calculated. On the other hand caching the sub results of distances you have already calculated gets you back to memory issues described before. To help you out i'd suggest to think of some kind of inference in your algorithm, or some way to split big problems to smaller ones and combine sub results. Also you'd need ask yourself a question if you are about to find the best match - or just do some heuristic, as having 99% accuracy quickly is often better than 100% but with days to have it calculated. Another approach you can use as a benchmark for your is just random approach - you start with a completely random path, then randomly swap 2-nodes in patch each time checking total distance of the path. Caching will work here well, and pretty quickly you will get some decent result. If you lucky you can even get the best path ;) Also is serves good point of comparison how much your solution outperforms it in terms of time, memory and accuracy. Hope that helps.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-671013354, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YABAP4ZFWX3DXNGQMXDR7Y647ANCNFSM4PXYX7JA .

pdrozdowski commented 4 years ago

Your 2d matrix has the distances calculated. To get it you need to (pseudo code): Foreach nodeA in nodes Foreach nodeB in nodes Matrix[a,b] = calc distance (a,b) So you see, you need to do calcs, and you need to declare nxn matrix in memory to store it. So you'll face all the mentioned challenges.

sandwizard commented 4 years ago

Ha ha yes I see my bad.Can you explain how exactly .tsp files store the data

On Sun, 9 Aug 2020, 1:21 pm Pawel Drozdowski, notifications@github.com wrote:

Your 2d matrix has the distances calculated. To get it you need to (pseudo code): Foreach nodeA in nodes Foreach nodeB in nodes Matrix[a,b] = calc distance (a,b) So you see, you need to do calcs, and you need to declare nxn matrix in memory to store it. So you'll face all the mentioned challenges.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-671017644, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YAFSTSWMZFE2YFVTHOTR7ZFABANCNFSM4PXYX7JA .

sandwizard commented 4 years ago

Also how bad is n^3 time complexity

On Sun, 9 Aug 2020, 1:26 pm tenzin jamtsho, 10zinjts@gmail.com wrote:

Ha ha yes I see my bad.Can you explain how exactly .tsp files store the data

On Sun, 9 Aug 2020, 1:21 pm Pawel Drozdowski, notifications@github.com wrote:

Your 2d matrix has the distances calculated. To get it you need to (pseudo code): Foreach nodeA in nodes Foreach nodeB in nodes Matrix[a,b] = calc distance (a,b) So you see, you need to do calcs, and you need to declare nxn matrix in memory to store it. So you'll face all the mentioned challenges.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-671017644, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YAFSTSWMZFE2YFVTHOTR7ZFABANCNFSM4PXYX7JA .

sandwizard commented 4 years ago

Would it be okay to call you on discord and talk about this this back and forth is making it hard .

On Sun, 9 Aug 2020, 1:47 pm tenzin jamtsho, 10zinjts@gmail.com wrote:

Also how bad is n^3 time complexity

On Sun, 9 Aug 2020, 1:26 pm tenzin jamtsho, 10zinjts@gmail.com wrote:

Ha ha yes I see my bad.Can you explain how exactly .tsp files store the data

On Sun, 9 Aug 2020, 1:21 pm Pawel Drozdowski, notifications@github.com wrote:

Your 2d matrix has the distances calculated. To get it you need to (pseudo code): Foreach nodeA in nodes Foreach nodeB in nodes Matrix[a,b] = calc distance (a,b) So you see, you need to do calcs, and you need to declare nxn matrix in memory to store it. So you'll face all the mentioned challenges.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pdrozdowski/TSPLib.Net/issues/18#issuecomment-671017644, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIR5YAFSTSWMZFE2YFVTHOTR7ZFABANCNFSM4PXYX7JA .

pdrozdowski commented 4 years ago

Hi, I think the file format is pretty obvious, but here are details.

NAME : ar9152 <----- name of the file COMMENT : 9152 locations in Argentina <---- whatever author wanted to comment on the file COMMENT : Derived from National Imagery and Mapping Agency data <---- whatever author wanted to comment on the file TYPE : TSP <----- type of the format as there are few versions of it DIMENSION : 9152 <----- how many nodes problem has EDGE_WEIGHT_TYPE : EUC_2D <------ what tyoe of distance method applies to this problem NODE_COORD_SECTION <---- here start secion with nodes 1 36266.6667 62550.0000 <---- order number, X coor, y coord 2 34600.0000 58633.3333

you will see also versions with 3D, with different distance function, and even there is one format with just edges but for smaller problems.

Closing this thread.