Question regarding Monte Carlo Tree Search

slyphnford commented 3 years ago

Good Day, about the MCTS implementation, I am actually hard stuck and do not know what to do (even with the pseudocode provided).

Question 1 : Do you have any suggestions to find any references for this implementation? I even search online and there is only calculation and it is very hard for me to draft it out.

Question 2 : When I look through the pseudocode, what exactly is policy and global policy? I'm actually confused.

Question 3: How do actually pass the best_tour and new_tour? There is no parameter passing or reference for them, or do I have to add to the parameter?

Thank you for taking your time to answer our questions!

NeverOnTimeSdnBhd commented 3 years ago

Hi

Question 1 and 2: The pseudocode given is actually Monte Carlo Tree Search with Nested Rollout Policy Adaptation (MCTS with NRPA). It is a variant of MCTS. It will be a little bit different from what is explained in the question. I will try to find time to upload more resources and explanations about that (hopefully I can find the time and make it ASAP). Maybe while you are waiting me update the resources you can try Google search to have an overview of what it is first.

Please go to this directory for more information: https://github.com/NeverOnTimeSdnBhd/Delivery-Instances/tree/main/mcts I will update again to make it more readable and easier to understand.

Question 3: If you notice line 16 I write initialize best_tour with positive infinity cost and line 22 there is new_tour = search (level-1, iteration). Here are the initializations of best_tour and new_tour. The tour object here is just an object stores information about a tour (for example: tour total cost, every vehicle route, and etc. it might be vary depend on you).

slyphnford commented 3 years ago

Good Day, about MCTS implementation,

I'm actually confused and I hope we can get more resources and these few days I try to search on Google and there is none talking about MCTS with Nested Rollout Policy Adaptation(only one documentation that is similar to your explanation.)

I don't think I can understand and write the code just according to your pseudocode and I can't create an image of what exactly should I do although I tried to read through what you answer the last three questions. It will be good if you can provide us the sample input and output like the whole process.

All I can find from the internet are the implementation of Monte Carlo Tree Search using Python and if using Java are solving non-vehicle routings problems like tic tac toe and etc. I have gone through a lot of video and post and I still cannot understand what I should do to solve this problem

Thank you for taking your time to read my post.

NeverOnTimeSdnBhd commented 3 years ago

Hi

Sorry for late reply because I just done compiling the information. Please check here for explanation on MCTS with NRPA: MCTS with NRPA Explanation Please check again the pseudocode as I added the input and output explanation for all functions: MCTS with NRPA pseudocode

**The only method you should call to start searching is the search method, other methods are all used in the search method or other methods used in the search method.

Should you have any more question can raise your issue here again. My suggestion is just try follow the pseudocode without fully understand what it do first. No need to understand fully on the concept as it takes time.

slyphnford commented 3 years ago

Good Day, about MCTS implementation,

During the rollout function, (line 76) What are the rules that the current Route does not violate? Can you explain further?

NeverOnTimeSdnBhd commented 3 years ago

Hi

In basic requirement, the only rule that it may violate is capacity, aka. adding a new stop to the current vehicle exceeds the vehicle capacity.

However, if you choose to add some other rules stated in extra requirements, such as time window (violation: a stop added to vehicle is reached after its time window) then you have to take care of those too.

slyphnford commented 3 years ago

Good Day, about MCTS implementation, During the rollout function, (line 78 and line 80), can u explain further about what exactly is check and visit? (give a situation)

Thank you for spending time reading and answer my question as always

slyphnford commented 3 years ago

Good Day, about MCTS implementation, During the search function (line 29), I already done calling new_tour function with search function "new_tour = search (level-1, iteration)" when it goes to the next line and it still says new_tour it's null.

Sorry for keep interrupting you and I don't know why my new_tour is still null.

Thank you for spending time reading and answer my question as always

NeverOnTimeSdnBhd commented 3 years ago

Hi

Given a situation as below: We are currently finding next move for a route (for a vehicle). Assuming current route is 0 -> 3 -> 1 -> and the remaining capacity of this vehicle is 3. Let say there are 2 possible next moves from stop 1 (the last stop in current route), 2 and 4, stop 2 is having demand size of 4 while stop 4 is having demand size of 2. If the first selected next move is stop 2, it will found that adding stop 2 to current route is violating the capacity rule, thus marking this stop/node as checked, meaning the stop/node is explored but not yet added to the route (as it is not suitable for this route). After this it will then select stop 4 as next stop and find out adding this stop/node to current route is not violating capacity rule thus we can safely add stop 4 to current route and mark this stop/node as visited, meaning the stop/node is explored and added to the route.

So in short, checked means explored but not yet added to any route of the tour while visited means explored and is added into one of the route in the tour (so a visited stop/node shouldn't be explored anymore when searching for upcoming routes).

NeverOnTimeSdnBhd commented 3 years ago

Good Day, about MCTS implementation, During the search function (line 29), I already done calling new_tour function with search function "new_tour = search (level-1, iteration)" when it goes to the next line and it still says new_tour it's null.

Sorry for keep interrupting you and I don't know why my new_tour is still null.

Thank you for spending time reading and answer my question as always

Did you return a tour object for the search method in line 37? One more thing, new_tour isn't a method, it should be an object.

slyphnford commented 3 years ago

Yes we do return best_tour at line 37.

NeverOnTimeSdnBhd commented 3 years ago

If so, try check if you initialize your best_tour variable correctly as in line 23.

slyphnford commented 3 years ago

Good Day, about MCTS implementation, I wrote all the codes according to ur pseudocode. But I didn't get my desire output. It would be nice if you can take a look at my logic and help me spot where I wrong and I can fix it asap. I've actually stuck especially the part where I cannot direct to the second route (when it is full).

Thank you for being so responsive!

NeverOnTimeSdnBhd commented 3 years ago

Hi

Sorry I am currently not available so I might respond to you later in the night.

I had saved your flow chart and I think you can remove it from your comment first as it is part of your effort/answer for the assignment so that other people won't simply take the picture and use it in their assignment without informing you😉

NeverOnTimeSdnBhd commented 3 years ago

Hi

Here are the parts I found that the logic is not that correct, but I am not sure whether it is you draw it wrongly only (but code it correctly) or you code it wrongly too.

Regarding your problem about couldn't proceed to second route, be aware of your checked information. checked is to inform the searching that certain nodes are explored and found useless so we shouldn't explored again in this route. However, you need to aware that when you proceed to searching in next route, those checked node should be getting unchecked (clear the check marking) so that the searching know it should explore those node (as those checked nodes are not yet added to any route in the tour). This is one of the reason I could think of for your problem. Hope it helps.

My pseudocode didn't write 100% things inside, some information like the logic parts (as what mention above about checked part) which are not so related to MCTS concepts are omitted, so be careful when you do the coding when referring to the pseudocode.

Ps. Once you read this message please download the image and inform me so that I can delete it.

slyphnford commented 3 years ago

feel free to delete the images, thank you very much and i will try to implement later. As always thank you for spending ur time answering my questions.

slyphnford commented 3 years ago

Good Day, about MCTS implementation, I stuck at the last part where I need to define my best cost

This happens when I'm trying to do the last part where it suddenly defines 0 as the best cost (I don't know where the 0.0 come from and I swear I didn't print anything from my code (I even check my whole code to find where it prints from). I actually feel like my adapt function is wrong and it might cause some problem like this. This messes up my final output where I'm trying to display the best cost.

Another source of error might be doing the iteration will have empty trees? If empty trees might cause no flow and it will have 0 costs ( this might be another problem)

The last question is for the adapt part, I really don't understand the pseudocode and hope u can give some examples or references for it. (I really don't know how to code it out that part. For now I just simply code that part according to my senses. As always thank you for spending ur time answering my questions.

slyphnford commented 3 years ago

Today morning I implement without hardcode the input: where it still has the same problem. The image above is an extra reference for the problem I started yesterday night.

As always thank you for spending ur time answering my questions.

NeverOnTimeSdnBhd commented 3 years ago

Hi

It is a bit hard for me to figure out where is the mistake you made this way so I couldn't give any suggestions like this... For the adapt function, I think the pseudocode is quite straight forward, let me clear up some variables that might make you confuse:

visited is a local variable, meaning it don't have any relation with the visited variable in other functions, this variable is just to check which node is visited when we do the calculation as you can see in line 58 we set a stop to visited after we make some calculation based on that stop.
possible move that can made by stop means every move that can be made from current stop. For example, we have N = 5, then for stop 1, possible moves are all stop excluding itself (aka. stop 0, 2, 3, 4), it does not related to the tour you search of and demand size.

Hope this can help you code and debug the adapt function better.

slyphnford commented 3 years ago

Good Day, for the MCTS implementation, Can you provide us some of your output? from the instances u gave. Exp : (n6-c29.txt) output. I wanna test the best tour it returns and see whether it is the same as the desired output. If can give more about other files it is the best.

As always thank you for spending ur time answering my questions.

NeverOnTimeSdnBhd commented 3 years ago

Yep sure,

Actually I planned to do that for a while but I am just worry many of you too 'care' about getting the exact same answer as my sample answer.

Be aware that you might get different result for Basic Simulation and MCTS Simulation for large N sample depend on your resources (computing power)

Anyway, I will create a new folder that contains sample answers ASAP

NeverOnTimeSdnBhd / Delivery-Instances

Question regarding Monte Carlo Tree Search #2