Open baichen99 opened 1 year ago
Hi friend! Thanks for your interest in our work. Regarding your question about policy gradient not decreasing the controller loss, I have a few suggestions. First, make sure your code is correct. For example, verify that the tree structure is represented accurately, then fine-tune the tree's parameters with a combination of the PDE loss and the boundary loss and check whether it can reach a very small error. After that, consider tuning the controller's hyperparameters, such as the learning rate and batch size. Let me know if you have any other questions. I hope this helps address your concern.
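As a rough illustration of the first suggestion, here is a minimal sketch of such a sanity check in plain Python. It is not taken from the repository: the toy problem (u''(x) = 2 on (0, 1) with u(0) = 0 and u(1) = 1), the fixed candidate expression, and all function names are assumptions made for illustration. It also uses finite differences where a real implementation would more likely use automatic differentiation. The idea is only that, for a tree whose structure can represent the true solution, fine-tuning its parameters on the PDE loss plus the boundary loss should drive the total loss close to zero. If it does not, the bug is probably in the tree or loss code rather than in the controller.

```python
# Hypothetical sanity check: every name here is illustrative, not from the repository.

def candidate(x, p):
    """Fixed 'tree' u(x) = a*x^2 + b*x + c with trainable parameters a, b, c."""
    a, b, c = p
    return a * x * x + b * x + c

def total_loss(p, h=1e-2):
    """PDE loss + boundary loss for u''(x) = 2, u(0) = 0, u(1) = 1."""
    xs = [0.1 * i for i in range(1, 10)]  # interior collocation points
    pde = 0.0
    for x in xs:
        # Central finite difference approximation of u''(x).
        u_xx = (candidate(x + h, p) - 2 * candidate(x, p) + candidate(x - h, p)) / (h * h)
        pde += (u_xx - 2.0) ** 2
    pde /= len(xs)
    boundary = (candidate(0.0, p) - 0.0) ** 2 + (candidate(1.0, p) - 1.0) ** 2
    return pde + boundary

def numerical_grad(f, p, eps=1e-6):
    """Central-difference gradient of f with respect to each parameter."""
    grad = []
    for i in range(len(p)):
        up = list(p)
        up[i] += eps
        dn = list(p)
        dn[i] -= eps
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

def fine_tune(p, lr=0.05, steps=2000):
    """Plain gradient descent on the combined loss."""
    p = list(p)
    for _ in range(steps):
        g = numerical_grad(total_loss, p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

initial_params = [0.3, -0.5, 0.8]
initial_loss = total_loss(initial_params)
tuned_params = fine_tune(initial_params)
final_loss = total_loss(tuned_params)
print(f"loss before: {initial_loss:.3e}, after: {final_loss:.3e}")
print("tuned a, b, c:", [round(v, 4) for v in tuned_params])
```

The exact solution of this toy problem is u(x) = x^2, so the tuned parameters should approach a = 1, b = 0, c = 0. Running the same kind of check on a hand-built tree in the refactored code separates errors in the loss and tree evaluation from problems in the controller's training.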
Hi, thank you for your guidance. I'll follow your advice to verify my code and adjust the tree parameters using PDE and Boundary loss. I'll share the results post-implementation. For any further issues, I'll reach out again. Thanks again for your help.
Thank you for your great work!
I refactored the code (the repo is here), but using policy gradient during the search phase did not decrease the controller's loss. Do you have any suggestions for me in this regard?