MARIO-Math-Reasoning / Super_MARIO

MIT License

About the code #3

Closed liushz closed 6 months ago

liushz commented 6 months ago

Thanks for your great work on math reasoning! I am doing some research in this area, and I wonder when you will release your code. I'm thinking of reproducing your paper myself if you don't plan to publish it soon (say, within a week).

lovecambi commented 6 months ago

> Thanks for your great work on math reasoning! I am doing some research in this area, and I wonder when you will release your code. I'm thinking of reproducing your paper myself if you don't plan to publish it soon (say, within a week).

Thanks. The code will be released in one or two days.

billxbf commented 6 months ago

I saw you've uploaded some decoding code for inference. Thanks! I wonder if you plan to release training code, like loss implementation, etc. in the future?

lovecambi commented 6 months ago

> I saw you've uploaded some decoding code for inference. Thanks! I wonder if you plan to release training code, like loss implementation, etc. in the future?

So far we have only released the MCTS inference code, which is the core part used to construct the training data. For the loss, we simply used the LLM's auto-regressive loss + 0.1 * value loss, where the value loss is plain MSE.

Due to corporate policy, we currently cannot release the training code.
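
Since the training code is not available, here is a minimal sketch of the loss as described in the reply above: the auto-regressive (next-token cross-entropy) loss plus 0.1 times an MSE value loss. This is not the authors' implementation. The function names, the list-based inputs, and the choice to average over tokens and over value predictions are all assumptions made for illustration; the thread does not say at which positions the value head is evaluated or how the terms are reduced.

```python
import math

# Weight on the value loss, as stated in the thread.
VALUE_LOSS_WEIGHT = 0.1


def cross_entropy(logits, targets):
    """Mean next-token negative log-likelihood.

    logits:  list of per-position score lists (one score per vocabulary item)
    targets: list of target token indices, one per position
    Averaging over positions is an assumption.
    """
    total = 0.0
    for row, target in zip(logits, targets):
        # Numerically stable log-sum-exp.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
    return total / len(targets)


def mse(predictions, targets):
    """Mean squared error between predicted and target values."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


def combined_loss(logits, token_targets, value_preds, value_targets,
                  value_weight=VALUE_LOSS_WEIGHT):
    """Auto-regressive loss + value_weight * MSE value loss (hypothetical helper)."""
    lm_loss = cross_entropy(logits, token_targets)
    value_loss = mse(value_preds, value_targets)
    return lm_loss + value_weight * value_loss


if __name__ == "__main__":
    # Toy example: one position, two-token vocabulary, one value prediction.
    loss = combined_loss([[0.0, 0.0]], [0], [0.5], [1.0])
    print(f"combined loss: {loss:.4f}")
```

In a real training loop the same combination would normally be written with a framework's built-in cross-entropy and MSE losses on tensors; the pure-Python version here only makes the arithmetic explicit.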