Closed JardinDelSol closed 1 year ago
I believe that I have some level of understanding of the thinning algorithm, but I am still having a hard time understanding the mathematical details behind it.
I understand that getting the full picture of the thinning algorithm can be hard. Can you summarize what still confuses you?
It got me wondering about the difference between the method used in Neural Hawkes Process, where you directly approximate the conditional expectation of t_i, and the thinning algorithm approach. What would be the benefit of using the thinning algorithm?
Sorry, I am afraid that I do not understand what you mean by "directly approximate the conditional expectation" here. What approximation do you want to use that is different from Monte Carlo sampling (as in the thinning algorithm)? Do you mean using some prediction head to predict the time directly? If so, as I explained in #2, this is not the correct way to evaluate a generative model. As for why we want to use the thinning algorithm for prediction, you can take a look at Section 4 and Section 6.4 of the NHP paper. The short answer is that the (modified) thinning algorithm in the NHP paper provides a flexible (you do not need to worry too much about the model parameterization) and efficient way to obtain minimum Bayes risk predictions.
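To make the Monte Carlo idea above concrete, here is a minimal sketch of basic thinning for a single event type, followed by averaging the samples (the sample mean is the minimum Bayes risk prediction under squared-error loss). It is not the modified thinning algorithm from the NHP paper. The exponential-kernel Hawkes intensity, its parameters `MU`, `ALPHA`, `BETA`, and all function names are assumptions chosen for illustration, and the upper bound is assumed to dominate the intensity after the last event.

```python
import math
import random

# Hypothetical parameters of an exponential-kernel Hawkes intensity,
# used here only as a stand-in for a trained model.
MU, ALPHA, BETA = 0.5, 0.8, 1.0


def hawkes_intensity(t, history):
    """Intensity at time t given past event times strictly before t."""
    return MU + sum(ALPHA * math.exp(-BETA * (t - s)) for s in history if s < t)


def sample_next_event_time(intensity, t_last, upper_bound, rng):
    """Draw one next-event time by thinning.

    Assumes intensity(t) <= upper_bound for every t > t_last.
    """
    t = t_last
    while True:
        # Propose the next point of a homogeneous Poisson process
        # with rate upper_bound.
        t += rng.expovariate(upper_bound)
        # Accept the proposal with probability intensity(t) / upper_bound.
        if rng.random() * upper_bound <= intensity(t):
            return t


def predict_next_event_time(intensity, t_last, upper_bound, num_samples=1000, seed=0):
    """Average thinning samples to estimate the expected next event time."""
    rng = random.Random(seed)
    samples = [
        sample_next_event_time(intensity, t_last, upper_bound, rng)
        for _ in range(num_samples)
    ]
    return sum(samples) / len(samples)


history = [0.2, 0.9, 1.4]
t_last = history[-1]
# This intensity only decays between events, so its value just after t_last
# (past contributions plus the jump ALPHA from the last event) is a valid bound.
bound = hawkes_intensity(t_last, history) + ALPHA
prediction = predict_next_event_time(
    lambda t: hawkes_intensity(t, history), t_last, bound
)
print(f"predicted next event time: {prediction:.3f}")
```

The sampler only needs to evaluate the intensity and to know an upper bound on it, which is why the approach does not depend much on how the model is parameterized.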
Let me know if anything is unclear.
Other recommended reading on how to make predictions with temporal point processes, including a discussion of the pros and cons of each approach: see Section 2.2.7 of the Ph.D. thesis by Oleksandr Shchur.
Closing this issue since there has been no further response. Feel free to re-open it if you still feel confused.
Hi Alan,
Thank you for your response.
I quite enjoyed the paper you recommended. I believe that I have some level of understanding of the thinning algorithm, but I am still having a hard time understanding the mathematical details behind it.
Also, it got me wondering about the difference between the method used in Neural Hawkes Process, where you directly approximate the conditional expectation of t_i, and the thinning algorithm approach. What would be the benefit of using the thinning algorithm?
Again, thank you for sharing such great work!
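For reference, the quantity the question refers to can be written out with the standard temporal point process identities. The notation is an assumption made here, not taken from the thread: \lambda(t) is the total intensity given the history up to t_{i-1}, p_i(t) is the density of the next event time, and t_i^{(m)} are M samples drawn by thinning.

```latex
p_i(t) = \lambda(t)\,\exp\!\left(-\int_{t_{i-1}}^{t} \lambda(s)\,ds\right), \qquad t > t_{i-1}

\hat{t}_i = \mathbb{E}\left[t_i \mid \mathcal{H}_{i-1}\right]
          = \int_{t_{i-1}}^{\infty} t\, p_i(t)\,dt
          \approx \frac{1}{M}\sum_{m=1}^{M} t_i^{(m)}
```

Read this way, averaging thinning samples is itself a Monte Carlo estimate of the conditional expectation, which avoids evaluating the nested integral numerically.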