yangalan123 / anhp-andtt

Codebase for Attentive Neural Hawkes Process (A-NHP) and Attentive Neural Datalog Through Time (A-NDTT)
MIT License

Reproducing the prediction results using the thinning algorithm. #3

Closed JardinDelSol closed 1 year ago

JardinDelSol commented 2 years ago

Hi Alan,

Thank you for your response.

I quite enjoyed the paper you recommended. I believe I have some level of understanding of the thinning algorithm, but I am still having a hard time with the mathematical details behind it.

It also got me wondering about the difference between the method used in the Neural Hawkes Process, where you directly approximate the conditional expectation of t_i, and the thinning algorithm approach. What would be the benefit of using the thinning algorithm?

Again, thank you for sharing such great work!

yangalan123 commented 2 years ago

I believe that I have some level of understanding about the thinning algorithm, but still having a hard time understanding the mathematical details behind it.

I understand that getting the full picture of the thinning algorithm can be hard. Can you summarize what still confuses you?

It got me wondering about the difference between the method used in Neural Hawkes Process where you directly approximate the conditional expectation of t_i and the thinning algorithm approach. What would be the benefit of using the thinning algorithm?

Sorry, I am afraid I do not understand what you mean by "directly approximate the conditional expectation" here. What approximation do you have in mind that differs from Monte Carlo sampling (as in the thinning algorithm)? Do you mean using some prediction head to predict the time? If so, as I explained in #2, that is not the correct way to evaluate a generative model. As for why we want to use the thinning algorithm for prediction, take a look at Section 4 and Section 6.4 of the NHP paper. The short answer is that the (modified) thinning algorithm in the NHP paper provides a flexible (you do not need to worry too much about the model parameterization) and efficient way to obtain minimum Bayes risk predictions.
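To make the idea concrete, here is a minimal sketch of thinning-based prediction for a single event type. It is not the code in this repo and not the modified algorithm from the NHP paper: the function names, the toy intensity, and the constant upper bound `lambda_max` are all assumptions made for illustration. It draws next-event times by rejection sampling and averages them, which gives a Monte Carlo estimate of the expected next-event time (the minimum Bayes risk prediction under squared-error loss).

```python
import math
import random


def sample_next_event_time(intensity, t_prev, lambda_max, rng):
    """Draw one next-event time after t_prev by thinning.

    `intensity` is a callable t -> lambda(t). It is assumed to satisfy
    0 <= lambda(t) <= lambda_max for every t > t_prev, and to stay
    bounded away from zero so that the loop terminates.
    """
    t = t_prev
    while True:
        # Propose the next point of a homogeneous Poisson process
        # with rate lambda_max.
        t += rng.expovariate(lambda_max)
        # Keep the proposal with probability lambda(t) / lambda_max.
        if rng.random() * lambda_max <= intensity(t):
            return t


def predict_next_event_time(intensity, t_prev, lambda_max,
                            n_samples=1000, seed=0):
    """Average thinning samples to estimate the expected next-event time."""
    rng = random.Random(seed)
    samples = [
        sample_next_event_time(intensity, t_prev, lambda_max, rng)
        for _ in range(n_samples)
    ]
    return sum(samples) / n_samples


def toy_intensity(t, t_prev=0.0, mu=0.5, alpha=1.0, beta=2.0):
    """Hypothetical intensity: a base rate plus an exponentially decaying bump."""
    return mu + alpha * math.exp(-beta * (t - t_prev))


# mu + alpha = 1.5 bounds the toy intensity for all t >= t_prev.
prediction = predict_next_event_time(toy_intensity, t_prev=0.0, lambda_max=1.5)
```

The sampler only needs to evaluate the intensity pointwise and to know an upper bound on it, which is why the approach does not depend much on how the model is parameterized. In a neural model the intensity would come from the network's hidden state rather than from a closed-form function like `toy_intensity`.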

Let me know if anything is unclear.

For further reading on how to make predictions with temporal point processes, including a discussion of the pros and cons of each approach, check out Section 2.2.7 of Oleksandr Shchur's Ph.D. thesis.

yangalan123 commented 1 year ago

Closing this issue since there has been no further response. Feel free to re-open it if anything is still unclear.