ngruver / llmtime

https://arxiv.org/abs/2310.07820
MIT License

Question about the continuous likelihood. #17

Open Leeseonggye opened 6 months ago

Leeseonggye commented 6 months ago

Hello. I am a master's degree student at Korea University.

First of all, thank you for the inspiration I got from your interesting paper "Large Language Models Are Zero-Shot Time Series Forecasters", and congratulations on it being published at NeurIPS 2023!

I have read the paper many times, but I still can't understand the part about the "continuous likelihood".

My first question is about the expression $p(u_1, \ldots, u_n) = p(u_n \mid u_{n-1}, \ldots, u_0)\, p(u_1 \mid u_0)\, p(u_0)$, which is related to the hierarchical softmax, but I don't understand it completely. If this is meant to be the standard autoregressive factorization of a language model, shouldn't it be $p(u_1, \ldots, u_n) = p(u_n \mid u_{n-1}, \ldots, u_0) \cdots p(u_1 \mid u_0)\, p(u_0)$, with all the intermediate conditionals included?
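Written out in full (taking the sequence to start at $u_0$, as in the conditionals), the chain-rule factorization I have in mind is

$$
p(u_0, u_1, \ldots, u_n) \;=\; p(u_0) \prod_{i=1}^{n} p(u_i \mid u_{i-1}, \ldots, u_0),
$$

i.e. the standard autoregressive decomposition, so I want to confirm the paper means the same thing.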

My second question is about the definition of $U_k(x)$. I think $U_k(x)$ should just be an indicator function, and I don't understand why the $B^n$ factor appears in the definition.
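Concretely (this is just my reading of the paper, so please correct me if I have the bins wrong), I expected something like

$$
U_k(x) \;=\; \mathbf{1}\!\left[x \in \Big[\tfrac{k}{B^n}, \tfrac{k+1}{B^n}\Big)\right]
\qquad\text{rather than}\qquad
U_k(x) \;=\; B^n \,\mathbf{1}\!\left[x \in \Big[\tfrac{k}{B^n}, \tfrac{k+1}{B^n}\Big)\right].
$$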

Thank you.

Leeseonggye commented 6 months ago

One more question, about the prediction horizon for the Darts and Monash archive datasets: I can't find the prediction horizon specified for them. Should I assume the prediction horizon is 1?

mfinzi commented 4 months ago

Hi @Leeseonggye, sorry for the late response. For $n$ digits of precision in base $B$, the width of the uniform distribution $U_k$ is $1/B^n$. The density of $U_k$ has to integrate to 1 over this region of width $1/B^n$, so the density (likelihood) inside the bin is $B^n$.
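In code, converting the discrete probability the model assigns to a digitized value into a continuous density looks roughly like the following. This is a minimal sketch of the idea above, not the repo's actual implementation; the function name and the `scale` argument (for any affine rescaling applied before digitizing) are my own.

```python
import numpy as np

def discrete_to_continuous_logdensity(log_prob_digits, n_digits, base, scale=1.0):
    """Sketch: turn the log-probability of a digitized value into a continuous
    log-density.

    log_prob_digits : total log-probability of the n_digits tokens for one value
    n_digits        : digits of precision (n)
    base            : numeral base (B)
    scale           : factor the original value was divided by before digitizing
                      (assumed; set to 1.0 if no rescaling was used)
    """
    # Each digitized value covers a bin of width 1/B**n in the rescaled space,
    # so the uniform density inside that bin is B**n (it integrates to 1).
    log_bin_density = n_digits * np.log(base)
    # Continuous log-density = discrete log-prob + log(B**n), with a change-of-
    # variables correction back to the original (unscaled) space.
    return log_prob_digits + log_bin_density - np.log(scale)

# Example: a value encoded with 2 base-10 digits and predicted with probability
# 0.05 gets continuous density 0.05 * 10**2 = 5.0.
print(np.exp(discrete_to_continuous_logdensity(np.log(0.05), n_digits=2, base=10)))
```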