SimiaoZuo / Transformer-Hawkes-Process

Code for Transformer Hawkes Process, ICML 2020.
MIT License

Ambiguity in calculating log likelihood #14

Open hojjatkarami opened 1 year ago

hojjatkarami commented 1 year ago

Hi, I have a question about how the LL is calculated in Utils.py, at line 49 inside the function "compute_integral_unbiased":

temp_hid = torch.sum(temp_hid * type_mask[:, 1:, :], dim=2, keepdim=True)

Here only the events that actually occurred are considered when calculating the integral, whereas according to the formula the integral should be computed for every event type. I would expect the output dimension of this function to be [Batch_size, Length, Num_types]; we should then sum over all Num_types instead of reducing to only the occurred events.

I believe that underestimating this integral has led to your high overall LL compared to other studies.
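
To make the difference concrete, here is a minimal plain-Python sketch (not the repository's code) that contrasts the two reductions on a toy example. The values and the helper names `occurred_type_only` and `all_types` are hypothetical, and nested lists stand in for tensors of shape [Batch_size, Length, Num_types].

```python
# Hypothetical toy values: 1 sequence, 2 inter-event intervals, 3 event types.
# intensity[b][l][k] stands in for the intensity of type k in interval l.
intensity = [[[0.5, 0.25, 0.25],
              [0.125, 0.75, 0.125]]]
# One-hot mask marking which event type actually occurred after each interval.
type_mask = [[[1, 0, 0],
              [0, 1, 0]]]


def occurred_type_only(intensity, type_mask):
    """Mimic the masked reduction: keep only the occurred type's intensity."""
    return [[sum(lam * m for lam, m in zip(lams, mask))
             for lams, mask in zip(seq_int, seq_mask)]
            for seq_int, seq_mask in zip(intensity, type_mask)]


def all_types(intensity):
    """Sum the intensity over every event type, as the formula requires."""
    return [[sum(lams) for lams in seq] for seq in intensity]


masked = occurred_type_only(intensity, type_mask)  # [[0.5, 0.75]]
total = all_types(intensity)                       # [[1.0, 1.0]]
print(masked, total)
```

Because intensities are non-negative, the masked value can never exceed the sum over all types, so the integral term is underestimated whenever more than one type has non-zero intensity.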

Looking forward to your clarification.

waystogetthere commented 1 year ago

I understand the issue and have inspected the code. My mentor also noticed this error.

The Monte Carlo estimate of the integral is:

$\sum_{j=2}^{N} (t_j - t_{j-1}) \frac{1}{M} \sum_{i=1}^{M} \lambda(u_i)$

where N is the total number of events and M, a hyper-parameter, is the number of sampling points $u_i$ drawn between every two adjacent events.

The code only computes the intensity of the type that actually occurred and ignores the others.

It looks like what the code does is: $\sum_{j=2}^{N} (t_j - t_{j-1}) \frac{1}{M} \sum_{i=1}^{M} \lambda_{k_j}(u_i)$, where $k_j$ is the type of the event that occurs at $t_j$.

However, the correct one should be

$\sum_{j=2}^{N} (t_j - t_{j-1}) \frac{1}{M} \sum_{k=1}^{K} \sum_{i=1}^{M} \lambda_k(u_i)$

where K is the total number of event types and $\lambda_k(u_i)$ is the intensity of type k at $u_i$.
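
As a rough illustration of the estimator summed over all types, here is a small self-contained sketch in plain Python. It is not the repository's implementation: the function name `mc_integral`, the constant intensities, and the event times are all assumptions chosen so that the exact integral is known, and the average over the M samples is written explicitly.

```python
import random


def mc_integral(event_times, intensity_fns, num_samples=100, seed=0):
    """Monte Carlo estimate of the integral of the summed intensity.

    intensity_fns is a list of callables, one per event type (hypothetical
    stand-ins for the model's per-type intensity functions).
    """
    rng = random.Random(seed)
    total = 0.0
    for t_prev, t_cur in zip(event_times[:-1], event_times[1:]):
        # Draw M sample points uniformly inside the interval (t_prev, t_cur).
        samples = [rng.uniform(t_prev, t_cur) for _ in range(num_samples)]
        # Average, over the samples, of the intensity summed over the given types.
        mean_lam = sum(sum(f(u) for f in intensity_fns) for u in samples) / num_samples
        total += (t_cur - t_prev) * mean_lam
    return total


# Toy setup (assumed): two event types with constant intensities 0.5 and 1.5.
fns = [lambda u: 0.5, lambda u: 1.5]
times = [0.0, 1.0, 3.0]

all_types_integral = mc_integral(times, fns)        # 6.0 = (0.5 + 1.5) * 3
one_type_integral = mc_integral(times, fns[:1])     # 1.5 = 0.5 * 3
print(all_types_integral, one_type_integral)
```

Passing only one type's function mimics, in simplified form, restricting the sum to the occurred type; the resulting integral is smaller, which inflates the reported log-likelihood.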

hojjatkarami commented 1 year ago

Thank you for the response.

Once this error is corrected, the NLL metric should be reported again and compared with the other baselines.

waystogetthere commented 1 year ago

Exactly. I have revised the log-likelihood function so that its output now has shape [BATCH, Len, Num_types]. I am currently running the code and hope it works.

ritvik06 commented 1 year ago

In addition to the non-event LL term, the same issue persists in the event LL as well (eq. 8 in the paper). Wherever $\lambda$ is written in the paper, it should be $\sum_k \lambda_k$, summed over all event types k.