I think your log-likelihood is computed incorrectly, for two reasons. First, you don't shift the sequence: the first hidden state should be used to compute the intensity at the second event, but you use the first hidden state to compute the intensity at the first event, and you do the same when computing the integral term. Second, when you do the Monte Carlo estimation, why do you execute the following line?
temp_time /= (time[:, :-1] + 1).unsqueeze(2)
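
To make the first point concrete, here is a minimal sketch of the shifted pairing I have in mind. It is plain Python rather than the repository's code: `softplus`, `intensity`, and the parameters `w` and `alpha` are hypothetical stand-ins for the model's intensity function, and the hidden states are scalars for simplicity. Hidden state i is paired with event i+1 in both the event term and the Monte Carlo integral, and the samples are drawn uniformly inside each inter-event gap with no extra rescaling.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def intensity(h, dt, w=1.0, alpha=-0.1):
    # Hypothetical intensity: depends on the previous hidden state h
    # and the time dt elapsed since the previous event.
    return softplus(w * h + alpha * dt)


def shifted_log_likelihood(times, hidden, w=1.0, alpha=-0.1,
                           num_samples=1000, seed=0):
    rng = random.Random(seed)
    event_term = 0.0
    integral_term = 0.0
    # Shift: hidden[i] is paired with the event at times[i + 1].
    # The last hidden state has no next event, so it is not used.
    for h_prev, t_prev, t_next in zip(hidden[:-1], times[:-1], times[1:]):
        gap = t_next - t_prev
        # Log-intensity at the next event, given the previous hidden state.
        event_term += math.log(intensity(h_prev, gap, w, alpha))
        # Monte Carlo estimate of the integral over (t_prev, t_next):
        # average the intensity at uniform offsets, then scale by the gap.
        samples = (intensity(h_prev, gap * rng.random(), w, alpha)
                   for _ in range(num_samples))
        integral_term += gap * sum(samples) / num_samples
    return event_term - integral_term


times = [0.0, 1.0, 3.0]
hidden = [0.5, -0.2, 0.7]
print(shifted_log_likelihood(times, hidden))
```

With `alpha=0.0` the intensity is constant on each interval, so the Monte Carlo term can be checked against the closed-form integral, which is a quick way to confirm the pairing is the intended one.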