-
The canonical dataset does not incorporate spatial information, which is an important signal for prediction. Other research with a similar target has used methods like [spatial autor…
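As a rough illustration of the kind of feature this would add, here is a minimal sketch (synthetic coordinates and target, all names are placeholders) of a row-normalized spatial lag, the basic building block of a spatial autoregressive model:
```
import numpy as np

# Synthetic example: n points with 2-D coordinates and an observed target.
rng = np.random.default_rng(0)
n = 100
coords = rng.uniform(size=(n, 2))          # placeholder spatial coordinates
y = rng.normal(size=n)                     # placeholder target values

# Inverse-distance spatial weights, zero on the diagonal, row-normalized.
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
W = 1.0 / (d + np.eye(n))                  # eye avoids division by zero on the diagonal
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

# The spatial lag W @ y is the neighbourhood-weighted average of the target,
# which a spatial autoregressive model uses as an extra predictor.
spatial_lag = W @ y
print(spatial_lag[:5])
```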
-
Calling `ar_log_likelihood` on `x` produces values that are > 0, but log probabilities should be bounded above by 0.
```
import numpy as np

x = np.copy(results[key]['latent_state'])
print(x.shape)
Ab = np.copy(c…
-
Hi all,
thank you for developing this time-series Python library.
I want to forecast with freq=MS.
I set up a NeuralProphet model as sketched below. My data has monthly frequency (2020-01-01, 2020-02-01…
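A minimal sketch of the setup (synthetic monthly data with the `ds`/`y` columns; my real data and settings differ slightly):
```
import pandas as pd
from neuralprophet import NeuralProphet

# Synthetic monthly data in the 'ds'/'y' format NeuralProphet expects.
df = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=36, freq="MS"),
    "y": [float(i) for i in range(36)],
})

m = NeuralProphet()                               # default settings
metrics = m.fit(df, freq="MS")                    # monthly-start frequency
future = m.make_future_dataframe(df, periods=12)  # forecast 12 months ahead
forecast = m.predict(future)
print(forecast[["ds", "yhat1"]].tail())
```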
-
According to Sklearn's documentation:
> Least-angle regression (LARS) is a regression algorithm for high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshir…
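For context, a minimal sketch of how this looks with scikit-learn's `Lars` estimator (synthetic high-dimensional data, default parameters):
```
import numpy as np
from sklearn.linear_model import Lars

# Synthetic high-dimensional data: more features than samples, few truly active.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))
coef = np.zeros(200)
coef[:5] = [3.0, -2.0, 1.5, 0.5, -1.0]
y = X @ coef + 0.1 * rng.normal(size=50)

reg = Lars().fit(X, y)
print("non-zero coefficients:", int(np.sum(reg.coef_ != 0)))
```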
-
**Submitting author:** @pat-alt (Patrick Altmeyer)
**Repository:** https://github.com/pat-alt/CounterfactualExplanations.jl
**Branch with paper.md** (empty if default branch):
**Version:** v0.1.14
**…
-
Hi,
Thanks for the amazing work on streaming-llm. While reading the paper, a question came to mind: why does applying an "attention sink" also work for models with ALiBi position embeddings?
One o…
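To make the question concrete, here is a toy sketch of the pattern I am picturing (my own code, not from the repo): keep a few sink tokens plus a recent window, and add ALiBi's linear distance penalty on top.
```
import torch

seq_len, n_sink, window = 12, 4, 4
slope = 0.5                                   # ALiBi slope for a single head (illustrative)

q_pos = torch.arange(seq_len).unsqueeze(1)    # query positions
k_pos = torch.arange(seq_len).unsqueeze(0)    # key positions

causal    = k_pos <= q_pos
in_sink   = k_pos < n_sink
in_window = (q_pos - k_pos) < window
keep = causal & (in_sink | in_window)         # StreamingLLM-style keep pattern

alibi_bias = -slope * (q_pos - k_pos).clamp(min=0)    # linear penalty on key distance
scores = torch.randn(seq_len, seq_len)                # stand-in for q @ k^T / sqrt(d)
scores = scores + alibi_bias
scores = scores.masked_fill(~keep, float("-inf"))
attn = scores.softmax(dim=-1)
print(attn[-1])                               # the last query still attends to the sink tokens
```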
-
The `dataloader.get_next` function does not have the parameter `sample_interval`.
@Shiduo-zh
https://github.com/Tsinghua-MARS-Lab/transformer4planning/blame/fb556ef8f03a32e7ff3f616674509298248518c4/…
-
Hi there,
I understand that autoregression outputs words one at a time. From some manual benchmarking, our deployment produces 50 English words in 6 seconds. Is there a way to optimize this? We plan to us…
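In case it helps, this is roughly how I am measuring throughput, assuming a Hugging Face transformers deployment (our actual stack may differ; the model name and prompt here are placeholders):
```
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"                                  # placeholder for our actual model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

inputs = tok("Hello, my question is", return_tensors="pt").to("cuda")
start = time.time()
out = model.generate(**inputs, max_new_tokens=64, use_cache=True)   # KV cache enabled
elapsed = time.time() - start
new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```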
-
@lucidrains
This is an issue I've been having for a while: the cross-attention is very weak at the start of the sequence.
When the transformer starts with no tokens, it will rely on the cross-attention, but un…
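To illustrate what I mean, a toy sketch (my own code, not the library's internals): at the very first step the decoder has only a start token, so self-attention can only look at that single token and all external signal has to come through cross-attention.
```
import torch
import torch.nn as nn

d_model, ctx_len = 32, 10
self_attn  = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

start_tok = torch.randn(1, 1, d_model)         # the only decoder token at step 0
context   = torch.randn(1, ctx_len, d_model)   # encoder / conditioning sequence

self_out, _ = self_attn(start_tok, start_tok, start_tok)      # can only attend to itself
cross_out, cross_w = cross_attn(start_tok, context, context)  # all external signal is here
print(cross_w.shape)                           # (1, 1, ctx_len)
```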