https://paperswithcode.com/paper/very-deep-vaes-generalize-autoregressive-1
XLNet is arguably the state-of-the-art language model, and it is autoregressive. I wonder whether the observation that very deep VAEs can generalize and outperform autoregressive models in computer vision carries over to language models.
@zihangdai
I am posting this here rather than on the XLNet repository because you are not active there.