awslabs / gluonts

Probabilistic time series modeling in Python
https://ts.gluon.ai
Apache License 2.0

Blocked at implementing torch port of mx/distributions/mixture.py #3160

Open pbruneau opened 2 months ago

pbruneau commented 2 months ago

Description

I am currently trying to port gluonts/mx/distributions/mixture.py to torch, aiming to use the ported MixtureDistributionOutput with NormalOutput components and DeepAR. I came up with the following implementation: https://gist.github.com/pbruneau/3c9d62f694c50ead8da7adf50014d13a

Basically, I focused on implementing the methods that looked essential in the context of DeepAR. My debugging sessions (a mixture of 3 NormalOutput components, on private data with 3 dynamic real features) seem to show that it works correctly (i.e., the parameters associated with the 3 components seem to fit independently). However, performance is the problem: on a (private) benchmark where the MXNet version of DeepAR gets a significant boost from 3 Gaussian output components versus a single component, my PyTorch version is only on par with a single Gaussian. In other words, I currently do no better than using NormalOutput() as my output distribution.
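For what it's worth, torch ships a built-in mixture wrapper that could replace a hand-rolled mixture class. A minimal sketch (the names raw_logits/raw_loc/raw_scale are illustrative, not the GluonTS API) of a 3-component Gaussian mixture built from unconstrained network outputs:

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, Normal

# Minimal sketch, not the GluonTS API: build a k-component Gaussian
# mixture from raw (unconstrained) network outputs using torch's
# built-in MixtureSameFamily. The raw_* names are illustrative.
def make_mixture(raw_logits, raw_loc, raw_scale):
    # all inputs have shape (..., k); softplus keeps scales positive
    mix = Categorical(logits=raw_logits)
    comp = Normal(loc=raw_loc, scale=torch.nn.functional.softplus(raw_scale))
    return MixtureSameFamily(mix, comp)

# toy usage: batch of 2 series, k = 3 components
logits = torch.zeros(2, 3)                  # uniform mixture weights
loc = torch.tensor([[-1.0, 0.0, 1.0]] * 2)
raw_scale = torch.zeros(2, 3)               # softplus(0) ~ 0.693
d = make_mixture(logits, loc, raw_scale)
x = torch.zeros(2)
ll = d.log_prob(x)                          # per-series log-likelihood, shape (2,)
```

MixtureSameFamily handles the log-sum-exp over components internally, so the mixture log-likelihood is numerically stable out of the box.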

I'm quite at a loss as to where to investigate next. Iso-debugging the MXNet and PyTorch versions of DeepAR is not trivial at all, as they operate quite differently. Any suggestion would help: an important feature or method I might be missing, a blatant mistake, where to look next, or ways to iso-debug the two versions, possibly on an appropriate public benchmark. If I can get this on rails, I would be more than happy to open a pull request for it.
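One cheap sanity check that is independent of DeepAR and of any private data (a sketch, not a substitute for iso-debugging the models): on clearly bimodal samples, a correctly implemented mixture log-likelihood evaluated at the true parameters must beat the best-fitting single Gaussian. If a ported mixture ever loses this comparison, the bug is in the distribution code rather than in the model around it.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, Normal

torch.manual_seed(0)

# Bimodal data: two well-separated unit-variance modes at -3 and +3
x = torch.cat([torch.randn(5000) - 3.0, torch.randn(5000) + 3.0])

# Best single Gaussian by moment matching
single = Normal(x.mean(), x.std())

# Two-component mixture at the true generating parameters
mix = MixtureSameFamily(
    Categorical(probs=torch.tensor([0.5, 0.5])),
    Normal(loc=torch.tensor([-3.0, 3.0]), scale=torch.ones(2)),
)

nll_single = -single.log_prob(x).mean()
nll_mix = -mix.log_prob(x).mean()
# the mixture should have a clearly lower NLL on bimodal data
```

The same comparison can be run against the MXNet mixture on identical data to localize where the two ports diverge.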

kashif commented 2 months ago

Have a look at: https://github.com/zalandoresearch/pytorch-ts/blob/master/pts/modules/distribution_output.py#L238
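The linked module follows GluonTS's DistributionOutput pattern, where a flat parameter tensor produced by the network's projection head is split and mapped into the constrained parameter space of the distribution. A self-contained sketch of that splitting step (a hypothetical mimic, not pts or GluonTS code), for k Gaussian components needing 3k raw values per time step:

```python
import torch

def mixture_domain_map(theta, k):
    # Hypothetical helper, not the pts/GluonTS API: split a flat
    # (..., 3*k) tensor into mixture logits, means, and positive scales.
    raw_logits, loc, raw_scale = theta.split(k, dim=-1)
    scale = torch.nn.functional.softplus(raw_scale)  # enforce scale > 0
    return raw_logits, loc, scale

theta = torch.randn(2, 7, 9)          # (batch, time, 3*k) with k = 3
logits, loc, scale = mixture_domain_map(theta, k=3)
# each output has shape (2, 7, 3); scale is strictly positive
```

Getting this mapping wrong (e.g., scales that can collapse toward zero, or logits accidentally shared across components) is a common reason a mixture trains but degenerates to a single effective component.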

pbruneau commented 1 month ago

Thanks @kashif, and sorry for not replying sooner; I have been absorbed by other matters recently. I'll have a look and report back very soon, I hope!