SeldonIO / alibi-detect

Algorithms for outlier, adversarial and drift detection
https://docs.seldon.io/projects/alibi-detect/en/stable/
Other
2.25k stars 225 forks source link

Remove `scale_identity_multiplier` to support `tensorflow-probability 0.20` #828

Closed ascillitoe closed 1 year ago

ascillitoe commented 1 year ago

In tensorflow-probability v0.20 the scale_identity_multiplier kwarg given to MultivariateNormalDiag is removed, meaning the covariance matrix must be specified via scale_diag (see https://github.com/tensorflow/probability/commit/ca8485023767fc021e277ca1a987449d90f21cea)

This PR modifies our elbo loss function, so that scale_diag is constructed from the sim kwarg, rather than passing it directly to scale_identity_multiplier. Currently, only one of cov_full, cov_diag or sim should be given to elbo.

Breaking change: This currently involves a breaking change since the default value of sim in elbo has been changed from sim=0.05 to sim=None. A possible alternative would be to have sim=0.05 as the default, and then set sim=None under-the-hood when either cov_diag or cov_full are given (i.e. not None).

ascillitoe commented 1 year ago

Sanity check on od_vae_cifar10.ipynb. VAE training history with alibi-detect==0.11.3 and tensorflow-probability==0.19:

782/782 [=] - 68s 77ms/step - loss_ma: 8641.5905
782/782 [=] - 59s 75ms/step - loss_ma: -2348.6304
782/782 [=] - 57s 72ms/step - loss_ma: -3497.9333
782/782 [=] - 57s 72ms/step - loss_ma: -4035.8458
505/782 [.] - ETA: 20s - loss_ma: -4364.4066

History with this PR:

782/782 [=] - 66s 68ms/step - loss_ma: 9755.7217
782/782 [=] - 52s 67ms/step - loss_ma: -2190.6944
782/782 [=] - 52s 66ms/step - loss_ma: -3428.9409
782/782 [=] - 52s 67ms/step - loss_ma: -3992.8471
478/782 [.] - ETA: 19s - loss_ma: -4276.0701
codecov[bot] commented 1 year ago

Codecov Report

Merging #828 (5337d4b) into master (9256aac) will increase coverage by 0.68%. The diff coverage is 100.00%.

:exclamation: Current head 5337d4b differs from pull request most recent head 4379868. Consider uploading reports for the commit 4379868 to get more accurate results

Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/SeldonIO/alibi-detect/pull/828/graphs/tree.svg?width=650&height=150&src=pr&token=gwntCwhaGT&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=SeldonIO)](https://app.codecov.io/gh/SeldonIO/alibi-detect/pull/828?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=SeldonIO) ```diff @@ Coverage Diff @@ ## master #828 +/- ## ========================================== + Coverage 81.16% 81.85% +0.68% ========================================== Files 148 159 +11 Lines 9834 10321 +487 ========================================== + Hits 7982 8448 +466 - Misses 1852 1873 +21 ``` | [Impacted Files](https://app.codecov.io/gh/SeldonIO/alibi-detect/pull/828?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=SeldonIO) | Coverage Δ | | |---|---|---| | [alibi\_detect/models/tensorflow/losses.py](https://app.codecov.io/gh/SeldonIO/alibi-detect/pull/828?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=SeldonIO#diff-YWxpYmlfZGV0ZWN0L21vZGVscy90ZW5zb3JmbG93L2xvc3Nlcy5weQ==) | `82.08% <100.00%> (+1.13%)` | :arrow_up: | ... and [29 files with indirect coverage changes](https://app.codecov.io/gh/SeldonIO/alibi-detect/pull/828/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=SeldonIO)
jklaise commented 1 year ago

I think it's fine to have this change be breaking - it would be more unexpected behaviour if we kept it to 0.05 but then implicitly would disregard it if another parameter is passed, I like the current validation of only one parameter being set. Plus I think the actual elbo loss is not in our public API, in which case we could make it "not breaking" by just updating our library callsites to set sim=0.05? Not sure what the trade-off is, would suggest checking with the person who first implemented the default args.

Would be good to check all call-sites in the library and on the docs that utilize the elbo loss to check that everything works as expected with the now changed default if we do go with the change.

ascillitoe commented 1 year ago

Thanks @jklaise. Re our library callsites for elbo, we set it's kwarg's in the vae outlier detectors, e.g.

https://github.com/SeldonIO/alibi-detect/blob/8b2dcbb5c1bfe1a29db52e42c4511e9c4937ed8a/alibi_detect/od/vae.py#L81

But we're already setting sim=0.05 as you describe, so I think we're OK. Since @arnaudvl implemented the loss function I'm tagging him as a reviewer to confirm.