april-tools / cirkit

a python framework to build, learn and reason about probabilistic circuits and tensor networks
https://cirkit-docs.readthedocs.io/en/latest/
GNU General Public License v3.0
71 stars 1 forks source link

Fix errors when considering an odd number of variables #85

Closed loreloc closed 1 year ago

loreloc commented 1 year ago

Closes #83 .

loreloc commented 1 year ago

Any idea why I get NaN as output here? (see Pylint report)

lkct commented 1 year ago

Any idea why I get NaN as output here? (see Pytest report)

strange. seems to happen by chance. maybe take a look at reproducibility issues?

loreloc commented 1 year ago

Any idea why I get NaN as output here? (see Pytest report)

strange. seems to happen by chance. maybe take a look at reproducibility issues?

How's that possible? We set a seed for the tests. Are we using torch.compile somewhere yet?

lkct commented 1 year ago

How's that possible? We set a seed for the tests. Are we using torch.compile somewhere yet?

See benchmark code. Seeding alone cannot guarantee full reproducibility. However I'm not sure what caused the problem in this case. Maybe inspect with pytorch's anomaly detection?

loreloc commented 1 year ago

How's that possible? We set a seed for the tests. Are we using torch.compile somewhere yet?

See benchmark code. Seeding alone cannot guarantee full reproducibility. However I'm not sure what caused the problem in this case. Maybe inspect with pytorch's anomaly detection?

We should use deterministic mode in tests and perhaps do them in no grad mode if we are not testing gradients. Still, having NaN should not happen even in non-deterministic torch mode.

I still get NaN when using deterministic mode.

loreloc commented 1 year ago

HEAD detached at 4b355a5 (#64) It works. HEAD detached at acbbce0 (#69) It works. HEAD detached at f312558 (#74) It works. HEAD detached at fb1ef41 (#78) It does not work (getting NaNs). Removing the padding for the mixing layer (in which cases do we need it?) resolves the issue.

lkct commented 1 year ago

https://github.com/april-tools/cirkit/blob/df0997428d7164f343b5582c0d7c532d30d093e2/cirkit/models/einet.py#L317-L320

This padding breaks the bookkeeping by changing the shape of the output. This dependence was introduced in #78.

For now we can only comment out this padding. Please change the TODO to state this change of shape breaks bookkeeping and we need to find another way to implement it.