Fix formula expansion non-determinism

J08nY commented 3 weeks ago

Something is causing the expanded formulas to vary between runs, machines, Python interpreters, or something else. The number of expanded formulas varies as well.

Examine the equality implementation on formulas.
Examine the use of sets.
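
A quick way to see why sets are a suspect: in CPython, `str` hashes are randomized per process unless `PYTHONHASHSEED` is fixed, so the iteration order of a set of strings can differ between runs. The sketch below is only an illustration; the operation names in the set are made up and nothing here is pyecsca code.

```python
import ast
import os
import subprocess
import sys

# Child program: print a set of strings as a list, in set iteration order.
SNIPPET = "print(list({'add', 'dbl', 'neg', 'scl', 'tpl', 'ladd'}))"


def order_with_seed(seed):
    """Run SNIPPET in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints a list literal; parse it back into a list.
    return ast.literal_eval(result.stdout.strip())


# Same elements every time, but the order may change with the seed.
orders = {seed: order_with_seed(seed) for seed in range(5)}
for seed, order in orders.items():
    print(seed, order)
```

With a fixed seed the order is reproducible, which makes `PYTHONHASHSEED` handy for reproducing an order-dependent bug, though it does not remove the dependence on iteration order itself.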

J08nY commented 3 weeks ago

Part of the issue is https://github.com/J08nY/pyecsca/blob/master/pyecsca/ec/formula/expand.py#L18: if two (or more) formulas have equal IV sets (which is our metric), we keep just one, but which one we keep depends on the iteration order over a set, so it is implementation dependent. This does not really matter with regard to the metric, but it does change which instance of the formula we keep. That becomes a problem because the formula has a name, which may differ if the formula was derived in a different way, and the formula hashing and equality checks do use the name. Thus sets will consider the two instances unequal. I guess that may happen anyway, because their code attribute may be different even though the IV sets are the same.
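
A minimal sketch of making that choice order-independent, assuming a simplified stand-in for the formula class. `ToyFormula`, `dedup_by_ivs`, and the example names are hypothetical and are not the pyecsca API; the idea is just to break ties on a stable key instead of on whatever the set yields first.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class ToyFormula:
    # Hypothetical stand-in, not the pyecsca Formula class.
    # Equality and hashing include the name, as described above.
    name: str
    code: Tuple[str, ...]
    ivs: FrozenSet[str]  # the "IV set" used as the dedup metric


def dedup_by_ivs(formulas: Iterable[ToyFormula]) -> List[ToyFormula]:
    """Keep one formula per IV set, chosen by a stable tie-break."""
    best: Dict[FrozenSet[str], ToyFormula] = {}
    for f in formulas:
        current = best.get(f.ivs)
        # Tie-break on (name, code) so the survivor does not depend on
        # the order in which the input happens to be iterated.
        if current is None or (f.name, f.code) < (current.name, current.code):
            best[f.ivs] = f
    # Sort the output too, so downstream iteration is reproducible.
    return sorted(best.values(), key=lambda f: (f.name, f.code))


a = ToyFormula("base[switch]", ("t0 = X1*X2",), frozenset({"X1*X2"}))
b = ToyFormula("base[flip]", ("t0 = X2*X1",), frozenset({"X1*X2"}))

# Same survivor regardless of input order.
kept_ab = dedup_by_ivs([a, b])
kept_ba = dedup_by_ivs([b, a])
```

Any total order on formulas would do here; the lexicographic `(name, code)` key is just the simplest one available in this toy model.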

Just a note.

J08nY commented 3 weeks ago

Another observed fact: the same formula name in two runs does not mean the formulas are structurally equal. I thought that was the case, because the name reflects the history of how the formula was derived.

J08nY commented 3 weeks ago

Another observed issue: one run gave me a set of formulas that had a collision on the IV norm, that is, two formulas in it had the same IV set, which should not happen.

Edit: This was two EFD formulas.

J08nY commented 3 weeks ago

We always keep the EFD formulas in the set when expanding, even when they have the same IV set. Together with the unguaranteed order of expansion, this may lead to the same formula being derived from two different base EFD formulas (and thus getting different names).
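
A toy reproduction of that effect, plus one possible way to avoid it. This is a sketch under assumptions: `Toy`, `normalize`, and `expand` are hypothetical, and this is not necessarily what the actual fix does. The idea is to visit the bases in a fixed order and key derived formulas on their structure rather than on their name.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Toy:
    # Hypothetical stand-in for a formula; equality includes the name.
    name: str
    ops: Tuple[str, ...]


def normalize(f: Toy) -> Toy:
    """Hypothetical expansion step: sort the operands of each product."""
    new_ops = tuple("*".join(sorted(op.split("*"))) for op in f.ops)
    return Toy(f.name + "|norm", new_ops)


base_a = Toy("efd-a", ("X1*X2", "Y1*Y2"))
base_b = Toy("efd-b", ("X2*X1", "Y1*Y2"))

# Both bases lead to the same operations, but the derived names differ,
# so a plain set keeps both copies.
derived = {normalize(base_a), normalize(base_b)}


def expand(bases: Iterable[Toy]) -> List[Toy]:
    """Expand in a fixed base order and deduplicate on structure."""
    seen: Dict[Tuple[str, ...], Toy] = {}
    for base in sorted(bases, key=lambda f: f.name):
        d = normalize(base)
        seen.setdefault(d.ops, d)  # first base in sorted order wins
    return list(seen.values())


stable = expand({base_a, base_b})
```

Here `derived` holds two entries with identical `ops`, while `expand` returns a single formula whose name always comes from the base that sorts first.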

J08nY commented 3 weeks ago

Fixed by https://github.com/J08nY/pyecsca/pull/68.

J08nY / pyecsca-artifact

Fix formula expansion non-determinism #2