AdaCore / RecordFlux

Formal specification and generation of verifiable binary parsers, message generators and protocol state machines
Apache License 2.0
104 stars 6 forks source link

Refactoring of expressions #1287

Open jklmnn opened 1 year ago

jklmnn commented 1 year ago

Currently our internal expression have two different conversion functions for Z3 and Ada expressions. To reduce the redundancy and improve adaption for further target languages this mechanism should be generic. The idea is to create a base class for language specific expressions that provided methods to create native language expressions:

class LangExpr:

    @classmethod
    @abc.abstractmethod
    def create_add(cls, items...) -> LangAdd:
        raise NotImplementedError

class LangAdd(LangExpr):
...

class AdaExpr(LangExpr):

    @classmethod
    def create_add(cls, items...) -> AdaAdd:
       ...
...

class AdaAdd(AdaExpr, LangAdd):
...

The conversion methods in Expr would then be replaced with a method that takes the desired class type and returns an expression of that type:


class Add(Expr):
    def to_lang_expr(self, lang_expr: type(LangExpr)) -> LangAdd:
        return lang_expr.create_add(self.items)

This approach may be subject to change while its feasibility is investigated.

treiher commented 1 year ago

How would you ensure, for example, that an AdaExpr is only composed of AdaExpr, if to_lang_expr just returns a generic LangAdd? I think what you would need are dependent types, so that the return type of to_lang_expr depends on the type of lang_expr parameter. IMHO the only way to achieve that in Python is using @overload, which means you would again need to define a function for each language.

jklmnn commented 1 year ago

With the exception of Z3 expressions the idea is that the returned concrete type is never accessed.LangAdd will provide an abstract method for code generation that can be called by the respective code generator. I'm aware that we may need different code generators for different languages but the idea is to keep them as generic as possible. E.g. the code that generates a check if a value is even could look as follows:

def generate_check(expr: expr.Name, lang: type(LangExpr)) -> str:
    lang_expr = Equal(Mod(expr, Number(2), Number(0)).to_lang_expr(lang)
    return str(lang_expr)

Depending on lang this code would emit either x mod 2 = 0 or x % 2 == 0 without the need to know which language the returned expression is. This of course requires a consistent usage of the type that is passed to these kinds of functions.

In case of recursive calls of different to_lang_expr functions the same lang_expr object would be used ensuring equal types.

Z3 is a bit of a special case here as we intend to use the expressions directly which would require at least assertions to get the correct types for mypy.

treiher commented 1 year ago

In case of recursive calls of different to_lang_expr functions the same lang_expr object would be used ensuring equal types.

So you propose to remove the restriction that expressions can only be constructed of expressions of the same language?

Currently, an Ada expression can only be constructed of Ada expressions. In your example, the constructor would look as follows:

class AdaAdd(AdaExpr):
    def __init__(self, *terms: AdaExpr) -> None:
        super().__init__()
        self.terms = list(terms)

To be able to recursively convert a RecordFlux expression into an AdaExpr using a to_lang_expr(self, lang_expr: type(LangExpr)) -> LangExpr would require to change the constructor as follows:

class AdaAdd(AdaExpr):
    def __init__(self, *terms: LangExpr) -> None:
        super().__init__()
        self.terms = list(terms)

So an AdaAdd could contain terms of the types FooNumber and BarAdd. I think being able to accidentally mix expressions of different languages would be a significant disadvantage.

jklmnn commented 1 year ago

I missed a detail here. AdaExpr inherits from LangExpr as it overrides create_add. to_lang_add does not call LangAdd() directly but calls create_add which will be dispatched. As create_add is implemented by AdaExpr it won't be able to call a constructor of a different implementation (unless deliberately importing and calling it, but that would then be caught by mypy as the items are AdaExpr too).

The only thing I'm not completely sure about is the implication of AdaAdd inheriting from both LangAdd and AdaExpr as both inherit from LangExpr.