open-formulieren / open-forms

Smart and dynamic forms
https://open-forms.readthedocs.io

Normalize (variable) data types #2324

Open sergei-maertens opened 1 year ago

sergei-maertens commented 1 year ago

See also #2251.

See also #2305 (using local type information instead of requiring formio definitions would be very useful).

In short: we need to ensure that all (input) data has the correct type for processing in Python. When calculations are done with json-logic, we need to serialize back to JSON types, and all data needs to be properly normalized.

We have essentially the following flow of information:


                                             +------------------+
+------------+      +----------------+       | Logic evaluation |____\ - output data (JSON)
| Input data |____\ | backend        |____\  | (JSON)           |    / - updated variable values (python/JSON)
| (JSON)     |    / | (python types) |    /  +------------------+
+------------+      +----------------+        ^
                                              |
                    +----------------+        |
                    | logic rules    |________+
                    | (JSON)         |
                    +----------------+

The 'problem' with JSON is that it only has a small number of primitives, which have to stand in for richer Python types:

This is further complicated with the formio component types and the notion of single/multiple values (array vs. primitive).

Using the python datatypes

Using only JSON types (complex and primitive) is not sufficient, because we cannot do smart operations on them. We must support (non-exhaustive list) the following operations:

Identified boundaries

We can identify "our own code" as the system boundary. This implies:

note: 'as JSON' implies the result of json.loads(...) here, so we have python dicts/lists/strings/ints/floats/NoneType.
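To make the note concrete, here is a minimal sketch of what json.loads(...) hands back. The payload and its keys are made up for illustration and are not taken from the Open Forms code base.

```python
import json

# Hypothetical submission payload; the keys are illustrative only.
payload = (
    '{"name": "Ada", "age": 36, "score": 9.5, "tags": ["a"], '
    '"partner": null, "when": "2022-11-08T14:12:00+01:00"}'
)

data = json.loads(payload)

# Map every key to the name of the Python type json.loads produced.
types = {key: type(value).__name__ for key, value in data.items()}
print(types)
```

Note that "when" comes back as a plain str: JSON has no date or datetime primitive, so any richer type has to be recovered from type information outside the payload.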

Handling JSON logic

The JSON logic library essentially operates on JSON primitives (or complex objects) and we should deal with that. This is particularly challenging when comparing datetimes (or dates and datetimes), for example:

2022-11-08T14:12:00+01:00 is equal to 2022-11-08T13:12:00+00:00 and 2022-11-08T13:12:00Z - but simple string operations will not give the same result.
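A small sketch of the difference, using only the standard library. The parse helper is an illustrative name, not something from the code base.

```python
from datetime import datetime

# The three representations from the example above.
raw = [
    "2022-11-08T14:12:00+01:00",
    "2022-11-08T13:12:00+00:00",
    "2022-11-08T13:12:00Z",
]


def parse(value: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11
    # onwards, so rewrite it to an explicit offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


parsed = [parse(value) for value in raw]

# Comparing the serialized strings: all three differ.
strings_equal = raw[0] == raw[1] == raw[2]

# Comparing timezone-aware datetimes: they are the same instant.
datetimes_equal = parsed[0] == parsed[1] == parsed[2]

print(strings_equal, datetimes_equal)
```

Timezone-aware datetime objects compare by the instant they represent, which is why the parsed values are equal while the strings are not.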

We need to normalize JSON logic expressions with the available type information at save time so that runtime is as simple as possible:

A conclusion may be that we need to pass python-objects (datetimes) down to json logic rather than just serialized versions.

Tasks taken from refinement

SilviaAmAm commented 1 year ago

Additional thoughts: At the moment we have:

Before my PR, we were using python data for the json logic. If the PR is merged we would have:

SilviaAmAm commented 1 year ago

Additional example: dealing with currency / number component

sergei-maertens commented 1 year ago

blocked until we get the go-ahead for this

SilviaAmAm commented 1 year ago

Another issue related to types: https://github.com/open-formulieren/open-forms/issues/2707

joeribekker commented 1 year ago

The work on this cannot block releasing intermediate versions. So it needs to be done outside of master or behind a feature flag.

sergei-maertens commented 1 year ago

Chris mentioned that we can essentially do some type inference and "rewrite"/compile/transpile JSON logic expressions into equivalents that can be evaluated on both backend and frontend.

E.g. a string datetime + relative delta -> convert to unix timestamp (number) + delta (number) & the result can then be compared in terms of primitives.
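A rough sketch of that rewrite idea, assuming the delta is a fixed duration. The helper names and the example rule are hypothetical, and the final comparison is done in plain Python rather than through a JSON logic library.

```python
from datetime import datetime, timedelta


def to_timestamp(value: str) -> float:
    """Convert an ISO 8601 datetime string with UTC offset to a unix timestamp."""
    return datetime.fromisoformat(value.replace("Z", "+00:00")).timestamp()


def delta_seconds(**kwargs) -> float:
    """Convert a fixed-length delta (days, hours, ...) to a number of seconds."""
    return timedelta(**kwargs).total_seconds()


# Hypothetical rule: "the appointment is more than 1 day after submission".
submitted = to_timestamp("2022-11-08T14:12:00+01:00")
appointment = to_timestamp("2022-11-10T09:00:00+01:00")

# The rewritten expression contains only numbers, so any evaluator that can
# add and compare numbers (backend or frontend) can handle it.
expression = {">": [appointment, {"+": [submitted, delta_seconds(days=1)]}]}

# Equivalent check in plain Python, standing in for the JSON logic evaluation.
result = appointment > submitted + delta_seconds(days=1)
print(result)
```

One caveat with this sketch: deltas expressed in months or years do not have a fixed length in seconds, so they would need calendar-aware handling before being reduced to a number.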

joeribekker commented 1 year ago

To apply logic:

data (1) -> convert (2) -> preprocess (3) -> jsonlogic (4) -> convert (5) -> result (6)

To load data from the database / load data from the user submission:

data (1) -> convert (2) -> result (6)

Where:
  1. JSON
  2. Convert JSON to Python data, using datatype metadata
  3. Convert Python to JSON suitable for JsonLogic
  4. Execute JSON-logic with logic rules
  5. Convert JSON result from JSON-logic to Python data, using datatype metadata
  6. Use Python data in templates, component labels, API serializers
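The numbered steps could be sketched roughly as follows. Everything here is an assumption for illustration: the DATA_TYPES metadata, the converter tables and the evaluate callable (a stand-in for the real JSON logic evaluation) are hypothetical names, not the Open Forms implementation.

```python
from datetime import date
from decimal import Decimal
from typing import Any, Callable

# Hypothetical datatype metadata: variable key -> type name.
DATA_TYPES = {"birthdate": "date", "amount": "decimal", "name": "string"}

# JSON value -> Python value, per type name (steps 2 and 5).
TO_PYTHON: dict[str, Callable[[Any], Any]] = {
    "date": date.fromisoformat,
    "decimal": lambda value: Decimal(str(value)),
    "string": str,
}

# Python value -> JSON value suitable for JSON logic (step 3).
TO_JSON: dict[str, Callable[[Any], Any]] = {
    "date": date.isoformat,
    "decimal": float,
    "string": str,
}


def convert(data: dict, converters: dict) -> dict:
    """Convert every value using the converter registered for its datatype."""
    return {key: converters[DATA_TYPES[key]](value) for key, value in data.items()}


def apply_logic(raw: dict, evaluate: Callable[[dict], dict]) -> dict:
    python_data = convert(raw, TO_PYTHON)        # (2) JSON -> Python
    logic_input = convert(python_data, TO_JSON)  # (3) Python -> JSON for JSON logic
    logic_output = evaluate(logic_input)         # (4) run the logic rules
    return convert(logic_output, TO_PYTHON)      # (5) JSON -> Python, giving (6)


def fake_rules(data: dict) -> dict:
    # Stand-in for JSON logic evaluation: doubles the amount.
    return {**data, "amount": data["amount"] * 2}


result = apply_logic(
    {"birthdate": "2000-01-31", "amount": 12.5, "name": "Ada"}, fake_rules
)
print(result)
```

The shorter "load data" flow is the same sketch with only the first convert call: convert(raw, TO_PYTHON).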

Example:

joeribekker commented 1 year ago

Refinement: We should create concrete things to do (= issues), with this ticket as the main epic.