mulesoft-labs / data-weave-rfc

RFC for the data weave language

Introducing source compatibility checks #43

Open manikmagar opened 2 years ago

manikmagar commented 2 years ago

The following is the usual script structure for DataWeave 2. The first line declares the DataWeave version. AFAIK, it accepts any valid 2.x declaration, where x is a minor DataWeave version, e.g. 2.0, 2.2, etc.

%dw 2.0
output application/json
---
{ }

Now consider the following script, which uses the dw::Core::onNull function introduced in version 2.4:

%dw 2.0
output application/json
---
null onNull "DataWeave"

A Mule application using the above script works fine when deployed to Mule 4.4 or later, even though its header declares version 2.0.

The concern is that there is no way to enforce source compatibility checks, similar to the Java source compatibility checks in Maven.

I would like to propose a version check for source compatibility that helps developers avoid functions introduced in newer versions unless they explicitly opt in by upgrading the declared version.

So the following script should fail compilation, or fail in some similar way that catches the problem early:

%dw 2.3   // Explicit declaration of DataWeave version 2.3
output application/json
---
null onNull "DataWeave"

This prevents developers from accidentally using functions that were introduced in 2.4 unless they knowingly change the version in the header. A code reviewer can also catch the version upgrade and has an opportunity to ask why the change was made and whether appropriate runtimes are available in the target environment.

One challenge is backward compatibility for most (if not all) DataWeave 2 scripts ever written. They all have %dw 2.0 as the autogenerated/default header, so enforcing source compatibility might break every DataWeave 2 script.

As an incremental feature, 2.0 can be treated as a 2.* wildcard, so all existing scripts continue to work. If any version above 2.0 is declared, source compatibility is enforced. In that case, the above example declaring 2.3 should result in a compilation error.
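The incremental rule above can be sketched as a small check. This is only an illustration in Python, not how the DataWeave compiler works: `INTRODUCED_IN`, `declared_version` and `check_script` are hypothetical names, the only table entry comes from the onNull example in this issue, and the function lookup is a naive word search rather than a real parse.

```python
import re

# Hypothetical table: the DataWeave version in which each function first appeared.
# Only onNull (2.4) is taken from this issue; a real table would come from the compiler.
INTRODUCED_IN = {"onNull": (2, 4)}

HEADER = re.compile(r"^%dw\s+(\d+)\.(\d+)")


def declared_version(script):
    """Return the (major, minor) version from the %dw header line."""
    match = HEADER.match(script.lstrip())
    if match is None:
        raise ValueError("missing %dw header")
    return int(match.group(1)), int(match.group(2))


def check_script(script):
    """Return the functions that are newer than the declared version.

    A declared version of 2.0 is treated as the 2.* wildcard, so existing
    scripts are never rejected.
    """
    version = declared_version(script)
    if version == (2, 0):
        return []
    # Naive assumption: everything after the first --- separator is the body.
    body = script.split("---", 1)[-1]
    return [
        name
        for name, since in INTRODUCED_IN.items()
        if re.search(rf"\b{re.escape(name)}\b", body) and since > version
    ]


legacy = '%dw 2.0\noutput application/json\n---\nnull onNull "DataWeave"'
pinned = '%dw 2.3\noutput application/json\n---\nnull onNull "DataWeave"'

print(check_script(legacy))  # [] because 2.0 acts as a wildcard
print(check_script(pinned))  # ['onNull'] because onNull needs 2.4
```

Returning the offending names instead of raising lets a build tool decide whether to fail or only warn.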

As a best practice, teams can then start enforcing a review rule that every new or modified script declares an explicit DW version.

machaval commented 2 years ago

Hi @manikmagar, thank you so much for raising this problem. We have been working on it for several months, trying to arrive at a proper solution.

Details

We have been approaching the problem by dividing it into three parts.

Language Level

As you mention, there is the version you want to code against; we call this the "language level". It represents a visibility constraint: it describes what is and what is not visible from your program. This covers not just functions but any new feature, such as the update operator. The language level doesn't operate at the file level but at the project level, because if you have a mapping you also need to make sure that all its dependencies are under the same language level. And if you want to upgrade to a new version, you don't want to go file by file.

Syntax version

The syntax version determines which parser is used. A parser takes a String as input and outputs an AST. The AST is the internal representation of the code, so once the String has been transformed into an AST, the syntax is no longer needed. This gives us a way to support different syntaxes (a breaking change) on the same runtime and make them interoperable. That way we can evolve the syntax without breaking anyone's code, and keep moving toward a modern syntax or fix things we weren't able to fix in the past.
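A toy sketch of that idea in Python, assuming two made-up surface syntaxes that parse to the same AST node. Everything here is hypothetical: the `Default` node, the "keyword" and "symbol" syntax names, and the `??` spelling are invented for illustration and are not DataWeave features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Default:
    """Toy AST node: use `fallback` when the variable `name` is null."""
    name: str
    fallback: str


def _split(source, separator):
    # Shared helper: cut the source text in two around the separator.
    left, found, right = source.partition(separator)
    if not found:
        raise SyntaxError(f"expected {separator.strip()!r} in {source!r}")
    return Default(left.strip(), right.strip())


def parse_keyword_syntax(source):
    # Hypothetical older syntax: `name default fallback`.
    return _split(source, " default ")


def parse_symbol_syntax(source):
    # Hypothetical newer syntax: `name ?? fallback`.
    return _split(source, " ?? ")


# The syntax version only selects the parser.
PARSERS = {"keyword": parse_keyword_syntax, "symbol": parse_symbol_syntax}


def parse(source, syntax):
    return PARSERS[syntax](source)


def evaluate(node, variables):
    # The runtime sees only the AST, never the original text.
    value = variables.get(node.name)
    return node.fallback if value is None else value


old_ast = parse("name default DataWeave", "keyword")
new_ast = parse("name ?? DataWeave", "symbol")
print(old_ast == new_ast)  # True
print(evaluate(new_ast, {"name": None}))  # DataWeave
```

Because both parsers produce the same node type, code written in either syntax can be evaluated, and mixed, by one runtime.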

Runtime behavior (run as)

A third property is runtime compatibility. One of our number one principles is not to break any customer app. This is very hard, because in theory any bug fix is potentially a breaking change. What we are working on is a way to say "this app was developed for 2.3", so that it behaves like 2.3 regardless of whether it is executed with 2.4 or 2.5. This will assure our customers that they can upgrade to the latest version without fearing that their apps will break.
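One way to picture "run as" is a runtime that keeps every historical behavior and selects one by the version the app was developed for. The Python sketch below is an assumption about how that could look, not a description of the Mule runtime: the rounding functions, the 2.4 fix and the `resolve` helper are all invented for the example.

```python
import math


def round_truncating(x):
    # Hypothetical original behavior: always rounds down.
    return math.floor(x)


def round_half_up(x):
    # Hypothetical bug fix shipped in a later version.
    return math.floor(x + 0.5)


# (version in which the behavior was introduced, implementation), oldest first.
ROUND_BEHAVIORS = [((2, 0), round_truncating), ((2, 4), round_half_up)]


def resolve(behaviors, run_as):
    """Pick the newest implementation that is not newer than `run_as`."""
    chosen = None
    for since, implementation in behaviors:
        if since <= run_as:
            chosen = implementation
    if chosen is None:
        raise ValueError(f"no behavior available for version {run_as}")
    return chosen


# An app developed for 2.3 keeps the old result on any newer runtime.
print(resolve(ROUND_BEHAVIORS, (2, 3))(2.5))  # 2
# An app migrated to 2.5 gets the fixed behavior.
print(resolve(ROUND_BEHAVIORS, (2, 5))(2.5))  # 3
```

The selection depends only on the declared "run as" version, never on the version of the runtime doing the executing.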

All these properties are separate, but they are related. For example, your syntax version cannot be higher than your language level or your runtime level.
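That relationship can be written down as a validation step. A minimal Python sketch, assuming the three properties are stored as (major, minor) tuples; the `ProjectSettings` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectSettings:
    """Hypothetical project-level settings, each a (major, minor) tuple."""
    language_level: tuple
    syntax_version: tuple
    run_as: tuple


def validate(settings):
    """Return a list of violated constraints (empty when the settings are valid)."""
    errors = []
    if settings.syntax_version > settings.language_level:
        errors.append("syntax version is higher than the language level")
    if settings.syntax_version > settings.run_as:
        errors.append("syntax version is higher than the runtime level")
    return errors


print(validate(ProjectSettings((2, 4), (2, 0), (2, 3))))  # []
print(validate(ProjectSettings((2, 3), (2, 4), (2, 4))))
# ['syntax version is higher than the language level']
```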

TLDR

The objectives of the initiative we are working on are:

  1. Be able to guarantee compatibility as much as possible
  2. Be able to evolve the language without constraints

So the principle is: if you build something, it should run as is on all newer versions. But if you want to use new features, you need to migrate. These migrations should cost very little or nothing, and tooling should help our devs perform them. In some cases, though, some human interaction may be needed.