Enable invoking JS from the server in a way that is compatible with CSP and avoids XSS

This ticket suggests a new low-level mechanism in Flow that would allow defining all JS expression to invoke as static strings with no possibility of causing XSS issues by accidentally including user-provided values in the actual JS expressions. By extension, this means that all used JS can be known when the production build is created which in turn makes it possible to compute hashes to use with CSP.

The core idea is to avoid page.executeJs("window.alert($0)", "Hello"); which cannot be known in advance and which might accidentally be written as page.executeJs("window.alert('" + greeting "')"); which might lead to XSS. Instead, all JS snippets would be defined either in separate JS module files or as oneliners in annotations.

For oneliners, one alternative might be to define an interface.

public interface GreeterJs {
  @JsExpression("window.alert($0)")
  void showGreeting(String greeting);
}

Flow could then provide a way of creating an instance of that interface to actually invoke the JS:

page.getJsInvoker(GreeterJs.class).showGreeting("Hello");

Internally, this could work so that generated-flow-imports.js would contain something like this:

window.Flow.jsInvokers["com.acme.GreeterJs"] = {
 "showGreeting": $0 => window.alert($0)
}

When the Java method is run, a tuple like ["com.acme.GreeterJs", "showGreeting", ["Hello"]] could be sent to the client, based on which Flow could do window.Flow.jsInvokers[tuple[0]][tuple[1]).apply(null, tuple[2]) to invoke the JS without having to use eval or any similar mechanism that is unfriendly to CSP.

A corresponding concept could be used to instead define the invokeable JS that is exported from separate JS module file. In that case, the module file could be defined like this:

export function showGreeting(message) { window.alert(message) };

while the corresponding Java interface would instead use an annotation to define the JS module it should be bound to:

@JsExpressionModule("./myModule")
public interface GreeterJs {
  void showGreeting(String greeting);
}

Invoking from Java would be done in exactly the same way.

The code generated into generated-flow-imports.js would in this case be something like this:

import * as generated123 from './myModule';
window.Flow.jsInvokers["com.acme.GreeterJs"] = generated123;

The same mechanism could also be used to replace Element.executeJs with a small enhancement. What's needed is just to pass the client-side element instance as this (i.e. as the first argument to apply) if invoked in this way:

someElement.getJsInvoker(GreeterJs.class).showGreeting("Hello");

For this to make sense, the targeted JS expression would of course also have to make use of this.

An additional benefit of this approach is that @JsExpression scripts will be included in the bundle processed by webpack, which means that newer JS features can be used without having to take browser compatibility into account.

There are also some other features in the framework that accept JS expressions from the server, and which would thus have to be handled in some other way to mitigate the risk of XSS through mistakes in application code and also to enable using a strict CSP setting.

@EventData and @ModelItem values used in association with PolymerTemplate event handler methods can define expressions that define how to extract values that are to be sent to the server. These are always annotations which means that we can detect them up-front. We could build up a map from expression strings found during the build phase to the corresponding JS functions that would be be included in the bundle. Based on this map, the client can find the right existing JS function to use in place of an expression string received from the server.

Page::addDynamicImport takes a JS expression that is assumed to trigger a dynamic import() and return a corresponding Promise. This is identical to how executeJs works except that it's invoked in a different phase of the response processing cycle. This can be handled using exactly the same mechanism but a different trigger method, e.g. page.getDynamicImporter(GreeterJs.class).showGreeting("Hello") assuming the client-side implementation of showGreeting returns a Promise.

The most challenging case is the event data expressions and filters provided through DomListenerRegistration. These can be defined through the @DomEvent and @EventData annotations but they can also be defined programmatically. There's also some additional complexity with using a dynamic value such as a key code in a filter. A typical example might be something like this to replicate the existing functionality for shortcut keys. It listens for a specific key to be pressed, and in that case also sends back whether some specific modifier has been pressed.

UI.getCurrent().getElement().addEventListener("keydown", event -> {
  System.out.println("'" + filterKey + "' was pressed. Was Ctrl pressed? " + event.getEventData().getBoolean("event.ctrlKey"))
}).setFilter("event.key=='" + filterKey + "'").addEventData("event.ctrlKey");

Doing something sensible for this case requires more than just reapplying the old patterns based on a new format. For many simple cases, all that is needed is a way of extracting some specific properties from the event object (e.g. event.ctrlKey). This could be supported simply by giving a list of property names to extract:

someElement.addEventListener("click", event -> System.out.println(event.getEventData().getBoolean("event.ctrlKey"))
    .synchronizeEventProperties("ctrlKey", "shiftKey");

Evaluating these in the browser don't need to evaluate code since it just requires accessing some properties by name.

A more complex mechanism is needed to access sub properties, extract data from the element rather than the event, run arbitrary logic (e.g. event.preventDefault()), or to define a boolean expression for filtering the event (i.e. determining whether the event should be delivered to the server). Finally, there's a need to capture parameters from the server to be used by the client-side logic.

Rather than treating each "complex" event property as a separate entity, it might make more sense to have one big JS expression that collects all relevant properties (in addition to those defined as property names for the event object) and works as a filter as well. This would thus be a JS function that can return a truthy value to pass the filter and furthermore if the returned value is an object, then that object is used as the foundation for the event data properties to use. A falsy return value signals that the filter doesn't pass which also means that no further event data properties are needed.

The above example could thus be satisfied by this JS snippet as long as event evaluates to the event and $0 is the filter key value passed from the server:

if (event.key == $0) {
  return {
    ctrlKey: event.ctrlKey
  };
} else {
  return false;
}

or more compactly return event.key == $0 && { ctrlKey: event.ctrlKey }

This JS snippet would have to be defined in a way that is findable during the build and that can be referenced alongside any parameter value from the server through an instance of DomListenerRegistration. For an inline snippet, this could be in the form of an empty Java type with the JS snippet in an annotation. This structure is slightly silly since the class would only be used as a way of passing a runtime reference to the annotation.

@EventExpression("return event.key == $0 && { ctrlKey: event.ctrlKey }")
public interface MyEventExpression {}

Usage would then be along these lines:

UI.getCurrent().getElement().addEventListener("keydown", event -> {
  System.out.println("'" + filterKey + "' was pressed. Was Ctrl pressed? " + event.getEventData().getBoolean("ctrlKey"))
}).setEventExpression(MyEventExpression.class, filterKey);

If using a standalone JS module file instead of an inline expression, then the definition could be like this for using the default export from the module.

@EventExpressionModule("./myEventExpressionModule")
public interface MyEventExpression {}

To use a named export from the module, then the name of the export can be defined as an optional parameter in the annotation.

@Legioth are there any ideas to implement this until the next LTS or is this just your brainstorming / idea storage? 😉

There are no specific plans at this moment - I'm just poking at things that might be interesting and seeing if there's any wider interest among users. I plan to publish something related on vaadin.com/labs later this week to get something more tangible to refer to.

If there is enough interest within the next weeks or months, then I believe a basic implementation could be thrown together quickly enough to get into the next LTS. The challenges here are more on the conceptual side whereas an actual implementation should be quite straightforward if I'm not mistaken.

As an alternative for someone really eager to use strict CSP settings even if it adds some boilerplate to their application, there's also the fallback mechanism in https://github.com/vaadin/flow/issues/10810 that could be implemented in a couple of hours. That approach could be added even to 14.x.

As an alternative to the special @EventExpression and @EventExpressionModule logic to define event expression handlers, it might also be possible to use a "regular" @JsExpression or @JsExpressionModule method provided the client-side implementation of that method returns a callback.

The example with key filtering could thus look like this:

public interface MyElementExpressions {
  @JsExpression("return (event) => event.key == $0 && { ctrlKey: event.ctrlKey }")
  void eventFilter(String key);
}

For using this kind of expression, I see two different design alternatives with different tradeoffs.

One approach would be that DomListenerRegistration has a method for creating a proxy instance specifically for that purpose, and then this proxy can be used to trigger the actual invocation on the interface. The problem here is that this return value destroys chaining on DomListenerRegistration:

DomListenerRegistration registration = UI.getCurrent().getElement().addEventListener("keydown", event -> {
  System.out.println("'" + filterKey + "' was pressed. Was Ctrl pressed? " + event.getEventData().getBoolean("ctrlKey"))
});
registration.getEventExpressionFactoryInvoker(MyElementExpressions.class).eventFilter(filterKey);

Chainability could be preserved with a really awkward approach with a Java Consumer, but I'm not sure that makes sense

UI.getCurrent().getElement().addEventListener("keydown", event -> {
  System.out.println("'" + filterKey + "' was pressed. Was Ctrl pressed? " + event.getEventData().getBoolean("ctrlKey"))
}).withEventExpressionFactoryInvokerFactory(MyElementExpressions.class, proxyInstance -> proxyInstance.eventFilter(filterKey));

Another approach would be to accept a PendingJavaScriptInvocation in which case eventFilter would have to be changed to return that type. Usage could then be like this:

Element element = UI.getCurrent().getElement();
element.addEventListener("keydown", event -> {
  System.out.println("'" + filterKey + "' was pressed. Was Ctrl pressed? " + event.getEventData().getBoolean("ctrlKey"))
}).setEventExpression(element.getJsInvoker(MyElementExpressions.class).eventFilter(filterKey));

Either of these compromises might still be worthwhile to avoid introducing completely new annotation scanning cases and mechanisms only for this quite rare case.

@Legioth I've read your labs entry and I was wondering - if someone isn't actively adding custom javascript calls.. is it already possible to use Vaadin 14 with CSP with the default flow-components or are there some components that do "magic" under the hood?

All my production apps with CSP requirements are still V8, so I had no change to tamper with V14 in this regard.

Vaadin components are typically only using hardcoded JavaScript snippets that can be known in advance. One would just have to verify whether any expressions have changed whenever updating to a new Vaadin version or starting to use a component that hasn't been used previously. I specifically reviewed Grid and ComboBox to have two quite complex cases and I didn't spot anything there that would dynamically generate JS expressions to run.

The one exception that I'm aware of is ShortcutRegistration which is dynamically building event filter expressions based on the key and modifiers to listen for. Even a case like this could still be supported. If the application only uses a handful of different shortcut keys, then it would might make sense to just hardcode for the particular expressions that are generated for those keys. Supporting arbitrary shortcut keys could also be possible by building a simple parser that recognizes the structure used by shortcut expressions to extract the variable parts and then returning a regular function that uses those values from its own closure.

@Legioth looks like mentioned shortcuts were hit by a customer today. https://discord.com/channels/732335336448852018/1120833928996081714

Interesting case since it's based on the request rather than the response. Those "JS snippets" are just map keys that define what JS he client was running to produce each value.

A simple implementation of the concepts imagined here might still pass event parameters in the same way as now so it might be necessary to come up with some alternative if we also want to help avoid triggering this type of detection.

vaadin / flow

Enable invoking JS from the server in a way that is compatible with CSP and avoids XSS #10759