projectfluent / fluent.js

JavaScript implementation of Project Fluent
https://projectfluent.org/
Apache License 2.0
937 stars 77 forks source link

Statically analyse a bundle #460

Open kevinresol opened 4 years ago

kevinresol commented 4 years ago

After adding resources to a bundle, how to statically analyse the resources to ensure validity?

For example:

bundle.validateMessage(id);

This will traverse the message AST recursively and make sure it is valid. For example: referenced terms exist and are valid, function call arity is correct, etc.

macabeus commented 4 years ago

@kevinresol I think that adding types would resolve this problem too: https://github.com/projectfluent/fluent.js/issues/466

kevinresol commented 4 years ago

@macabeus Maybe I misread something but I don't think they are the same. What I wish is to have a function at runtime that validates the loaded resources in a bundle. For example the following resource (or more specifically, the message "foo") is invalid regardless of the host language (e.g. JS/TS/Java/C++,etc), because the referenced term does not exist.

-term-with-typo = My Value
foo = My name is {-term-without-typo}.
stasm commented 4 years ago

@kevinresol What's the use-case for running this analysis at runtime? At Mozilla, we use compare-locales to run static analysis on localization resources, but it runs in CI, in push hooks, and at build time. Would that satisfy your use-case?

macabeus commented 4 years ago

@kevinresol Okay, I understand now better what do you want. For example:

require = require('esm')(module);

const { FluentBundle, FluentResource } = require('./fluent-bundle/esm/index.js')

const resource = new FluentResource(`
-term-with-typo = My Value
foo = My name is {-term-without-typo}.
`);

const bundle = new FluentBundle("en-US");
const errors = bundle.addResource(resource); // currently it'll return an empty array,
                                           // but should return an error, because
                                           // we are using the term -term-without-typo
                                           // that doesn't exist

Is that?

@stasm I think that would be better add it built-in in fluent.js as welll as on fluent's spec, "should show warnings if using a term that doesn't exist"

Pike commented 4 years ago

We have a thread open on discourse on what to do on broken references, see https://discourse.mozilla.org/t/on-message-and-termreferences/53217.

On top of that, @stasm, do you remember how much of the optimism is left in the runtime parser? Aka, how often it will generate valid bundles for invalid Fluent?

kevinresol commented 4 years ago

@macabeus since FluentBundle supports adding resources multiple times, so it may happen that the terms referenced in the earlier resources will be added later. So you can't just emit the error/warning while adding. It has to be a separate step.

kevinresol commented 4 years ago

@stasm My use case is to perform upfront validation on the FluentBundle before passing it downstream to the actual localization call site (the place where formatPattern is called). My goal is to make sure no failure could occur at the call site.

stasm commented 4 years ago

@kevinresol How would you handle validation errors? They would still be runtime errors, so I'm not sure I see the benefits over reporting failures directly from formatPattern.

kevinresol commented 4 years ago

I want to ensure the loaded localization file is valid before the actual app even starts. This brings the benefits of:

  1. at the very beginning of the app we know there are something wrong, this helps translators/developers to figure out a problem asap
  2. error handling doesn't spread all over the place, at every translation call-site, assuming that formatPattern can't fail
  3. 466 solves a related yet different problem, it ensures the correct data types are passed to formatPattern, which eliminates one source of possible error. Yet it doesn't address problems from the loaded FTL. Also it is specific to TS, but every host language should has its own solution, and I have one for the language I use.

After all I think this is a general robustness concern. IMO, whenever something is dynamically loaded at runtime, it should be validated upfront, before passing to downstream consumers.