unicode-org / message-format-wg

Developing a standard for localizable message strings
Other
209 stars 32 forks source link

Add new test schema #767

Closed mradbourne closed 2 months ago

mradbourne commented 2 months ago

This PR adds a schema for the JSON test files and refactors the existing test files to match the schema.

The schema is designed for easy interoperability with the unicode/conformance test suite. This is not a mandatory requirement but a nice-to-have.

Most of the content removed from the README is now included within schema.json.

The format of params has changed. They now use a { type, name, value } structure instead of { name: value }. Although instances of { "type": "string", "name": "foo", "value": "string-value" } may seem unnecessary, this pattern allows us to specify params that are not part of the JSON spec (e.g. { "type": "datetime", "name": "foo", "value": "2006-01-02T15:04:06" }.

The TypeScript .d.ts files have been removed for now because they no longer match the JSON content. I'd be happy to re-add them if/when needed.

There is a Visual Studio Code settings file included. Although contributors use many editors and IDEs, I thought it would be useful to provide at least one editor/schema integration out-of-the-box.

mihnita commented 2 months ago

Some comments about the overall structure (we can fine tune later):

1. I found the ability to group tests together in the same file pretty handy before (the way data-model-errors.json and test-functions.json used to be

So instead of one hard-coded "tests" array, can we make that a map, where the key can be anything, not just "test"?

And with "defaultTestParams" at group level.


2. In my implementation of ICU4J MF2, when trying to use the existing .json files, I found that some source strings are extremely long, unreadable.

It would be handy (and something I did in ICU) to allow an array of source(s) as an alternative to "src"

Compare:

"src": ".input {$exp :datetime timeStyle=short}\n.input {$user :string}\n.local $longExp = {$exp :datetime dateStyle=long}\n.local $zooExp = {$exp :datetime dateStyle=short timeStyle=$tsOver}\n{{Hello John, you want '{$exp}', '{$longExp}', or '{$zooExp}' or even '{$exp :datetime dateStyle=full}'?}}",

to:

"srcs": [
    ".input {$exp :datetime timeStyle=short}\n",
    ".input {$user :string}\n",
    ".local $longExp = {$exp :datetime dateStyle=long}\n",
    ".local $zooExp = {$exp :datetime dateStyle=short timeStyle=$tsOver}\n",
    "{{Hello John, you want '{$exp}', '{$longExp}', or '{$zooExp}' or even '{$exp :datetime dateStyle=full}'?}}"
],

3. When testing the results would be good to be able to specify expected results in a more flexible way

That is to accommodate for different results between implementations, while still testing that the "plumbing" works and that we really format to a date / time / whatever.

Things like "one of": "expOneOf": [ "9:30 PM", "9:30\u202FPM" ] Or reg-exp: :"expRegExp": [ "9:30.[Pp].[Mm]"]


4. Allowing for an optional "comment" or "description" in each test

mradbourne commented 2 months ago

@mihnita Thanks for the comments! I'm replying here to capture our conversation from the WG meeting:

1. I found the ability to group tests together in the same file pretty handy before (the way data-model-errors.json and test-functions.json used to be

As this is a non-breaking change, we'll keep it in mind as we write tests and propose it in a separate PR if needed.

My only concern was that having multiple defaultTestParams properties might make the test inputs more difficult to establish at a glance. The trade-off is less flexibility in the test file though. Lets revisit with some example tests that need that flexibility. I've added a description property to each test that might help with grouping (albeit not nesting) related tests.

2. In my implementation of ICU4J MF2, when trying to use the existing .json files, I found that some source strings are extremely long, unreadable.

It would be handy (and something I did in ICU) to allow an array of source(s) as an alternative to "src"

Done 👍

3. When testing the results would be good to be able to specify expected results in a more flexible way

As with point 1, we can discuss this non-breaking change up later. The first set of spec tests should be nice and deterministic. The oneOf would be for tests where results vary by implementation.

4. Allowing for an optional "comment" or "description" in each test

Done 👍 . Thanks!

echeran commented 2 months ago

The PR looks good. I want to add automated guarantees that make use of this PR's schema to ensure that our test files are well-formed JSON and adhere to the schema. So I created https://github.com/mradbourne/message-format-wg/pull/1

That PR is in @mradbourne 's personal fork with a destination of this PR. PTAL before merging.

eemeli commented 2 months ago

I think I'd prefer a schema directory like test/schemas/v0/ (with integer versions) or test/schemas/v0.0.1/ instead of test/schemas/v0-0-1/, but that's a pretty minor quibble and can be addressed later.

mradbourne commented 2 months ago

@echeran Thanks for raising https://github.com/mradbourne/message-format-wg/pull/1 I'd like to do this as a separate PR to unicode-org/message-format-wg as it raises some new questions/decisions.