l20n / spec

Specification and design documents for L20n
http://l20n.github.com/spec/
10 stars 6 forks source link

Require at least one character in unquoted text #2

Closed spagy closed 7 years ago

spagy commented 8 years ago

Previously, an empty string was valid for both unquoted-text and quoted-text. While this was fine for quoted-text (it's surrounded by quotes), this didn't work work for unquoted-text as this would make an empty string a valid unquoted-pattern (and so also a valid pattern and for a valid value). Finally this makes

identifier =

a valid message. This disagrees with the current FTL tinker implementation.

Quoted-text has been updated to retain the symmetry with unquoted-text. Quoted-pattern has been updated so that

identifier = ""

is still a valid message.

stasm commented 8 years ago

Is there a way to allow for the following to be valid syntax?

identifier = 
  [keyword] pattern

It should parse into a message with a null value and a one-item trait list.

stasm commented 8 years ago

Maybe this?

message              ::= identifier __ '=' __ (value | trait-list);
value                ::= pattern (NL trait-list)?;
spagy commented 8 years ago

Is there a way to allow for the following to be valid syntax?

identifier = 
  [keyword] pattern

It looks like it's already valid to me. Admittedly I'm parsing in my head and may be going wrong.

It should parse into a message with a null value and a one-item trait list.

To me, this looks more like it should be a variant list than a message with a trait list. The lack of anything on the first line after the identifier and the equals sign seem characteristic of a list of variants.

 

Okay, I've reread the definition for a message and it doesn't seem to work completely.

As far as I can tell three main types of message should be accepted.

The following is the most basic and fits the current definition of a message. The value can be quoted or unquoted, it can include placeables and can be block text.

hello-world = Hello world

This defines variants so that different variants can be used based on the surrounding context of the sentence.

brandName =
 *[nominative] Aurora
  [genitive] Aurore
  [dative] Aurori
  [accusative] Auroro
  [locative] Aurori
  [instrumental] Auroro

This defines a trait so that different sentences can be used based on the traits of a value.

brandName = Firefox
  [gender] masculine

At the moment this grammar file seems to accept the first one and the second but not the third. With your suggested change, value ::= pattern (NL trait-list)?;, the first would be accepted, as would the third but the second one would not be accepted unless a pattern is allowed to be an empty space without quotes.

So if we implement your suggested change and not the change in the pull request then (I think) all three types would be accepted. Fantastic!

Only thing is, that definition would be imply that the second example above is an empty value with a list of traits. But it's not :/ It's a list of variants. Using trait-list to define a list of variants doesn't help do much but add confusion (at least for me).

So, extending your suggestion, how would this be

message              ::= identifier __ '=' __ (value | variant-list);
value                ::= pattern (NL trait-list)?;

Of course this would require variant-list to be defined and maybe some other definitions to me amended a bit but I think this would clear up some confusion (again, at least for me :)).

PS Though this does seem like a good thing to fix, I think it might be an issue for another pull request. This pull request was just trying to make

identifier =

invalid when there is no list of variants to follow.

stasm commented 8 years ago

I actually think all these concepts are closely related so if you don't mind let's continue discussing here.

In your second example, nominative etc. are actually traits. Variants are what we call members inside of a select expression. This might be a naming problem: admittedly the grammar document shouldn't concern itself with the meaning of symbols as given by the interpreter. Instead of having traits and variants would it make sense to only have a member-list of members?

spagy commented 8 years ago

In your second example, nominative etc. are actually traits.

Yep, I've reread some of the L20n docs, you're completely right. I was slightly confused by some of the stuff at http://l20n.org/learn.

For example, the page titled variants (where I got the nominative example), doesn't have a reference to select statements. Anyway I think that might be something to discuss in a different issue, as long as we agree on terminology, which I think we do now :)

Possibly renaming some terms in the grammar file would help to make things clearer but I think making sure the grammar is correct is more important.

So anyway, the question I've got now is why shouldn't identifier = be valid and have the same meaning as identifier = ""?

spagy commented 8 years ago

I know this is the wrong place to post, but would it be possible to open the issue tracker? There are a couple of other things that I think could be improved. I'd be happy to do them myself but it'd be nice to know that they were helpful and the right thing before starting work.

stasm commented 8 years ago

So anyway, the question I've got now is why shouldn't identifier = be valid and have the same meaning as identifier = ""?

It's the difference between a null value and an empty-string value. The distinction is important when we consider a tree structure like HTML:

<div data-l10n-id="foo">
    <div>…</div>
</div>

An entity with a null value and a trait will add the trait as an attribute:

foo =
    [html/title] Foo
<div data-l10n-id="foo" title="Foo">
    <div>…</div>
</div>

An entity with an empty-string value and a trait will add the trait as an attribute and clear all child elements of the outer div:

foo = ""
    [html/title] Foo
<div data-l10n-id="foo" title="Foo">
</div>

I know this is the wrong place to post, but would it be possible to open the issue tracker? There are a couple of other things that I think could be improved. I'd be happy to do them myself but it'd be nice to know that they were helpful and the right thing before starting work.

We usually use Bugzilla to track L20n work but I think the grammar is well-scoped and fairly independent, so let's use GitHub's Issues. I enabled them for this repo. Thanks!

spagy commented 8 years ago

Thanks for taking the time to explain. And thanks for opening the issue tracker. I don't really have a preference for using GH issues, I'd just forgotten that L20n was using Bugzilla. If using Bugzilla would be better, I'd be happy to use that. Otherwise I'll open some issues here.

Going back to your previous suggestion,

message              ::= identifier __ '=' __ (value | trait-list);
value                ::= pattern (NL trait-list)?;

I think that would work well, as long as this pull request is also applied (I think that's what you were implying anyway). Otherwise identifier = without a trait-list would still be valid.

It should parse into a message with a null value and a one-item trait list.

I'm assuming that if value gets no match then it'll be null. If so this looks good to me. I'll add your suggestion.

stasm commented 7 years ago

(Just an update: I'm traveling this week. I'll finish the review in a few days. Thanks for creating the other issues!)

stasm commented 7 years ago

I asked @zbraniecki for his input on this and we agreed that it was a good change. Thanks!

spagy commented 7 years ago

Thanks for the update and review