lann / wasm-wave

Web Assembly Value Encoding
Apache License 2.0

Another Multi-line/raw string syntax #26

Closed sunfishcode closed 9 months ago

sunfishcode commented 10 months ago

I think #21 is workable, and have posted some suggestions for it. I also wanted to file this issue to brainstorm in the direction of BigWave's multi-line/raw string syntax.

> here is a raw string
field:
   > here is a
   > multi-line string that is a value
   > of a record field

For a more complete comparison, here's the syntax from #21:

  {
    build: %"
      node -c "console.log('hello, world!');"
      echo "foo" > some-file.txt
    %"
  }

and the same code with BigWave strings:

  {
    build:
      > node -c "console.log('hello, world!');"
      > echo "foo" > some-file.txt
  }
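To make the intended reading of the > form concrete, here is a minimal Python sketch of how a reader might recover the string value from the lines above. The function name `read_bigwave_string` is hypothetical, and the details (one space dropped after the marker, lines joined with a newline, the string ending at the first line that does not start with >) are assumptions for illustration, not rules stated in this thread.

```python
def read_bigwave_string(lines):
    """Join the payload of consecutive '>'-prefixed lines into one string.

    Assumptions (not taken from the thread): the marker is the first
    non-blank character on the line, a single space after the marker is
    dropped when present, and payloads are joined with a newline.
    """
    parts = []
    for line in lines:
        stripped = line.lstrip()
        if not stripped.startswith(">"):
            break  # the first line without a leading '>' ends the string
        payload = stripped[1:]
        if payload.startswith(" "):
            payload = payload[1:]
        parts.append(payload)
    return "\n".join(parts)


example = [
    "      > node -c \"console.log('hello, world!');\"",
    '      > echo "foo" > some-file.txt',
    "  }",
]
build = read_bigwave_string(example)
```

Only the leading > on each line is treated as a marker, so the shell redirect in the second line stays part of the value, and no quotes or backslashes need escaping.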

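Advantages of %":

Doesn't need a special end-of-file rule to know whether the file has been truncated.

Advantages of >:

Behavior is insensitive to indentation.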

I'm not strongly attached to any of the specifics here, I just wanted to brainstorm around this direction.

lann commented 10 months ago

"Doesn't need a special end-of-file rule to know whether the file has been truncated."

Could you explain this further?

lann commented 10 months ago

I had an immediate negative reaction to this syntax aesthetically but it is growing on me. I'll keep thinking about it.

lann commented 10 months ago

"Behavior is insensitive to indentation."

I'm not sure that this is an advantage for this particular syntax. If I were implementing this I would be tempted to require that all >s be on the same column...
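As a rough illustration of the stricter rule floated here, the following Python sketch checks whether every leading > in a run of lines sits in the same column. The helper names `marker_columns` and `markers_aligned` are hypothetical, and counting each whitespace character as one column is an assumption; nothing in this thread specifies how tabs would be treated.

```python
def marker_columns(lines):
    """Return the column index of the leading '>' on each '>' line."""
    return [
        len(line) - len(line.lstrip())
        for line in lines
        if line.lstrip().startswith(">")
    ]


def markers_aligned(lines):
    """True when every leading '>' sits in the same column."""
    return len(set(marker_columns(lines))) <= 1


aligned = ["   > here is a", "   > multi-line string"]
ragged = ["   > here is a", "     > multi-line string"]
```

Under this check `aligned` passes and `ragged` fails. A parser that does not care about indentation would accept both.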

sunfishcode commented 10 months ago

"Doesn't need a special end-of-file rule to know whether the file has been truncated."

"Could you explain this further?"

Files can get truncated for lots of reasons: out of disk space while writing, download interrupted, another process doing a non-atomic update, etc. In YAML, a file can be truncated and silently have valid syntax, which is a problem I'd ideally like Wave to avoid. And Wave already does avoid this, due to start and end delimiters.

But if we added a multi-line > string and it could appear at the very end of a file, there could be situations where you don't know if you've seen the whole string.

My idea for fixing this problem was to add a rule that multi-line strings at the end of a file have to end with two newlines. It wouldn't come up all that often, and tools could give good errors when it does. But it is a weird little special case.
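One way to picture that rule is the Python sketch below, which flags a document whose last non-blank line is a > string line but which does not end with two newlines. The function name `looks_truncated` is hypothetical, and treating "ends with two newlines" as a literal suffix check is an assumption about how the rule would be specified.

```python
def looks_truncated(text):
    """Apply the proposed end-of-file rule to a whole document.

    Returns True when the last non-blank line is a '>' string line and the
    text does not end with two newlines. Documents that end in a closing
    delimiter are not examined here, because the delimiter already shows
    that the value is complete.
    """
    non_blank = [line for line in text.split("\n") if line.strip()]
    if not non_blank:
        return False
    ends_in_string = non_blank[-1].lstrip().startswith(">")
    return ends_in_string and not text.endswith("\n\n")
```

For example, `looks_truncated("> a\n> b\n")` is true while `looks_truncated("> a\n> b\n\n")` is false, so a tool could report the first case as a possibly incomplete file.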

lann commented 10 months ago

How would inter-element commas work with this syntax? One obvious answer:

{
  field-a:
    > field-a's value
  ,
  field-b:
    > field-b's value
  ,
}

Edit: I see from your BigWave link:

"When used as a record value, the comma that would follow a multi-line string is omitted"

sunfishcode commented 10 months ago

Yep, my idea there was to add a special case and just say record fields with multi-line strings don't need commas. Which kinda looks nice in realistic settings.

But it is also a special case, and it doesn't work for lists, so we'd still have:

[
   > red
   > orange
   ,
   > yellow
   > green
]

Compare to:

[
   %"
      red
      orange
   "%,
   %"
      yellow
      green
   "%
]

:shrug:

sunfishcode commented 9 months ago

In the design above, multiline strings are always raw. Contrast with #29, where """ is multiline and %""" is multiline raw.

It feels like many multiline string use cases don't want to worry about escaping quotes or backslashes. But, there might be a need for Unicode escapes sometimes. Maybe a non-raw multiline string could be indicated by %>?

description:
    %> Old pond\u{U+2026}
    %> A frog jumps in
    %> water sound

sunfishcode commented 9 months ago

I've now done some informal polling, and this format doesn't seem to be catching people's eyes, so I'm inclined to go with #29 and see how it goes.