pure-data / pure-data

Pure Data - a free real-time computer music system

[feature request] curly brace support #505

Open avilleret opened 5 years ago

avilleret commented 5 years ago

hi,

I saw that backslashes are now supported, which is great, thanks @millerpuckette. Would it be possible to also support opening and closing curly braces? Those are really useful in some syntax contexts, like regexps.

for example, with libossia, one can control several parameters at once with [ø.remote foo.{1,2,4,15} ], which is quite convenient.

for now we have a workaround but curly braces are so nice :-)

porres commented 4 years ago

It seems at least a prerequisite for this has already been merged, see https://github.com/pure-data/pure-data/pull/647

I was about to request the same feature and saw this. Anyway, I wonder what else is in the way of implementing this, maybe I can try and help doing it...

porres commented 1 year ago

@umlaeute what's in the way to support this these days?

umlaeute commented 1 year ago
  1. technically i guess most hurdles are taken (but then, i haven't really checked). i think the current state is:

    • [x] allow display of curlies in the GUI (core -> gui communication) - this has already been solved
    • [ ] allow input of curlies (gui -> core communication) - this is simple to solve: it just requires us to drop the filter
    • [ ] escaping of curlies when (re)storing files (they shouldn't be escaped at all); this might already be solved (i haven't checked)
  2. however practically i think we should not lightly add "support for curly braces". the fact that curlies (that is: a pair of matching characters) have been forbidden until now is a unique opportunity, as it could allow for a simple and elegant way to write structured data in Pd patches. currently the only data structure we can express within a message is a list of atoms. but with curlies we could introduce lists of lists without breaking backward compatibility:

    [list foo {1 2 3} bar(
    |
    [$1(
    |
    |list 1 2 3)

    I think this is much more interesting than allowing for regexp-groups such as used by OSC or ossia.

    note that this is not mutually exclusive; we can have both single curlies denoting some special syntax (like the lists-of-lists above) and curlies within symbols to not carry any specific meaning (so that a symbol like /{foo,bar} can be used as an OSC-path with alternatives, handled by the OSC-receiver).

porres commented 1 year ago

we can have both single curlies denoting some special syntax (like the lists-of-lists above) and curlies within symbols to not carry any specific meaning

That's what I was thinking. So, the thing is we'd need to use/implement single curlies first as this special syntax before also allowing other stuff?

umlaeute commented 1 year ago

yes, i think so. and before that, we need to agree on the actual use of single curlies (which is probably the hardest part)

Spacechild1 commented 1 year ago

Copy pasting my reply on pd-list:

the fact that curlies (that is: a pair of matching characters) have been forbidden until now is a unique opportunity.

Well, they have not been forbidden. You cannot type them (yet), but they can appear in text files or in manually assembled symbols. In fact, they are properly displayed with [print] or list/symbol atoms, which suggests that they are already properly escaped when sent to the GUI. One could argue that someone who only uses libpd might not even be aware of this issue.

Curly braces have never been part of the set of special Pd characters (semicolon, comma, dollar), so I'm sceptical that we can just go ahead and give them some special meaning. It would break lots of existing code that uses symbols with curly braces. For example, it would completely break the "purest_json" library.

However, we could use escaped curly braces. The escape sequences "\{" and "\}" are not defined, so we might use those for nested lists. I'm not saying it's pretty, though...

umlaeute commented 1 year ago

For example, it would completely break the "purest_json" library.

would it? how so?

my proposal distinguished between curlies in symbols and standalone curlies (similar to how Pd distinguishes between commas in symbols and standalone commas).

curlies in symbols would just be treated literally, so a symbol {"id":42} would still be parsable by [json-decode] (and [json-encode] would still be allowed to create such symbols). otoh [10 { 1 2 3 } 20( could contain a 3 element list (with the 2nd element being a list itself), but creating the symbols { resp } (by arcane magic like [makefilename %c]) and feeding them into [10 $1 1 2 3 $2 20( would create a flat 7 element list (just like creating two ,-symbols and feeding them into the $arg expansion would create a 7 element list rather than the 3 messages of [10 , 1 2 3 , 20().

Spacechild1 commented 1 year ago

curlies in symbols would just be treated literally, so a symbol {"id":42} would still be parsable by [json-decode] (and [json-encode] would still be allowed to create such symbols).

you're right that purest_json incidentally stores the whole JSON string in a single symbol, but there are other libraries/patches that use symbols with a single curly brace. Personally, I have abstractions that create Lua code, such as { foo = 123 \, bar = 567 }. These are valid Pd messages.

Again, curly braces have never been a reserved character in the Pd core, but we could safely use escaped curly braces.

Spacechild1 commented 1 year ago

(similar to how Pd distinguishes between commas in symbols and standalone commas).

That's not true, you cannot have an unescaped comma (or semicolon) in a symbol in a Pd message. [symbol foo,bar( gives you two messages (symbol foo and bar). [symbol foo,bar] creates a symbol foo.

Ant1r commented 1 year ago

use symbols with a single curly brace

Even if the curly brace was chosen to create nested lists, you would still be able to create such a symbol by escaping the brace, like with the comma or space. If you want to create the list: { foo = 123 \, bar = 567 } you would have to type the message: [list \{ foo = 123 \, bar = 567 \}(.

reduzent commented 1 year ago

One could argue that the escaping is only necessary for a symbol atom containing nothing else but a curly brace character. Symbols containing curly braces and other characters wouldn't interfere with IOhannes' proposal.

reduzent commented 1 year ago

(similar to how Pd distinguishes between commas in symbols and standalone commas).

That's not true, you cannot have an unescaped comma (or semicolon) in a symbol in a Pd message. [symbol foo,bar( gives you two messages (symbol foo and bar). [symbol foo,bar] creates a symbol foo.

You can type foo,bar into a symbol box and you get a foo,bar symbol. However, it is printed in its FUDI encoded form symbol foo\,bar. Nevertheless, the presentation / decoded form is 'foo,bar'. When writing message boxes or object boxes, you obviously need to write symbols in their FUDI encoded form.
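
The difference between the decoded form (foo,bar) and the FUDI encoded form (foo\,bar) can be sketched in a few lines of Python. This is only an illustration, not Pd's actual implementation: the function names and the assumed set of escaped characters (space, comma, semicolon, backslash) are hypothetical, and dollar signs are ignored entirely.

```python
# Assumption: a minimal set of characters that need a backslash in the
# encoded form. Pd's real rules (e.g. for '$') are more involved.
SPECIAL = set(" ,;\\")

def fudi_escape(symbol):
    """Encode a symbol: prefix each special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in symbol)

def fudi_unescape(text):
    """Decode: drop each backslash and keep the character that follows it."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")   # a trailing backslash decodes to nothing
        out.append(ch)
    return "".join(out)

print(fudi_escape("foo,bar"))      # foo\,bar
print(fudi_unescape("foo\\,bar"))  # foo,bar
```

Under these assumptions, encoding followed by decoding returns the original symbol, which is why a symbol box can hold foo,bar while a message box has to spell it foo\,bar.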

reduzent commented 1 year ago

Again, curly braces have never been a reserved character in the Pd core, but we could safely use escaped curly braces.

I find that a really awkward proposal, and it goes contrary to well established principles. Usually, escaping is used to escape the special meaning of a character, not vice versa. As long as FUDI-specific special characters are escaped during FUDI-encoding, there should be no issue, even with existing externals.

Spacechild1 commented 1 year ago

If you want to create the list: { foo = 123 \, bar = 567 }

The point is, this list is already a valid Pd message! { and } do not have any special meaning in Pd. (They only need to be escaped when sent back to the Tcl/Tk GUI). Changing the meaning of these characters would break existing patches.

I find that a really awkward proposal, and it goes contrary to well established principles. Usually, escaping is used to escape the special meaning of a character, not vice versa.

I agree that it's not pretty. I conceded that in my first reply. However, it is the only way to do this without breaking existing patches.

Also, @umlaeute's proposal would go against well established principles as well. No FUDI-specific character in a Pd message cares whether it is surrounded by whitespace. foo bar,baz 0 is the same as foo bar , baz 0. So I do not see why foo { bar } should behave differently from foo{bar}.

As long as FUDI-specific special characters are escaped during FUDI-encoding, there should be no issue, even with existing externals.

But { and } are not FUDI-specific characters!
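
To make the whitespace argument concrete, here is a rough Python sketch of a tokenizer in which an unescaped comma or semicolon ends a message whether or not it is surrounded by spaces. It is a simplification under stated assumptions: split_messages is a hypothetical name, not Pd's binbuf_text(), and commas and semicolons are treated alike as plain message separators.

```python
def split_messages(text):
    """Split text into messages (lists of atoms) at unescaped ',' or ';'."""
    messages = []   # finished messages
    atoms = []      # atoms of the message being built
    current = ""    # characters of the atom being built
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current += text[i + 1]   # escaped character is kept literally
            i += 2
            continue
        if ch in " ,;":
            if current:              # any delimiter ends the current atom
                atoms.append(current)
                current = ""
            if ch in ",;" and atoms: # ',' and ';' also end the message
                messages.append(atoms)
                atoms = []
        else:
            current += ch            # includes '{' and '}': ordinary characters
        i += 1
    if current:
        atoms.append(current)
    if atoms:
        messages.append(atoms)
    return messages

print(split_messages("foo bar,baz 0"))    # [['foo', 'bar'], ['baz', '0']]
print(split_messages("foo bar , baz 0"))  # [['foo', 'bar'], ['baz', '0']]
print(split_messages("foo{bar}"))         # [['foo{bar}']]
```

In this sketch the comma behaves identically with and without surrounding whitespace, while curly braces are just ordinary characters inside atoms.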

reduzent commented 1 year ago

However, it is the only way to do this without breaking existing patches.

What patches? Either they've been modified with external (to Pd) editors or they don't use any curly braces. Patches that receive curly braces from external sources ([pdlua], network, serial, file, etc.) shouldn't break, assuming the curly braces would be escaped then.

As long as FUDI-specific special characters are escaped during FUDI-encoding, there should be no issue, even with existing externals.

But { and } are not FUDI-specific characters!

Fair point. Considering the Pd "language" they've been non-existent and thus I think it's worthwhile thinking about introducing them now.

Maybe I missed it before, but can you give an exact example of what would break?

Spacechild1 commented 1 year ago

What patches? Either they've been modified with external (to Pd) editors or they don't use any curly braces.

Pd messages do not have to be typed manually, they can be generated with list operations, read from a file, received from a network, etc.

Here's a small example that would break with @umlaeute's proposal: json-test.zip

It is really simple. Something like { foo = 123 \, bar = 567 } is already a valid Pd message. Changing the meaning of { or } changes the meaning of the message.

Considering the Pd "language" they've been non-existent and thus I think it's worthwhile thinking about introducing them now.

That's not true, either. Again, curly braces have never been a special character in Pd messages. The only reason why you can not type curly braces is that Pd did not have a proper escape mechanism for sending Pd messages between the core and the GUI. If such a mechanism had been in place from the beginning, there would have never been a limitation on the use of curly braces. So it is not a Pd language limitation at all, only a technical hindrance in the GUI implementation.

umlaeute commented 1 year ago

Here's a small example that would break with @umlaeute's proposal:

does it necessarily?

the curlies in the patch are always symbols, never special characters; so they should be fine.

reduzent commented 1 year ago

Pd messages do not have to be typed manually, they can be generated with list operations, read from a file, received from a network, etc.

Exactly my point. Curly braces from external sources would be treated as literal curly braces by escaping them (as is the case for commas, semicolons, whitespace). [125( -> [list tosymbol] would create a symbol \}, accordingly.

Spacechild1 commented 1 year ago

the curlies in the patch are always symbols, never special characters; so they should be fine.

Ah, right. Now imagine I replace [l2s] -> [list fromsymbol] with [fudiformat]; or I send the message over the network and the other end does [fudiparse]; or I write the Pd list to a file and someone else reads it back in.

If I understand correctly, what you imagine is that binbuf_gettext() would start to escape curly braces in symbols and binbuf_text() would interpret "isolated" curly braces as something like A_CURLY. This only works as long as all parties involved (including existing patches and data) agree on the new protocol.

reduzent commented 1 year ago

This only works as long as all parties involved agree on the new protocol.

Which parties? I'm not questioning, rather curious.

Spacechild1 commented 1 year ago

Let's take my example patch and assume that I write the resulting Pd list { "foo" = 123 \, "bar" = 5 } to a file. Now, 1 year in the future the meaning of (isolated) curly braces has changed. If I try to load this file with [text], I do not get the same result. The curly braces in the message are now parsed as special atoms (e.g. A_CURLY) and not as symbols.

Or see the [fudiformat] or [fudiparse] example above.

Generally, you cannot take an existing protocol, change the meaning of certain characters (outside escape sequences) and expect no breakage. That's just not possible.
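
The breakage can be illustrated with a small Python sketch that parses the same stored text under the current rules and under the hypothetical new rules. Everything here is an assumption made for illustration: parse and classify are invented names, the stored line is shortened, escape sequences are not handled, and A_CURLY is only the placeholder atom type mentioned above.

```python
def classify(token, curly_is_special):
    """Return a (type, value) pair for one whitespace-separated token."""
    if curly_is_special and token in ("{", "}"):
        return ("A_CURLY", token)        # hypothetical new atom type
    try:
        return ("A_FLOAT", float(token))
    except ValueError:
        return ("A_SYMBOL", token)

def parse(text, curly_is_special=False):
    return [classify(tok, curly_is_special) for tok in text.split()]

stored = "{ foo = 123 }"                 # text written to a file today

old = parse(stored)                          # today: braces are symbols
new = parse(stored, curly_is_special=True)   # after the change

print(old[0], new[0])   # ('A_SYMBOL', '{') ('A_CURLY', '{')
print(old == new)       # False
```

The bytes in the file are unchanged, but the atoms they decode to are not, which is the incompatibility described above.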

reduzent commented 1 year ago

Generally, you cannot take an existing protocol, change the meaning of certain characters (outside escape sequences) and expect no breakage. That's just not possible.

Valid point. Maybe this could be addressed with a compat flag? /me ducks from IOhannes...

I find the idea of multidimensional lists very intriguing and wouldn't want to bury it too lightly.

Spacechild1 commented 1 year ago

I find the idea of multidimensional lists very intriguing

Me too!

wouldn't want to bury it too lightly.

Another possibility would be to use the dollar sign. There have been discussions about introducing $@ for "all (creation) arguments". Similarly, we could introduce ${ and $} to denote the start/end of a nested list.

Now, technically - and unfortunately - you do not have to escape a dollar sign if the following character is not a number. foo$bar is the same as foo\$bar. The former is certainly bad practice and I'm not sure if this is formally documented, or just an implementation detail. (Backslashes, on the other hand, are discarded if the following character is not part of an escape sequence.)

What we could do is explicitly disallow the use of unescaped literal dollar signs to open up the possibility for future extensions of the Pd syntax. Of course, this might be disabled with a compatibility flag :-)
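
As a rough sketch of how such tokens could be told apart, the following Python function classifies a single token according to the rule described above (a dollar sign only counts when a digit follows it). The function name and the "NESTED_OPEN"/"NESTED_CLOSE" labels are hypothetical, and corner cases such as an escaped backslash directly before a dollar sign are ignored.

```python
import re

def classify(token):
    """Classify one token under the rules sketched in this comment."""
    if token == "${":
        return "NESTED_OPEN"     # proposed syntax, not in Pd today
    if token == "$}":
        return "NESTED_CLOSE"    # proposed syntax, not in Pd today
    if re.fullmatch(r"\$\d+", token):
        return "A_DOLLAR"        # e.g. $1
    if re.search(r"(?<!\\)\$\d", token):
        return "A_DOLLSYM"       # e.g. foo-$1 (unescaped '$' plus digit)
    return "A_SYMBOL"            # '$' without a digit is just a character

for tok in ["$1", "foo-$1", "foo$bar", "foo\\$bar", "${", "$}"]:
    print(tok, classify(tok))
```

Both foo$bar and foo\$bar end up as plain symbols here, which is the ambiguity that disallowing unescaped literal dollar signs would remove.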

porres commented 1 year ago

Another possibility would be to use the dollar sign. There have been discussions about introducing $@ for "all (creation) arguments". Similarly, we could introduce ${ and $} to denote the start/end of a nested list.

Yeah, tricks for using curly braces have already been used by power users, and Pd can deal with them when receiving messages over the network. I can confirm that sending /{foo,bar} from SuperCollider to Pd via OSC works fine and routeOSC can deal with it to match foo and bar.

It's weird that "{" on its own would have a special syntax meaning but not as part of symbols. It makes sense to me to take advantage of this existing special character and use it for possible syntax extensions.

EDIT: PlugData and PurrData/Pd-L2ork already allow {} and generate patches compatible with Vanilla.

Spacechild1 commented 1 year ago

Oh, and you can already type curly braces in text windows. I totally forgot about this.

porres commented 1 year ago

Oh, and you can already type curly braces in text windows. I totally forgot about this.

Yeah, we missed this here, but I mentioned it in the recent discussion on the pd-list

umlaeute commented 1 year ago

how about [list flatten] and [list unflatten], that convert between nested lists and flat lists (with { and } being ordinary symbols)?
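
A minimal Python sketch of what such a pair of conversions might do, assuming nested lists are modelled as Python lists and that { and } in the flat form are ordinary symbols. The function names mirror the suggested objects but are otherwise hypothetical; nothing like this exists in Pd yet.

```python
def flatten(nested):
    """Nested list -> flat list, marking sublists with '{' and '}' symbols."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.append("{")
            flat.extend(flatten(item))
            flat.append("}")
        else:
            flat.append(item)
    return flat

def unflatten(flat):
    """Flat list with '{' and '}' symbols -> nested list."""
    root = []
    stack = [root]                  # innermost open list is stack[-1]
    for atom in flat:
        if atom == "{":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif atom == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            stack[-1].append(atom)
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return root

print(flatten([10, [1, 2, 3], 20]))                # [10, '{', 1, 2, 3, '}', 20]
print(unflatten([10, "{", 1, 2, 3, "}", 20]))      # [10, [1, 2, 3], 20]
```

One consequence of this design is that a nested list containing a literal { or } symbol cannot survive a round trip, so some convention for such symbols would still have to be agreed on.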

porres commented 1 year ago

I'd need to see that in action so I get it better, but it sounds nice :)

umlaeute commented 1 year ago

i was going to say, that [text] allows you to enter $1 and get a literal symbol "$1".

but it seems it is much more complicated and convoluted: you do get an A_DOLLAR (resp A_DOLLSYM) and sometimes warnings.

i wonder how you could use such an A_DOLLAR retrieved from [text].

umlaeute commented 1 year ago

conversely, you cannot receive $1 via [netreceive]. that looks broken to me as well.

if nobody can come up with a use case for [text] outputting A_DOLLAR, probably [netreceive] and [text] should just output literals containing $ (for my curly-suggestion that would also mean that they would output literals containing {})

Spacechild1 commented 1 year ago

i wonder how you could use such an A_DOLLAR retrieved from [text].

Dollars/dollarsyms in [text] have special meaning, they can be set dynamically with the [text sequence] object:

[image]

you do get an A_DOLLAR (resp A_DOLLSYM) and sometimes warnings.

That's a bug! If lines with dollars/dollarsyms are obtained with [text get], the dollars/dollarsyms should be bashed to symbols.

umlaeute commented 1 year ago

ah yes. thx.

Spacechild1 commented 1 year ago

conversely, you cannot receive $1 via [netreceive]. that looks broken to me as well.

What is broken about it? Currently you get a warning (netreceive: got dollar sign in message). Another option would be to bash dollars/dollarsym to symbols. Both seem equally fine to me. What else could [netreceive] do with actual dollars/dollarsyms? Replace them with the current canvas arguments? Maybe, but I don't really see a potential use case for this.

umlaeute commented 1 year ago

i was thinking in the context of [text get] which outputs the $args (in a somewhat buggy way).

i agree that just refusing dollar signs in messages is better than what [text get] currently does. but if the correct behaviour of [text get] is to bash the dollarg to a normal symbol, then [netreceive] (and [fudiparse]) should probably do that as well...
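
For illustration, "bashing" dollar atoms to normal symbols could look roughly like the Python sketch below. It assumes atoms are modelled as (type, value) pairs, with A_DOLLAR carrying an argument index and A_DOLLSYM carrying the symbol text; bash_dollars is a hypothetical helper, not an existing Pd function.

```python
def bash_dollars(atoms):
    """Turn dollar atoms into plain symbols; leave all other atoms untouched."""
    out = []
    for kind, value in atoms:
        if kind == "A_DOLLAR":
            out.append(("A_SYMBOL", f"${value}"))   # index 1 -> symbol "$1"
        elif kind == "A_DOLLSYM":
            out.append(("A_SYMBOL", value))         # keep the text as-is
        else:
            out.append((kind, value))
    return out

line = [("A_SYMBOL", "set"), ("A_DOLLAR", 1), ("A_DOLLSYM", "foo-$2")]
print(bash_dollars(line))
# [('A_SYMBOL', 'set'), ('A_SYMBOL', '$1'), ('A_SYMBOL', 'foo-$2')]
```

If [text get], [netreceive] and [fudiparse] all applied the same conversion, they would output literals containing $ consistently.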