metaeducation / rebol-issues


to-tag and embedded strings in {} #585

Closed rebolbot closed 8 years ago

rebolbot commented 15 years ago

Submitted by: Sunanda

I just mention this in case it uncovers some string parsing problems.

This is treated as a malformed tag -- i.e. a lone double-quote character within braces:

bc.. >> <{"}>
** Syntax error: Invalid "tag" -- "<"
** Near: (line 1) <{"}>

p.

However, it is possible to create that precise tag via other routes:

bc.. >> next <"{"}>
== <{"}>

>> to-tag {{"}}
== <{"}>

p.

That suggests to me that the console parsing is different to whatever goes on internally.

This issue is the same in R2.

CC - Data [ Version: alpha 32 Type: Bug Platform: All Category: Syntax Reproduce: Always Fixed-in:none ]

rebolbot commented 15 years ago

Submitted by: Sunanda

Brian, if you are looking at tightening the rules for tags, please note this issue too (true in R2 and R3):

bc.. ;; A tag can't start with a space -- that's probably
;; the correct behavior:

>> < bad>
** Syntax Error: Invalid word -- bad>
** Near: (line 1) < bad>

p.

bc.. ;; But we can create such a tag in several ways:

>> reverse <dab >
== < bad>

>> head insert <good> " bad"
== < badgood>

p.

bc.. ;; Yet we can't do anything much with it -- it
;; won't survive load+mold:

>> load mold/all reverse <dab >
** Syntax Error: Invalid word -- bad>
** Near: (line 1) < bad>

p.

Following HTML/XML rules more closely would be better and less error-prone.

rebolbot commented 13 years ago

Submitted by: BrianH

My initial comment was completely wrong.

"That suggests to me that the console parsing is different to whatever goes on internally."
>> mold/all next <"{"}>
== {#[tag! {"{"}} 2]}

Internally, tags are just strings, and there are no angle brackets. Like other string types you can have offset references, and there aren't really any restrictions on the string contents. Unlike some string types, tags don't have any support for escaping characters aside from quoting them in double-quotes, and no support for escaping double-quotes themselves. There is no alternate syntax for specifying strings with { } in tags, nor should there be; { and } are just treated like other characters. This means that you can construct some tags in memory that can't be loaded when specified in syntax. This is not a bug, just a limitation. As with similar limitations with many other datatypes, you can get around it with MOLD/all.
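To make the distinction concrete, here is a toy model in Python rather than REBOL. It is not how REBOL is implemented: the `Tag` class, `load_tag`, and `round_trips` are hypothetical names, and the scanner rule (a tag literal ends at the first `>` outside a double-quoted run, with no escapes and no special meaning for braces) is an assumption based on the description above. The point is only that the in-memory value accepts any characters, while the literal syntax accepts fewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Toy model of a tag: raw characters plus a 0-based offset (hypothetical)."""
    data: str
    index: int = 0

    def next(self) -> "Tag":
        # Like NEXT: same underlying characters, offset advanced by one.
        return Tag(self.data, min(self.index + 1, len(self.data)))

    def mold(self) -> str:
        # Plain MOLD-style rendering: the visible part wrapped in angle brackets.
        return "<" + self.data[self.index:] + ">"


def load_tag(source: str) -> Tag:
    """Assumed literal rule: '<', a non-space first character, then everything
    up to the first '>' that is outside a double-quoted run."""
    if len(source) < 2 or source[0] != "<" or source[1].isspace():
        raise ValueError("invalid tag: " + source)
    in_quotes = False
    for pos in range(1, len(source)):
        ch = source[pos]
        if ch == '"':
            in_quotes = not in_quotes  # no escapes: every '"' toggles
        elif ch == ">" and not in_quotes:
            if pos != len(source) - 1:
                raise ValueError("trailing text after tag: " + source)
            return Tag(source[1:pos])
    raise ValueError("invalid tag: " + source)


def round_trips(tag: Tag) -> bool:
    """True if the plain molded form can be scanned back as a literal."""
    try:
        return load_tag(tag.mold()).mold() == tag.mold()
    except ValueError:
        return False


quoted = load_tag('<"{"}>')          # loads: the '{' sits inside a quoted run
print(quoted.next().mold())          # <{"}>
print(round_trips(quoted.next()))    # False: the lone '"' never closes
print(round_trips(Tag(" bad")))      # False: literal cannot start with a space
```

In this model the offset tag built with `next` prints as `<{"}>` but cannot be scanned back, which mirrors the console behavior reported in this ticket.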

Changing the tag syntax to match HTML/XML rules would be a bad idea because tags aren't necessarily HTML or XML. Many people use REBOL to generate code for other (more-limited) programming languages, and many of these languages use tag-like syntax that doesn't match the HTML or XML tag syntax, such as <? ?> or even doctypes. Limiting ourselves to HTML/XML tag syntax would limit REBOL's capabilities.

However, it might be a good idea to borrow one idea from HTML/XML and also allow the inner strings to be specified with single quotes, with the double quote not treated specially inside. This would deal with the double-quote escaping problem in the same way that other languages that use tag-like syntax do. This request would be a good subject for another ticket (#1873).
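A rough sketch of what that proposal could mean for a scanner, again as a Python model and not REBOL's actual code. The function name `load_tag_two_quotes` is hypothetical, and the rule is an assumption: either quote character opens a run that only the same character closes, so the other quote is plain text inside it.

```python
def load_tag_two_quotes(source: str) -> str:
    """Return the inner text of a tag literal, or raise ValueError.

    Assumed rule: ' or " opens a quoted run that ends at the same character;
    a '>' inside a quoted run does not end the tag. No escapes."""
    if len(source) < 2 or source[0] != "<" or source[1].isspace():
        raise ValueError("invalid tag: " + source)
    open_quote = None
    for pos in range(1, len(source)):
        ch = source[pos]
        if open_quote is not None:
            if ch == open_quote:
                open_quote = None      # matching quote closes the run
        elif ch in "'\"":
            open_quote = ch            # start a run with this quote character
        elif ch == ">":
            if pos != len(source) - 1:
                raise ValueError("trailing text after tag: " + source)
            return source[1:pos]
    raise ValueError("invalid tag: " + source)


print(load_tag_two_quotes("<a b='\"'>"))   # a b='"'
print(load_tag_two_quotes("<a b=\"'\">"))  # a b="'"
```

Under this assumed rule a double quote can be written inside single quotes and vice versa, but a lone quote such as the one in `<{"}>` would still be rejected.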

This ticket though is not really a bug, and should be dismissed. (This is the consensus of a long discussion on AltME about this subject.)