metaeducation / rebol-issues

Make SERIES/#INDEX path forms act as PICKZ and POKEZ on the value in INDEX #2182

Open rebolbot opened 10 years ago

rebolbot commented 10 years ago

Submitted by: fork

Different ANY-WORD! types behave differently when used in a path. If the series is a FILE! or a URL!, the word is appended, with the structural path separator turned into a literal "slash" character in the data as well:

>> foo: %hello

>> result: foo/WorlD

>> type? result
== file!

>> probe result
== %hello/WorlD

(Note: for the curious, if it were an ISSUE!, the ISSUE!-ness is discarded currently:

>> foo/#WorlD
== %hello/world

...which seems pretty random to me.)

This perhaps surprising behavior is mentioned in #2178, and it leads to a large philosophical question about the spelling of words...and their case-sensitivity..."leaking" in this way. Philosophically I feel I'm close to a sort of resolution about it; though the implications are fairly large.

Block types use the word to SELECT, where a WORD will match an ANY-WORD! type with the same spelling:

>> data: [foo bar: #baz :mumble 'frotz]

>> data/foo
== bar:

>> data/bar
== #baz

>> data/baz
== :mumble

>> data/mumble
== 'frotz

This is notably not true when you use an ISSUE! word. It will only consider a key match if it finds an issue:

>> data/#baz
== :mumble

>> data/#foo
== none

(Note: That's odd in particular because select acts differently:

>> select data #baz 
== :mumble

>> select data #foo 
== bar:

...this hints that maybe a SELECT/STRICT is necessary?)

But in a sense, when paths are evaluated and an ISSUE! is in the slot, that behavior is kind of a "waste". We do not see GET-WORD! wasted in this way:

>> something: 'baz
== baz

>> data/:something
== :mumble ;-- something is dereferenced to get its value, then data/baz is looked up

Rebol 3 made the decision that ISSUE! should be an ANY-WORD! instead of an ANY-STRING! as it was in Rebol2, because another word type was needed more than another string type, and because of how it was typically being used in practice. That opens up a potentially much more interesting application of ISSUE! in path processing.

What if #something first did an evaluation just as :something would, and if SOMETHING came back as an integer, it were interpreted using the zero-based continuous interpretation described in #613 as part of the "great indexing compromise"?

>> data: [foo bar: #baz :mumble 'frotz]

>> data: next next data
== [#baz :mumble 'frotz]

>> index: 1

>> data/:index  ;-- equivalent to data/(index) in all examples
== #baz

>> data/#index
== :mumble

>> index: 0

>> data/:index
** Error zero is not a valid index when using one-based indexing

>> data/#index
== #baz

>> index: -1

>> data/:index
== bar:

>> data/#index
== bar:
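To make the two interpretations above concrete, here is a small Python model of them. It is only a sketch of the semantics proposed in this issue, not Rebol code and not how any interpreter implements paths: the names `pick_one_based` and `pick_zero_based` are hypothetical, the series is modeled as a list of strings plus a position, and returning `None` for out-of-range picks is an assumption mirroring Rebol's `none`.

```python
# Hypothetical model: a series is a list of values plus a current position.
DATA = ["foo", "bar:", "#baz", ":mumble", "'frotz"]
POSITION = 2  # the effect of `next next data`


def _at(values, absolute):
    """Return the value at an absolute offset, or None when out of range (assumed)."""
    return values[absolute] if 0 <= absolute < len(values) else None


def pick_one_based(values, position, index):
    """Model of data/:index: one-based from the current position, zero is an error."""
    if index == 0:
        raise IndexError("zero is not a valid index when using one-based indexing")
    # 1 is the current item; -1 is the item just before it (there is no zero).
    offset = index - 1 if index > 0 else index
    return _at(values, position + offset)


def pick_zero_based(values, position, index):
    """Model of the proposed data/#index: zero-based and continuous through zero."""
    return _at(values, position + index)


for index in (1, -1):
    print(index, pick_one_based(DATA, POSITION, index), pick_zero_based(DATA, POSITION, index))
print(0, pick_zero_based(DATA, POSITION, 0))
```

The two schemes agree for negative indices and differ by one for positive indices, which is why only zero needs special treatment in the one-based form.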

This does not solve all the cases in which you would want to do zero-based indexing, such as when more complex formulas are involved. Yet I suspect it could cover many of the most common cases. One can consider getting "wacky" to try and enable a paren-equivalent somehow...but the weirdness of things like:

     data/#/(your + formula - here)

...is probably unwarranted when you could write:

     (tmp: your + formula - here) data/#tmp

Just something to consider in the "simplicity is king" mindset.

Such a thing may relegate PICKZ and POKEZ to being used rarely enough to give them more systematic names like PICK/ZERO and POKE/ZERO. That would leave abbreviations to those who want them in some "standard abbreviations" include file.

I should point out that there is a parser issue under the current system. Colons are currently legal in words even when construction syntax is not used.

>> foo: quote data/#index:
== data/#index:

>> type? second foo
== issue!

>> probe second foo
#index:

There are some good reasons to not allow colons in arbitrary words anyway. Namely, because you're not getting what you think you are...

>> a:b: 10

>> print a:b
a:b

>> type? a:b
== url!

So I think a general argument of "you really should use construction syntax to get words with internal colons, of any word type" could be easily made.

This also raises the question of whether LIT-WORD! or SET-WORD! are being used effectively. Might they be applied intelligently? How about LIT-WORD! being used for the "strict" matching in select, with dereference?

>> data: [foo bar: #baz :mumble 'frotz]

>> something: quote :mumble

>> data/'something
== 'frotz

>> something: quote mumble:

>> data/'something
== none
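The difference between the loose matching SELECT does today and the "strict" matching suggested here can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Rebol and not an actual implementation: `Word` and `select_after` are hypothetical names, each word is modeled as a (kind, spelling) pair, and spelling stays case-insensitive in both modes.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical stand-in for an ANY-WORD! value."""
    kind: str      # "word", "set-word", "get-word", "lit-word" or "issue"
    spelling: str


def select_after(block, key, strict=False):
    """Return the value following the first match of key, or None.

    Loose mode matches on spelling alone; strict mode also requires the
    same word kind. Spelling is compared case-insensitively in both modes.
    """
    for i, item in enumerate(block):
        same_spelling = item.spelling.lower() == key.spelling.lower()
        same_kind = item.kind == key.kind
        if same_spelling and (same_kind or not strict):
            return block[i + 1] if i + 1 < len(block) else None
    return None


# Model of: data: [foo bar: #baz :mumble 'frotz]
DATA = [
    Word("word", "foo"),
    Word("set-word", "bar"),
    Word("issue", "baz"),
    Word("get-word", "mumble"),
    Word("lit-word", "frotz"),
]

print(select_after(DATA, Word("issue", "foo")))                    # loose
print(select_after(DATA, Word("issue", "foo"), strict=True))       # strict
print(select_after(DATA, Word("get-word", "mumble"), strict=True))
```

In this model, the loose mode corresponds to the `select data #foo` result shown earlier, and the strict mode corresponds to both the current `data/#foo` behavior and the proposed `data/'something` lookup.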

If something were a string, maybe it would be case-sensitive? I don't know about that, though. Because then, what if you had a word type and you wanted it to be looked up and matching the precise word subclass, and be case-sensitive? Probably best not to mix it up too much in the world where FOO == fOo is true.

Parser-wise we know set-word isn't a good fit for the end of a path, because of foo/baz/bar:: ... but could-or-should it have meaning if used earlier? What might foo/baz:/bar mean, or foo:/baz/bar? That one I don't have any suggestions on TBH. But it might be worth leaving the parser open-minded about it in case someone has an idea.

CC - Data [ Version: r3 master Type: Issue Platform: All Category: Unspecified Reproduce: Always Fixed-in:none ]

rebolbot commented 10 years ago

Submitted by: fork

Note that a central purpose here is to resist seeing Red/System switch to zero-based indexing. That switch is being seriously considered, while leaving Red itself to use the "natural indexing" of Rebol2. @earl even wants to switch both to a consistent scheme different from Rebol's:

https://github.com/red/Red/issues/264

If Red/System drifts too far from Red, then it starts becoming too much another language and the "fun" of using a "uniform substrate" is chipped away. It seems that if zero-based indexing were more painless across the board, that would be a win-win and would address the need people were expressing in Rebol before a systems-level dialect existed. A compromise of this form, used to extend the existing "indexing compromise", or a line of thinking built out from it, may handle the majority of cases without breaking anything.

I don't know if there is a problem with this proposal that I'm not seeing. But to me it seems like the kind of thinking that is needed to avoid future pain and a missed opportunity to keep coherence. If every time you switch from Red to Red/System you have to rethink your indexing, that seems like a "bad thing(tm)". And if you could zero-index more easily in Rebol, Red, and Red/System, that seems like a "good thing(tm)".

ladislav commented 8 years ago

I am for zero-based indexing, but this may not just add new functionality; it may also destroy existing functionality.