oils-for-unix / oils

Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
http://www.oilshell.org/

ysh breaking: Replace 1 .. 5 range syntax with 1 ..< 5 half open and 1 ..= 5 closed range #2096

Open bar-g opened 5 days ago

bar-g commented 5 days ago

The implicit, not fully consistent, and asymmetrical inclusive/exclusive behavior of .. ranges (and slices) seems to be just asking for bugs.

The syntax ambiguity of .. is problematic:

ysh ysh-0.23.0$ for s in {1..2}; do echo $s; done
1
2
ysh ysh-0.23.0$ for s in (1..2) { echo $s }
1
ysh ysh-0.23.0$ 

After reading: https://www.reddit.com/r/ProgrammingLanguages/comments/x8xtou/asking_for_opinions_on_the_best_way_to_specify_an/

Maybe a solution like the following could work in ysh expressions, as a consistent compromise that still keeps the common cases short enough. (The usual spoken meaning could perhaps be taken as inclusive on the first (left) boundary given.)

proposed ysh expression syntax   description
..                               forbidden (to produce a helpful error message pointing to the unambiguous syntax)
..=                              inclusive / inclusive (useful to clearly express ranges/selections)
..<                              inclusive / exclusive (half-open interval, often useful in repetitive algorithms)
<..=                             exclusive / inclusive
<..<                             exclusive / exclusive
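
To make the four proposed forms concrete, here is a minimal Python sketch of what each one would produce for integers. The helper `int_range` and its keyword names are hypothetical, chosen only for illustration; this is not YSH code and not how Oils implements ranges.

```python
def int_range(start, stop, *, left_closed=True, right_closed=False):
    """Return the integers between start and stop with explicit endpoints.

    Hypothetical helper: each endpoint is included only when the matching
    flag is True, so nothing about the interval is left implicit.
    """
    lo = start if left_closed else start + 1
    hi = stop + 1 if right_closed else stop
    return range(lo, hi)


# The four proposed spellings for the endpoints 1 and 5:
closed = list(int_range(1, 5, right_closed=True))                        # 1 ..= 5
half_open = list(int_range(1, 5))                                        # 1 ..< 5
left_open = list(int_range(1, 5, left_closed=False, right_closed=True))  # 1 <..= 5
fully_open = list(int_range(1, 5, left_closed=False))                    # 1 <..< 5

print(closed)      # [1, 2, 3, 4, 5]
print(half_open)   # [1, 2, 3, 4]
print(left_open)   # [2, 3, 4, 5]
print(fully_open)  # [2, 3, 4]
```
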
andychu commented 5 days ago

I have thought about the ..< and ..= change

Or .. and ..=

I'm not sure it's inconsistent though, I think 3 .. 5 means [3,4] in every language, or at least all languages that are zero-based. (Lua and R are rare 1-based languages, I think Julia too)

It's also consistent with slicing - a[3:5] or a.slice(3,5) -- these are known as half-open intervals and make most algorithms work. The other behavior is not as useful, but it could exist
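
The properties that make half-open intervals convenient can be checked with plain Python slices, which use the same convention. This is only an illustrative sketch; the variable names are arbitrary.

```python
items = list("abcdefgh")
lo, mid, hi = 2, 5, 7

# The length of a half-open slice is simply hi - lo.
assert len(items[lo:hi]) == hi - lo

# Adjacent half-open slices share a boundary and tile without overlap or gap.
assert items[lo:mid] + items[mid:hi] == items[lo:hi]

# An empty interval needs no special case: lo == hi.
assert items[lo:lo] == []

# range() follows the same convention as slicing.
assert list(range(3, 5)) == [3, 4]
```

With closed intervals, the same three properties each need a `+ 1` or `- 1` somewhere, which is where off-by-one mistakes tend to creep in.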

bar-g commented 4 days ago

> Or .. and ..=

With this combination, reading a .. in code would not clearly tell everybody what it means, though. And worse, using .. would still be inconsistent with the inclusive by default spoken language.

> think 3 .. 5 means [3,4]

Well, that's exactly the problem: assumed meanings, off by one (value and/or background).

Half-open intervals seem useful, e.g. for iterating over them (like slicing a whole loaf of bread), whereas specifying a range or slice in math or spoken language generally seems to rely on symmetric definitions by default.

Half-open intervals seem to have come about as a shortcut in (overly) programming-centered programming languages, which then came to hold that 'one to two' is 'one' (instead of 'one and two') and 'one to one' is 'none' (instead of 'one').

In general, "implicitly half-open by default" seems error-prone and easily misleading. (Especially for shell users, who can't be assumed to be full-time programmers.)

So, a possible explicit solution maybe?: 3..<4 and a.slice(3,<4)

bar-g commented 3 days ago

> I'm not sure it's inconsistent though, I think 3 .. 5 means [3,4]

With your feedback I've now reworked the original issue summary and added a list of inconsistencies. Please have a look. I think using ..< in ysh for its current default behavior (which differs from .. in osh) could clear up all inconsistencies (besides allowing other consistent syntax to be added).

andychu commented 3 days ago

Yeah I have noticed the inconsistency between shell's {1..2} and YSH 1 .. 2

(I think shell is a little broken because {1..$n} doesn't work. That construct is very limited as a result, and I almost never use it.)

But yes I can see that from a shell user's POV, this could be confusing/inconsistent. From a Python/JS POV, it's not

It is a valid point

andychu commented 3 days ago

Hm actually I discovered Ruby is consistent with shell, and inconsistent with Rust/JS/Python:

$ ruby -e '(3..5).each do |x| puts x end'
3
4
5

So yeah this is probably a good reason to change it

I think a stricter version of Swift and Rust would be good, with ..< and ..=

https://doc.rust-lang.org/reference/expressions/range-expr.html

We can force you to choose it, not provide .. and ..., just ..< and ..=
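
As a rough illustration of the "force you to choose" idea, the following Python sketch accepts only `..<` and `..=` between two integer literals and rejects `..` and `...` with a message that points to the unambiguous spellings. The function name `parse_range`, the regular expression, and the error text are all assumptions made for this example; they are not taken from the Oils parser or its OILS-ERR hints.

```python
import re

# Hypothetical mini-grammar: INT  OP  INT, where OP is one of ..<  ..=  ...  ..
# Longer operators come first in the alternation so they win over plain "..".
_RANGE_RE = re.compile(r"^\s*(-?\d+)\s*(\.\.<|\.\.=|\.\.\.|\.\.)\s*(-?\d+)\s*$")


def parse_range(text):
    """Parse 'A ..< B' or 'A ..= B' into a Python range; reject '..' and '...'."""
    m = _RANGE_RE.match(text)
    if m is None:
        raise ValueError(f"not a range expression: {text!r}")
    start, op, stop = int(m.group(1)), m.group(2), int(m.group(3))
    if op == "..<":
        return range(start, stop)      # half-open: stop is excluded
    if op == "..=":
        return range(start, stop + 1)  # closed: stop is included
    # Plain '..' or '...' is ambiguous across languages, so refuse it.
    raise ValueError(
        f"ambiguous {op!r}: write '{start} ..< {stop}' (half-open) "
        f"or '{start} ..= {stop}' (closed)"
    )


print(list(parse_range("1 ..< 5")))  # [1, 2, 3, 4]
print(list(parse_range("1 ..= 5")))  # [1, 2, 3, 4, 5]
```

Calling `parse_range("1 .. 5")` raises `ValueError` instead of silently picking one of the two meanings.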


We have some ruby-like features, so it is probably good not to "annoy" people used to ruby (and shell of course)

bar-g commented 3 days ago

Yes, not to "annoy" non-shell users is just as important.

I think one idea already came up when I worked out a cleanup path for the syntax and interaction around mutability/aliasing that Python users are accustomed to (https://github.com/oils-for-unix/oils/issues/1831): having some strict_* options for ysh features that users coming from Python etc. may explicitly turn off (when jumping back and forth between languages, or to share code?). That is, options for cleaning up warts, gotchas, and ambiguities that are also common and known in other languages.

If there really is a need for an option like strict_ranges here, I guess it could still be enabled by default in ysh. Because ysh could give helpful error messages when seeing an ambiguous .. or ..., and that would foster cleaner ysh code in general.

andychu commented 3 days ago

Well I am leaning toward just having ..< and ..=, and that's it

So there is no ambiguity for anyone, and no need for any options (which are necessary sometimes, but cause complexity)

That seems like the simplest approach

bar-g commented 3 days ago

Yes, I agree, no need for any strict_ option when .. gives an error here.

andychu commented 3 days ago

@PossiblyAShrub likes this, so we should do it

It will be a breaking change, but we'll have an OILS-ERR hint for ..

bar-g commented 1 day ago

BTW: I stumbled over the .. range problem when comparing Oils' looping performance in solving the "send+more=money" puzzle. (Now a gist: https://gist.github.com/bar-g/3f7054a0da87621ec16baed1aa3bd661; running it also requires Nim and Python to be installed.)

Top results: compiled Nim code took half a second here, Python 8 sec, and interpreted Nim (NimScript) 18 sec, while YSH took 1 min 18 sec and Bash over 4 min.

I wonder why ysh is so much slower than python.

Judging from the 2024 blog overview (https://www.oilshell.org/blog/2024/09/project-overview.html), the only things Nim seems to be missing are a command-language feature and procs that represent external interfaces?

running nimversion...
9567 + 1085 = 10652

real    0m0,548s
user    0m0,543s
sys 0m0,004s

running pythonversion...
9567 + 1085 = 10652

real    0m7,766s
user    0m7,667s
sys 0m0,084s

running nimscriptversion...
Nimscript *interpreter* is runnig in 'Verbose' mode.
9567 + 1085 = 10652

real    0m18,151s
user    0m18,106s
sys 0m0,032s

running yshversion...
9567 + 1085 = 10652

real    1m18,438s
user    1m18,354s
sys 0m0,004s

running yshversion filtering solutions with subshell-pipe-commands...
9567 + 1085 = 10652

real    1m20,352s
user    1m20,378s
sys 0m0,429s

running bashversion...
9567 + 1085 = 10652

real    4m4,753s
user    3m58,791s
sys 0m5,638s

running bashversion filtering solutions with subshell-pipe-commands...
9567 + 1085 = 10652

real    4m7,178s
user    4m1,408s
sys 0m5,902s
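
For readers unfamiliar with the puzzle: each letter in SEND + MORE = MONEY stands for a distinct digit, and no number may start with zero. The Python sketch below is one possible brute-force solver, written only to show the kind of tight loop being timed. It is an assumption for illustration and not the code from the gist, so its running time is not comparable to the figures above.

```python
from itertools import permutations


def solve_send_more_money():
    """Brute-force SEND + MORE = MONEY; return a list of (send, more, money)."""
    solutions = []
    # Pick distinct digits for seven letters; Y is derived from the sum.
    for s, e, n, d, m, o, r in permutations(range(10), 7):
        if s == 0 or m == 0:  # no leading zeros
            continue
        send = 1000 * s + 100 * e + 10 * n + d
        more = 1000 * m + 100 * o + 10 * r + e
        money = send + more
        y = money % 10
        if y in (s, e, n, d, m, o, r):  # Y must be a new digit
            continue
        # The digits of the sum must spell M, O, N, E, Y.
        if money == 10000 * m + 1000 * o + 100 * n + 10 * e + y:
            solutions.append((send, more, money))
    return solutions


if __name__ == "__main__":
    for send, more, money in solve_send_more_money():
        print(f"{send} + {more} = {money}")
```
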
bar-g commented 1 day ago

> It will be a breaking change, but we'll have an OILS-ERR hint for ..

Hm, I noticed you say "hint". Since only minimal error messages are printed, it might actually make good sense to refer to the optional texts as "OILS-HINT-xxx" [which would also make it much more natural to ship them in a separate -doc tarball/package (https://github.com/oils-for-unix/oils/issues/2095)].

andychu commented 2 hours ago

This is now done -- @PossiblyAShrub agreed with this feedback :-)

Will be out for the next release


I think some of them could be OILS-HINT, but it's not worth having 2 things, or worth changing now

The googling for OILS-ERR already works -- we tested it :)