Closed primo-ppcg closed 1 year ago
@narimiran: say what you want, but the Janet folks are pretty reasonable at times, eh? :)
@narimiran: say what you want, but the Janet folks are pretty reasonable at times, eh? :)
Hey there, long time no see :) Believe it or not: when I saw your name in the contributors' list, that's what finally convinced me I should give Janet a try.
Hopefully this change doesn't break too much code in the wild, but IMO it is well worth changing this.
While I don't personally mind rewriting my own code (if I have anything that is affected), if there is rewritten code, is there not an issue of how to handle running with older versions of Janet vs newer versions?
I suppose one might provide multiple versions of one's program that are tailored to different versions of Janet. Alternatively, one might also have code that examines janet/version
and acts accordingly.
Another option might be to avoid using negative indeces completely for a while.
Hey there, long time no see :) Believe it or not: when I saw your name in the contributors' list, that's what finally convinced me I should give Janet a try.
Yeah, as if I have a good taste with these kind of things. The last language I invested in ended up a total dumpster fire, remember.
Another option might be to avoid using negative indeces completely for a while.
I don't hate this. I expect this feature is not used often, as it seems there were no complaints about it earler, so my guess is that this would not affect too many code bases. I know I'm a bit of a whiner myself, and I would have spoken out if I would have needed to use -2
for the last index.
I agree that -2
is just a bit too weird, and I'd rather see this "fixed" sooner than later.
I use -2 and other negative indeces in some of my pegs with slice, like here and here.
But this will not have to be changed.
You can still use (slice $& -4 -2)
and it will do the same thing as before, capturing the same two elements. The only thing changed in slice
is its documentation: when using negative indices, the range is now explained as open-closed, making the index of the last element -1 (not closed-open, as before, with the last element as -2).
Before:
[10 20 30 40 50]
-6 -5 -4 -3 -2
[ ]
After:
[10 20 30 40 50]
-5 -4 -3 -2 -1
[ ]
The inability to remove stuff from the end of the array and the lack of symmetry is definitely a problem of a bad interface, I support this change, at least for array/insert
.
As far as making slices using -1 as the end of the array instead of the -2 issues, I agree that it is confusing and unfortunate (along with os/date changing indices to be 0-indexed), but I also believe that constantly changing interfaces that really don't have strong mathematical reasons behind and are more about "hygine" or "clean code" just causes churn.
On the other hand, if one were to make this change, sooner is always better than later to just "rip the bandaid off."
Perhaps such improvements could be collected and bundled together into one large backwards-incompatible patch?
I would recommend having a "2.0" branch for janet where all such changes can be accumulated until it's ready for release. It should be possible to rebase it on every point release, meaning people could "opt-in" early as well.
I would recommend having a "2.0" branch for janet where all such changes can be accumulated until it's ready for release. It should be possible to rebase it on every point release, meaning people could "opt-in" early as well.
Not a bad idea, but I think the more interesting question is how much should we consider changing in the first place. Lots of small changes could be made, but there are also larger changes that might be overall benefical but cause even more churn.
My gut tells me that changing slice semantics will break a lot more code that array/insert
and array/remove
, since for the usecase of inserting at the end of the array, array/push
and array/pop
is more natural. The indexing scheme where -1 refers to imaginary "element after the last one" also affects buffers and strings as well.
My gut tells me that changing slice semantics will break a lot more code that
array/insert
andarray/remove
, since for the usecase of inserting at the end of the array,array/push
andarray/pop
is more natural. The indexing scheme where -1 refers to imaginary "element after the last one" also affects buffers and strings as well.
I think there may be a bit of confusion. I haven't proposed changing the behaviors of array/slice
or array/insert
. I believe that they were already correct, but perhaps for the wrong reason. The current behavior has been explained differently, in terms of -1 as the final array index.
As an example, the following code snippets should be equivalent:
(var a @[:a :b :c :d])
(set a (reverse a))
(array/insert a 0 :e)
(set a (reverse a))
(var a @[:a :b :c :d])
(array/insert a -1 :e)
And they are, currently. However the reason given is that -1 is an index "outside" the array, and array/insert
inserts before that index. What I'm proposing is that this is not as correct as saying that negative indices view the array in reverse, such that inserting at a negative index would actually insert after that index, precisely as the first snippet does. I have likewise changed the description of array/slice
with that understanding.
The only function that I have proposed changing is array/remove
. Had array/insert
and array/remove
been consistent, I would not have recommended a change, despite the final index of -2 being somewhat weird, and I do believe unprecedented.
Negative indexing is a weird python-ism that can seem ubiquitous in scripting languages, but it comes with it's own can of worms, and a lot of languages just don't bother. I don't know if it comes from Python originally but it really is more of a hack than any sort of elegant construction, regardless of the particulars.
Also, the proposed explanation of array/insert and array/remove actually makes less sense to me now. After a call to array/insert, the array is one longer, the same negative index would return a different value. In that sense, the current implementation is more consistent.
EDIT: not to say that change is bad, just to say that the particular argument I don't think is very convincing.
I would like to point out that -1 mod n = n-1, making it the last element of an array. Like like, on a ring, -2 is the second to last, and so on. So, from a mathematical standpoint, at least to me, it makes sense. Also, adding my two cents to something related, one of my issues with 1-indexed arrays is how they interact with modular arithmetic. (Looking at you Julia!)
Getting slightly off topic here, I did a little experiment with redoing negative indexing for slices here: https://github.com/janet-lang/janet/compare/master...negative-indexing-redo
It basically breaks almost all reasonably sized projects I can find written in Janet (not surprisingly).
I did a little experiment with redoing negative indexing for slices here: master...negative-indexing-redo
Of the functions that refer to negative indexing, which includes:
array/insert
array/remove
array/slice
buffer/blit
buffer/slice
keyword/slice
slice
string/slice
symbol/slice
tuple/slice
the implementations of all of these are already correct and do not need to be changed, with exception to array/remove
. And surely if they hadn't been correct, it would have been mentioned long before now. This PR is Ready to Merge; no further changes are required. If it hadn't been, I wouldn't have marked it as such.
Also, the proposed explanation of array/insert and array/remove actually makes less sense to me now.
I would like you to imagine a bijective one-dimensional coordinate space translation, x ↦ x', such that 0 ↦ -1, 1 ↦ -2, 2 ↦ -3, etc., and that parallel functions are defined which reference array indices according to this translation. When using these functions, arrays would be indexed in the following way:
#x' -1 -2 -3 -4 -5
@[ :a :b :c :d :e ]
When inserting an item at index -2, the item would be inserted at index -2, shifting the contents and everything after by one cell. The position of -2 will not have changed. If the contents at -2 are examined, it would be precisely what was inserted. Removing it again would result in the original array.
When indexing from the end of an array, this same bijection is used, except that the items of the array are enumerated in reverse order:
# 0 1 2 3 4
@[ :a :b :c :d :e ]
# -1 -2 -3 -4 -5
@[ :e :d :c :b :a ]
such that inserting at -2 would insert between :e
and :d
, such that slicing from -2 to -4 would contain :d
and :c
, but not :b
, and such that removing at -2 would remove :d
.
Note that when referencing a range of cells, as slicing does, or a gap between cells, as inserting does, the behavior is precisely the same as always viewing the array from the front, and pretending that the final index is -2. Janet already has a correct implementation for these; nothing needs to change.
It only makes a difference when referring to a cell location directly, which currently only applies to array/remove
, although fetching from a negative index for indexed or bytes types could theoretically be allowed.
Getting slightly off topic here, I did a little experiment with redoing negative indexing for slices here: https://github.com/janet-lang/janet/compare/master...negative-indexing-redo
You changed things that don't need change and shouldn't be changed, i.e. you've changed their behavior and introduced several off-by-one bugs.
@primo-ppcg This makes sense.
Just to be clear, I will not redo negative indexing for slicing, I just wanted to have a more concrete understanding of how much it was actually used
So may be the following code in mendoza is affected?
(defn relative-url
"Express a url relative to (dyn :url). The given url should
be absolute from the site root, like /index.html."
[url]
(def b-parts (string/split "/" (dyn :url)))
(array/remove b-parts -2) # the form on this line
(def a-parts (string/split "/" url))
(while (and (not= 0 (length a-parts))
(not= 0 (length b-parts))
(= (a-parts 0) (b-parts 0)))
(array/remove a-parts 0)
(array/remove b-parts 0))
(string (string/repeat "../" (length b-parts)) (string/join a-parts "/")))
May be this can be re-expressed in a way to not use a negative index?
I guess array/pop
might do.
I'm posting this here rather than in the mendoza repository issues because I think more interested parties (issue participants) will see it.
Related issue: #1219
The problem with array index -1: Because
array/slice
does not include its end index in the result, it is not possible to slice to the end of the array. Also, becausearray/insert
inserts before the given index, it is not possible to insert to the end of the array.There are two possible solutions to this problem:
array/slice
andarray/insert
for negative indices.Notably, the behavior of these two functions is identical for either solution.
I submit for consideration that the proposed solution is more correct and desirable. When indexing from the end of the array, the array should be viewed reversed, as if its end were its front. In other words, any action that would have occurred before the index should occur after the index, and vice versa.
Furthermore, the current solution breaks the symmetry between
array/insert
andarray/remove
:Current solution:
Proposed solution:
Inserting at an index, and then immediately removing at the same index should certainly be a no-op.