Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0

*..'foo' scenarios create surprisingly inconsistent, invalid Ranges #405

Open 2colours opened 1 year ago

2colours commented 1 year ago

This is the design-level escalation of what I believed to be a mere Rakudo bug. In order to support *..'foo' and 'foo'..* kind of Ranges as "anything up until 'foo'" and "anything starting from 'foo'", -Inf and Inf were used as the dummy/sentinel values. The elephant in the room with this is that -Inf..'foo' or 'foo'..Inf aren't even valid Ranges to be constructed directly, hence this is a fundamental contradiction of any interpretation of "surprisingly consistent".

I think this is related to the broader topic that has been touched on in https://github.com/Raku/problem-solving/issues/354 as well. I cannot re-open it but I don't think there was any need to close it in the first place...

Besides the issue I opened, actually, I implemented a supposed fix as well, based on the assumption that the Whatever star ought to be transparently +-Inf all the time, and therefore throw the same error at the very least.

However, it turns out that the "anything before", "anything after" semantics is already specced, see for example https://github.com/Raku/roast/blob/9f04c24247ceddb6220414e429f17bb2afdf1a77/S03-operators/range.t#L358 and https://github.com/Raku/roast/blob/9f04c24247ceddb6220414e429f17bb2afdf1a77/S03-operators/range-basic.t#L160. I think all of this is actually "fair enough" and respectable - however, the way it is implemented (i.e. just slap a +-Inf there as the endpoint) is neither reasonable nor specified.

Returning to the topic already touched on at https://github.com/Raku/problem-solving/issues/354, using Num values as catch-all minimum and maximum values is not type safe to begin with, but here it's not just a theoretical concern: -Inf..'foo' Range creation actually throws, 'foo'..Inf succeeds, and none of this is specified. It simply happens because the first argument decides the type. There is a whole can of worms here: 5 .. 'foo' would also fail and 'foo' .. 5 would also succeed.

Anyway, -Inf being a Num, it makes sense that -Inf..'foo' will do the same thing as 5e0..'foo' would. What really doesn't make sense is that you get to create the former Range anyway, if you use *..'foo'.
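The combinations discussed above can be tried side by side with a small script. This is only a sketch for reproducing the report: the helper name `attempt` is made up for illustration, and the outcome noted in each comment is what this thread reports, not something the script guarantees on every Rakudo version.

```raku
# Hypothetical helper: build a Range inside a guarded block and describe
# what happened, so that one failing combination does not stop the script.
sub attempt(Str $label, &build --> Str) {
    my $outcome;
    {
        $outcome = "built " ~ build().raku;
        CATCH { default { $outcome = "threw " ~ .^name } }
    }
    "$label: $outcome"
}

# Variables are used instead of literals so that every Range is
# constructed at run time.
my $word = 'foo';
my $low  = -Inf;
my $high = Inf;

say attempt('* .. $word',    { * .. $word });     # reported to build -Inf..'foo'
say attempt('$low .. $word', { $low .. $word });  # reported to throw
say attempt('$word .. *',    { $word .. * });     # reported to build 'foo'..Inf
say attempt('$word .. $high', { $word .. $high }); # reported to succeed
say attempt('5 .. $word',    { 5 .. $word });     # reported to fail
say attempt('$word .. 5',    { $word .. 5 });     # reported to succeed
```

If the first two lines print different outcomes, that is the inconsistency this issue is about: the Whatever form produces a Range that the explicit -Inf form refuses to construct.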

lizmat commented 1 year ago

So, if I get this right, you'd want * .. "foo" to actually store a Whatever type object as the start-point, and "bar" .. * to actually store a Whatever type object as the endpoint, rather than ±Inf?

2colours commented 1 year ago

Disclaimer: pretty please, don't double down on Inf being a value that gets special-cased everywhere and gets compared to everything despite definitely being a Num, only to make a point about it... I don't know if such a special case would fit into "surprisingly consistent" but it definitely wouldn't fit into a sane type system.

There are two things I think are addressable (and to be addressed) with this issue:

For the former, I think this is one more case for "magic values" like Least or Most that have a very broad type (potentially outright Mu but at least Any); that could help with min and max on empty iterables as well. These values could be coercible to -Inf and Inf respectively, in fact, I think that would only make sense, and then one wouldn't have to worry about the numeric case at all, but the Inf values would get to keep type safety.

For the latter, well, it's not news that Roast needs more tests... I still think it's unsalvageable as specification but adding tests for what shouldn't work, or under what conditions should something work, is probably beneficial either way.
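To make the Least/Most idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: neither `Least` nor `Most` exists in Raku, and `bound-cmp` is a made-up stand-in for whatever comparison a Range would use. It deliberately leaves the core `cmp` candidates untouched and only shows the intended ordering.

```raku
# Hypothetical sentinel types; nothing like these exists in core Raku.
class Least { }
class Most  { }

# Hypothetical comparator: the sentinels are checked first, and anything
# else falls through to the ordinary cmp operator.
sub bound-cmp($left, $right) {
    return Same if $left === $right && ($left === Least || $left === Most);
    return Less if $left === Least || $right === Most;
    return More if $left === Most  || $right === Least;
    $left cmp $right
}

say bound-cmp(Least, 'foo');   # Less
say bound-cmp('foo', Most);    # Less
say bound-cmp(Most, 42);       # More
say bound-cmp(3, 5);           # Less
```

The point of the sketch is that the sentinels order against a Str and an Int alike without being a Num themselves, which is the type-safety argument made above.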

lizmat commented 1 year ago

Whatever happened to a simple Yes / No. Or even start a response with that?

2colours commented 1 year ago

It wasn't a response, it was an elaborate comment that I started writing right away. The policy is to not propose solutions in issue openers, if I remember correctly.

Having said that, I hope it answers the question you posted so fast that I couldn't even keep track of it.

lizmat commented 1 year ago

I have X hours of the day to spend on Raku things.

I've come to the point that I will basically not read anything that you write because it is simply taking up too much of my time / energy.

> it was an elaborate comment

Please try to be less elaborate. Please. Pretty Please.

2colours commented 1 year ago

If you don't have time to understand the situation, please really just don't read it. I don't write them for you personally. It takes effort for me as well, and right now by having this conversation, we just pollute the issue.

raiph commented 12 months ago

> this is a fundamental contradiction of any interpretation of "surprisingly consistent".

So my interpretation doesn't count?

2colours commented 12 months ago

Go ahead.

librasteve commented 12 months ago

I think the "spec" (ie roast links) references a limited intent for Range to support Str endpoints that can be coerced to Numeric (eg. '7') as in -Inf..'7' and are not originally (or currently) intended to support lexicographic Ranges.

There is (accidentally?) a limited utility to open ended positive Ranges such as ['foo'..*][7] #'fov' but I think that this is really stretching the application of Range.

±Inf is a solid choice for Numeric use cases of Ranges (i.e. it interoperates with all Numeric types via cmp and coercions) and behaves sensibly under arithmetic operations (+-/*).

I think that the idea of an untyped Least and Most value is troubling ...

So this topic seems to be undirected wishful thinking tbh.

2colours commented 12 months ago

I think what you are saying is respectable and it makes sense on its own; I could support it in "a different world" where Raku is overall a different language. However, it feels like you are mostly arguing with the existing design itself.

> There is (accidentally?) a limited utility to open ended positive Ranges such as ['foo'..*][7] #'fov' but I think that this is really stretching the application of Range.

First off, this is clearly in Roast, whatever the reason is. Second, in the light of https://github.com/Raku/problem-solving/issues/354, I rather think this is deliberate and therefore when you argue against something like "Most" and "Least", you are even more arguing against this shallow status quo where Inf and -Inf try to play the extremal values for basically all types, even though they are Nums. "My" proposal (I'm not the first one to come up with this idea) really points in the direction you want, except less strict.

For the "pain points" you collected:

niner commented 12 months ago

FWIW my gut says that the Least/Most suggestion has a lot of merit. But then I deliberately keep out of language design discussions because they require a lot of thought and time. More so than with implementation, with design you have to try to shoot down your own idea. Try to poke holes and find the cases where it breaks and use that then to judge your design.

So @librasteve's questions are very helpful. Those are cases that need good answers. They could for example be that there is no multi candidate for infix:<+> that takes a Least/Most and thus Most + 1 won't even compile.

The Least/Most suggestion would also require that we no longer just take the left side of the range to determine the type and instead be smarter. Of course the same road could be taken even with Inf and -Inf, so we'll have to test whether this smaller change to the language would already solve the problems we identified.

2colours commented 11 months ago

> FWIW my gut says that the Least/Most suggestion has a lot of merit. But then I deliberately keep out of language design discussions because they require a lot of thought and time. More so than with implementation, with design you have to try to shoot down your own idea. Try to poke holes and find the cases where it breaks and use that then to judge your design.

What you are saying makes a lot of sense and it can be respected if somebody genuinely thinks time can be better spent. However, please keep in mind that if you don't speak up for your own vision, you basically pass this opportunity onto people who are willing to (over)represent themselves. And as much as I try to speak up, I feel a lot of times my ideas aren't "shot down" for well-argued reasons but simply because the people who are willing to speak up have strong prejudices, and actually never try to "shoot down" the existing problems...

> They could for example be that there is no multi candidate for infix:<+> that takes a Least/Most and thus Most + 1 won't even compile.

That could also make sense. Usually there isn't one single right choice, especially for Raku that kind of has precedents for everything. The important thing is to apply a decision consistently.

> The Least/Most suggestion would also require that we no longer just take the left side of the range to determine the type and instead be smarter. Of course the same road could be taken even with Inf and -Inf, so we'll have to test whether this smaller change to the language would already solve the problems we identified.

I think it's very important to define "the issue". Unfortunately it's quite common that there are multiple possible issues at once.

The issue I opened is about *..'foo' producing a -Inf..'foo' range while -Inf..'foo' directly produces a coercion error for 'foo'. I also have the presumption that -Inf..'foo' producing an error is fair, which leads me to two thoughts:

It could be a different issue that -Inf..'foo' and 'foo'..Inf don't even treat types the same way.