PRQL / prql

PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
https://prql-lang.org
Apache License 2.0
9.92k stars 217 forks source link

Should `switch` have a control value? #1795

Closed max-sixty closed 1 year ago

max-sixty commented 1 year ago

From https://github.com/PRQL/prql/issues/1278#issuecomment-1426890693

I actually think you're right about the switch statement. I'm not sure why I missed that before. I think we were going between switch and match and case, and deciding whether there was a control value, and we ended up here. Note that SQL uses case like we use switch, though other languages use case as part of a traditional switch statement.

We did spend a lot of time discussing this, and we've spent a lot of time discussing the syntax (!), so maybe folks' patience for more discussions is waning. But I would propose we either:

Note that switch is still experimental, so it's OK to make changes here. I'm less of a fan of the if / else here fwiw, though there are reasonable alternative constructions of if & else)

eitsupi commented 1 year ago

FYI, dplyr has recently (1.1.0) introduced .default option that sets the default value for case_when().

Previous style

x <- 1:70
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)

Now

x <- 1:70
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  .default = as.character(x)
)

Changelog says:

Has a new .default argument that is intended to replace usage of TRUE ~ default_value as a more explicit and readable way to specify a default value. In the future, we will deprecate the unsafe recycling of the LHS inputs that allows TRUE ~ to work, so we encourage you to switch to using .default.

I too was confused at first to catch conditions by true that did not fit any of other conditions. So I prefer _ over true. (However, I do not have a strong opinion. These are the opinions of a (former) programming novice.)

aljazerzen commented 1 year ago

To be fair, SQL supports both CASE with and without control value:

SELECT
  CASE x
    WHEN 'a' THEN 1
    WHEN 'b' THEN 2
    ELSE 3
  END,
  CASE
    WHEN x = 'a' THEN 1
    WHEN x = 'b' THEN 2
    ELSE 3
  END

I actually like current behavior because:

And I do think it is not that hard to understand from the documentation.

[^1]: yeah, this is very subjective

max-sixty commented 1 year ago

To be fair, SQL supports both CASE with and without control value:

But does switch exist anywhere without a control value?

My narrow-but-strong-ish-concern is that we're using something in an important-but-subtley different way to how it's used everywhere else.

I'd be fine if we called this case tbc!

richb-hanover commented 1 year ago

I am not certain of what you mean by a control variable, but I would like to express the behavior of the switch statement that I used as an example in https://github.com/PRQL/prql/issues/1922 and as implemented in the 0.5.2 Playground.

My specific hope is that the result of the switch can be based on an arbitrary expression for each "case". A simplified version of that example would be:

from foo
select [
  x = switch [
  a == b -> 1,
  a != b and a < 500000 -> 2,
  a != b and a >= 500000 -> 3,
  ],
]

NB: I am indifferent to the choice of -> or => Either works fine for me. Thanks.

max-sixty commented 1 year ago

I am not certain of what you mean by a control variable

It's the n in this C code (from https://iq.opengenus.org/switch-case-control-statement/):

switch (n)
{
    case 1: // code to be executed if n = 1;
        break;
    case 2: // code to be executed if n = 2;
        break;
    default: // code to be executed if n doesn't match any cases
}

A simplified version of that example would be:

To confirm, this works now, right?

richb-hanover commented 1 year ago

Yes, both the example from #1922 and the simplified example above work as desired.

Is there any change to switch syntax planned (besides the change to =>)? Thanks.

max-sixty commented 1 year ago

Is there any change to switch syntax planned (besides the change to =>)? Thanks.

The only discussion I'm aware of is this discussion around whether it should have a control value / should be named switch

richb-hanover commented 1 year ago

It sounds to me as if this issue is still unresolved. If so, I hope following is helpful. (And if not, I apologize for beating a dead horse).

Thanks for listening.

snth commented 1 year ago

I don't feel too strongly about the name but if switch is confusing compared to the use in other languages then I am happy to change this to case. Another advantage of case is that it is more similar to SQL which I am also in favour of.


I am strongly against switch/case taking a control value. I think this is demonstrated @richb-hanover 's example above as well as an example I gave in the original discussion on this (couldn't find it right now but might edit the comment to include the link when I do). SQL's case doesn't have a control value (at least some of the time) so I think we would be giving up generality for no good reason.

If we do want a control value variant (for brevity presumably) then we should look at introducing a match construct which I think was also discussed in the original discussion.

aljazerzen commented 1 year ago

On the call we decided not the change any behavior here, but to change the keyword to case to be consistent with SQL.

richb-hanover commented 1 year ago

If I understand correctly:

True? (I support this.) Thanks.

from foo
select [
  x = case [
  a == b => 1,
  a != b and a < 500000 => 2,
  a != b and a >= 500000 => 3,
  ],
]