rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98k stars 12.68k forks source link

Tracking issue for `illegal_floating_point_literal_pattern` compatibility lint #41620

Closed est31 closed 8 months ago

est31 commented 7 years ago

This is a tracking issue for the compatibility lint disallowing floating point literals in patterns.

The corresponding RFC is RFC 1445. Originally, #36890 was the only tracking issue for floats in patterns, but after it was found out that the implementation of that lint doesn't cover literals (issue #41255), another lint and tracking issue were required.

The goal of this lint is to make code like this a hard error:

fn main() {
    let x = 13.4;

    match x {
        1.0 => println!("one"),
        22.4 => println!("two"),
        3.67 => println!("three"),
        13.4 => println!("thirteen point four"),
        _ => println!("anything"),
    }
}
SimonSapin commented 7 years ago

RFC 1445 is about const constants. Why are float literals restricted as well?

SimonSapin commented 7 years ago

This is a breaking change (rejecting previously-accepted programs) but as far as I can tell this one is not part of an accepted RFC.

SimonSapin commented 7 years ago

Is this a duplicate of #36890?

est31 commented 7 years ago

RFC 1445 is about const constants. Why are float literals restricted as well?

Hmm a quick glance over the RFC seemed to confirm this. Let's discuss this in #36890, okay?

Is this a duplicate of #36890?

No, its separate. There are two separate lints. This one is about literals, the one tracked in that issue is about constants. I have opened the PR #41293 after finding out about #41255.

withoutboats commented 7 years ago

RFC 1445 is about const constants. Why are float literals restricted as well?

From my perspective, primitive literals (as opposed to destructuring struct literals) are a kind of constant. The same questions raised by consts in that RFC are raised by primitive literals.

SimonSapin commented 7 years ago

No, its separate. There are two separate lints. This one is about literals, the one tracked in that issue is about constants.

Yet the example given in #36890 uses literals.

From my perspective, primitive literals (as opposed to destructuring struct literals) are a kind of constant.

Given that we have a language keyword named const, I think it is reasonable to assume that constants are cont items and nothing else. If a different definition of the term is intended, it should absolutely be defined explicitly. This is not the case in this RFC.

est31 commented 7 years ago

No, its separate. There are two separate lints. This one is about literals, the one tracked in that issue is about constants.

Yet the example given in #36890 uses literals.

The example quoted in the tracking issue you linked never was implemented that way. I have asked about whether to create a separate lint or add the check to the existing one, and it was decided to create a separate lint.

I guess the example should be removed from that issue's description.

nikomatsakis commented 7 years ago

I would definitely say that the RFC was intended to apply equally to all constants, whether they are literals or not. Clearly there is room for other opinions (since @SimonSapin read it differently).

Personally, I am on the fence here. I somewhat agree with @SimonSapin that the number of regressions here feels too high. Part of the reason that I agree to go with a warning was to highlight this question so that we can have a wider discussion with more of the people who are affected.

For clarity, floating point literals at present match the same way as ==. Literals cannot be NaN. Therefore, I believe the only case where == can differ from "structural equality" with a floating point value has to do with 0 vs -0, which compare as equal but are clearly structurally unequal. (Does anyone have another example? I don't claim to be an expert on floating point edge cases.)

I take the following as a given (I'm curious if others disagree):

This seems to imply that if we don't make things a hard error, we can't have floating point NLC that use a purely structural match. They would always match with ==. I don't see any fundamental problem here: plausibly we could say that all user-defined types use structural equality and built-in types have their own customized behaviors. We could even tweak the behavior for NaN if we wanted, though I think that having matching behave almost like == but different (i.e., 0 == -0 but NaN == NaN) feels even more confusing to me and I'm probably opposed.

This may or may not affect plans around constant generics. It's not obvious to me that dynamic match evaluation and equality for constant generics have to be particularly related, but I can see the appeal of trying to have one notion of "structural equality" that applies equally. Maybe someone wants to make the case (@eddyb? @withoutboats?) that the two are entangled? (I'd like to hear it again.)

In other words, I see a few options:

I am not sure whether an RFC ought to be required, whichever path we chose, but since there seems to be disagreement, it seems plausible to open an RFC amendment to finalize the decision.

withoutboats commented 7 years ago

@nikomatsakis I don't think this needs to be entangled with const generics, we just need to transition to a solution for that as well as whatever we decide for match. Just renaming the attribute is probably fine.

I do feel that literals and consts should behave the same, but I don't have a strong opinion about how they ought to behave. Really whatever decision seems fine if it applies to both.

nikomatsakis commented 7 years ago

I don't think this needs to be entangled with const generics, we just need to transition to a solution for that as well as whatever we decide for match.

Let me unpack this a bit to be sure I understand. You're saying:

  1. Currently, the idea for const generics equality would have been to use the same #[structural_equality] attribute to denote types that are "structurally comparable for equality".
  2. But if we want we could have distinct attributes for the match system and for const generics.

Right?

(That said, it's not entirely clear to me why const generics would even need an attribute at all; I guess it depends a lot on how we wind up defining equality. My expectation was that equality would be based more on where in the source the expression arose (i.e., we would treat constant expressions as kind of "uninterpreted functions") -- and we'd always be assuming that said functions are deterministic (because of the limits we place on const fns), and hence we consider two constant expressions "equal" if they are the same function applied to equal inputs, where this bottoms out with simple integers and other builtins. But I guess eventually we might want to extend that notion of structural equality to other kinds of expressions and types, and maybe we want some opt-in around that, unsure.)

withoutboats commented 7 years ago

@nikomatsakis the issue here isn't with unevaluable const expressions but with floating point numbers and other types for which equality is not reflexive because of how that would impact unification. We don't need an attribute but it does seem that we do need const values to have a guaranteed reflexive definition of equality (which is why just using PartialEq seems problematic for consts).

darkwater commented 7 years ago

I might be missing something here, but I see this also applies to ranges, ie.

let color = match foo {
    0.0...0.1 => Color::Red,
    0.1...0.4 => Color::Yellow,
    0.4...0.8 => Color::Blue,
    _         => Color::Grey,
};

gives warning: floating-point literals cannot be used in patterns.

Is this intentional? I feel like this is a pretty common pattern.

fschutt commented 7 years ago

Yeah, I gotta ask, how am I supposed to do clean pattern matching with floating point ranges. I know floating points are hard, but isn't there a way to make this work in ranges? Example (real) code that gives a warning now:

/// Calculates the UTM zone this longitude falls in
/// Handles exceptions for Norway / Svalbard
/// For a visual representation: https://upload.wikimedia.org/wikipedia/commons/a/a5/UTM-Zone.svg
///
/// Inputs: Longitude, in degrees
///         Latitude, in degrees
///
/// Returns: UTM Zone
///
#[allow(non_snake_case)]
pub fn get_utm_zone(lon: f64, lat: f64) -> u8 {

    let mut zone = ((lon + 180.0) / 6.0).floor() as u8 + 1;

    match lat {
        56.0..64.0 => {
            /* Zone V, Norway */
            match lon {
                3.0..6.0 => {
                    zone += 1;
                }
                _ => {}
            }
        }

        72.0..84.0 => {
            /* Zone X, Svalbard */
            match lon {
                6.0..9.0 => {
                    zone -= 1;
                }
                9.0..12.0 => {
                    zone += 1;
                }
                18.0..21.0 => {
                    zone -= 1;
                }
                21.0..24.0 => {
                    zone += 1;
                }
                30.0..33.0 => {
                    zone -= 1;
                }
                33.0..36.0 => {
                    zone += 1;
                }
                _ => {}
            }
        }

        _ => {}
    }

    zone
}

How should I write this instead? Why is it that I can do comparisons of floats in if statements, but not pattern matching in range statements? I would have to write if value < 30.0 && value > 27.0 { }, which, in the end does the same thing, but less readable.

nikomatsakis commented 7 years ago

@darkwater @sharazam

I might be missing something here, but I see this also applies to ranges, ie.

Yes, there has been some back and forth on whether we ought to apply the same thing to floating points.

vks commented 7 years ago

It would be very unfortunate if floating point ranges were disallowed by this. I don't see any problems with this particular pattern.

kornelski commented 7 years ago

I was caught by this as well.

My case is:

let fudge_factor = match magic_value {
            0. ... 0.8 => 9., 
            0. ... 1.0 => 6.,
            0. ... 1.2 => 4.,
            0. ... 2.5 => 3.67, 
            _ => 3.0,
        };

To work around float comparison woes I'm using a hack of starting every range with 0. ... and depending on the specific match order instead.

I don't particularly like the current hacky/unclear/error-prone way of matching non-overlapping ranges of numbers, but I'd rather have something working until Rust gets a better replacement.

U007D commented 7 years ago

I waded through three or four threads to find the rationale for this (found it here: https://github.com/rust-lang/rust/issues/20489, in a comment dated Jul 18, 2015).

Niko Matsakis wrote: "right now, you can match on types that don't implement PartialEq at all, and if they do implement it, the match code semantics do not align with what PartialEq implements. This seems bad. Having match x { FOO => ... } and match x { y if x == FOO => } both be accepted but different in subtle ways seems clearly suboptimal to me."

Now I can understand (and agree with) the motivation; I thought I'd post this here in case others were looking for it too.

blakehawkins commented 6 years ago

Any update on this? Will float ranges ever get different treatment from matching over equality for float values?

Centril commented 5 years ago

@nikomatsakis How do you feel about crater running this to see if we can drive this to completion in the coming weeks or so?

Centril commented 5 years ago

cc https://github.com/rust-lang/rust/pull/56362

donbright commented 5 years ago

i feel like if Rust is going to go all-in on treating binary base-2 floats uniquely due to their special nature, then there would be a lot of help for us users if there could be a base-10 Decimal number type included in the base level of the language. . . including base-10-decimal literals, and then let the adventurous and wild amongst us convert willy nilly between base-2 floating points and our decimal literals with some function that we have to unwrap_or() or whatever. . .

elfeiin commented 5 years ago

What if we only care about positive float values?

bbqsrc commented 3 years ago

This seems to over-match.

let finite = FiniteF64::new(1f64).unwrap();

match finite {
    FiniteF64(1f64) => {
        std::println!("A")
    }
    FiniteF64(2f64) => {
        std::println!("B")
    }
    _ => {
        std::println!("C")
    }
}

FiniteF64 is a struct with one f64 field.

You get this warning:

warning: floating-point types cannot be used in patterns
   --> src/lib.rs:211:23
    |
211 |             FiniteF64(1f64) => {
    |                       ^^^^
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #41620 <https://github.com/rust-lang/rust/issues/41620>
nikomatsakis commented 3 years ago

@bbqsrc I don't think that's an 'overmatch', that seems like exactly the sort of case we are trying to lint against (in particular, to match finite we have to compare for equality against 1f64)

bbqsrc commented 3 years ago

@nikomatsakis yeah you're right. I misunderstood the RFC apparently. Makes perfect sense. 😄

workingjubilee commented 3 years ago

There are various well-spoken arguments in favor of match on floating point ranges or literals. But in addition to the difference between match and partial_cmp equality, the fact that a decimal float literal's actual concrete value depends on various nuances of the parser makes it extremely dicey to match directly on a floating point decimal literal. This rarely matters in simple comparisons, but e.g. floats would be extremely challenged in offering other niceties of a pattern match like exhaustiveness checking, and even if that was attempted, subtle variations in the dec2float algorithm could disrupt the code's compilation from compiler to compiler (including from rustc version to version).

So overall, even if one were permitted to match on decimal floating point literals in any way, with any equality heuristic, I think the only sound option would be to have to use literals that are forced to have an exact dec2float translation. For that reason I have filed PR #84045, to upgrade this warning to a denial.

razaqq commented 2 years ago

what about matching on special cases like f32::INFINITY, f32::NEG_INFINITY, f32::NAN these seem pretty reasonable to me

workingjubilee commented 2 years ago

NaN does not compare equal with NaN, and NaN bit-patterns are not guaranteed to be bit-equal either, as NaN can be:

...and there are multiple valid bitstrings for each combination.

razaqq commented 2 years ago

Okay but there are functions for checking for NaN and inf, why should the following be disallowed?

    match x / y
    {
        v if !v.is_finite() => None,
        v => Some(v)
    }

EDIT: actually nvm, the code above is not being flagged as illegal_floating_point_literal_pattern

workingjubilee commented 2 years ago

Yes, a pattern is subtly different. What you're talking about is a guard.

And I mean, I actually have come to think that if we allowed hexadecimal floating point literals that matching on them might be reasonable because they exactly specify floating point values. I just don't think it's appropriate for decimal floating point literals that may be rounded.

nicoburns commented 2 years ago

I would like to add my voice to this being reconsidered for literal values.

I have just had to replace the following code (yes, I do mean to match specifically on positive infinity):

track.growth_limit = match track.growth_limit {
    f32::INFINITY => track.base_size + item_incurred_increase,
    _ => track.growth_limit + item_incurred_increase,
}

with

track.growth_limit = if track.growth_limit == f32::INFINITY {
    track.base_size + item_incurred_increase
} else {
    track.growth_limit + item_incurred_increase
}

Due to this lint. And IMO the latter is significantly less readable than the former

NaN does not compare equal with NaN, and NaN bit-patterns are not guaranteed to be bit-equal either, as NaN can be:

  • signaling xor quiet
  • positive xor negative

...and there are multiple valid bitstrings for each combination.

Is there any reason why we couldn't have an ANY_NAN pattern that simply specifies all possible combinations?

I just don't think it's appropriate for decimal floating point literals that may be rounded.

Could we not specifically prohibit rounded literals (and have the error message specify the correct pre-rounded value to use if they wish to use it)?

edmondop commented 2 years ago

What's the right way to deal with those pattern matching in testing a function that returns an enum?

#[test]
fn second_state_is_reached()-> Result<(),anyhow::Error>{
    let parser = Parser::new(2);        
    let mut reader = valid_example_reader();
    let event = reader.read_event()?;
    let res = parser.process(event)?;
    assert!(matches!(res, StepResult::MovedForward(Parser{
        state: ParserState { 
            level: 2,
            time: 5.292 
        }
    })));
    Ok(())
}
rljacobson commented 1 year ago

For those who are wanting floats to retain nonstandard (for lack of a better phrase) equality behavior in matches, it seems to me what you really need is a type like OrderedFloat, which is a float type implementing Ord and Eq. Is there a reason one couldn't use such a type when the existing behavior is desired after primitive float matches are disallowed?

captainrdubb commented 1 year ago

I was caught by this as well.

My case is:

let fudge_factor = match magic_value {
            0. ... 0.8 => 9., 
            0. ... 1.0 => 6.,
            0. ... 1.2 => 4.,
            0. ... 2.5 => 3.67, 
            _ => 3.0,
        };

To work around float comparison woes I'm using a hack of starting every range with 0. ... and depending on the specific match order instead.

I don't particularly like the current hacky/unclear/error-prone way of matching non-overlapping ranges of numbers, but I'd rather have something working until Rust gets a better replacement.

to me this doesn't seem hacky. i think it's amazing that you can match on ranges like this, pretty convenient. on the other hand, the error you're trying to avoid is pretty sneaky. One would have to sufficiently test this code to avoid the error, and when the tests fail they would probably be confused as to why. if Rust is supposed to protect users from sneaky errors that necessitate unit testing and stack overflow searches, then the issue should either be documented or disallowed. honestly, i prefer the buyer beware approach. the compiler gives a nice warning. the warning can describe the issue or point to documentation about it. i think that we're still maintaining the quality of the language by making the user aware of the risks. No need to handcuff them.

donbright commented 1 year ago

(edited)... if disallowing it in match why not disallow it in == ? for that matter why does it allow the implicit conversion of base-10 decimal ascii into float at all? why not make let x = 0.3 ; an error

 let x = 0.3;
 println!("{:.40}",x);
 println!("{:?}",x==0.2999999999999999888977697537484345957637);
 println!("{:?}",x==0.3);
 println!("{:?}",x==0.2999999999999999888977697537484345957636);
 println!("{:?}",x==0.2999999999999999888977697537486);
 println!("{:?}",x==0.29999999999999998);
 println!("{:?}",x==0.2999999);

this returns

 0.2999999999999999888977697537484345957637, true, true, true, true, true, false

make it make sense. it doesn't make sense. we just live with the ambiguity and its ok for most programs. but it's problematic in match. that doesn't mean we have to totally ban floating point from match, anymore than we have to totally ban base-10 numeral literals from float literal definitions in programming languages. i think the issue is not floats but the representation of binary floats as base 10 numerals.

therefore i think a compromise could be made where some float literals should be allowed in match, like integers up to the limit of the largest int contiguously representable in the given float type, which can be exactly represented by a float. (in other words, there is some point in the float number line where integers "skip" and you can represent n but not n+1, but again you can represent n+2. so cut it off at n). for f32 this is around 16,777,217 and f64 bit its about 9,007,199,254,740,993 see so which probably covers most use cases.

matching on x = 0 for example is something you might like to use a lot when deciding whether to divide by x. and in trig calculations matching to x=1 is common and 1 is exactly representable in floating point binary.

you could also allow float literals that are not integers, if they are exactly representable in binary, which means you could allow fractions involving powers of 2. not sure of the notation.

edit maybe something like this

 match x {
    0.0..=0.5=>  // match when 0<x<=0.5  which are both exactly representable in floating point binary
    0.5+..=4.0=>  // match when 0.5<x<=4.0 which are both exactly representable in floating point binary
    4.0+..=4.5=>  // match when 4.0<x<=4.5 which are both exactly representable in floating point binary
    4.5+..=4.6 => /// ERROR, 4.6 is not exactly representable in floating point binary
    4.5..=4.75 => /// ERROR, 4.5 already included in match arm above (overlap)
    4.5+..<4.75 => // match when 4.5<x<4.75    which are both exactly representable in floating point binary
    4.75..=5 =>      // match when 4.75<= x <=5 , which are both exactly representable in floating point binary 
    _=> everything when x<0 and when x>5 including if x is NAN or infinity     
}

the compiler could check if the literals are exactly representable in floating point binary at compile time. it could also check for the "accidental holes" in the range the user is trying to describe as well, just like it currently does with ints.

frewsxcv commented 1 year ago

I probably missed it above, but does anyone have a simple concrete example of how a match on floating point numbers would be more misleading than a conditional on the same floating point numbers?

samuelpilz commented 1 year ago

I just had this example:

I have fn calc() -> Option<f64>

I wanted to write (maybe controversial style, but I like it)

// if is calc() is out of range
if !matches!(calc(), None | Some(0.0..=1.0) { ... }

I had to write (I dont demorgan when checking for intervals)

if let Some(c) = calc() {
  if !(0 <= c && c <= 1) {...}
}

Sure, there can be some issues about exhaustiveness or gaps or reflexivity, but these well-behaved ranges are quite nice.

The inverse is even more convoluted. Am I missing something??

// calc() is missing or in range
if calc().map(|c| 0 <= c && c <= 1).unwrap_or(true) {...}

// or:
let is_ok = if let Some(c) = calc() { 0 <= c && c <= 1 } else { true };
if is_ok {...}

// float patterns: 
if matches!(calc(), None | Some(0.0..=1.0)) { ... }
RalfJung commented 1 year ago

FWIW https://github.com/rust-lang/rust/pull/84045 (which wanted to turn this into a hard error) got rejected by the lang team. So it seems like the "future compatibility" status of this lint is not accurate -- this will not be made a hard error.

We might want a lint that specifically warns about NaNs and +/- 0 in patterns because they don't behave like a bit-wise equality test, but that's quite different from what this lint currently does.

nikomatsakis commented 1 year ago

Ah, good call out @RalfJung, we should probably change the status to jjust a lint?

RalfJung commented 1 year ago

Yes, and adjust the wording and name.

But also, why would we lint against matches!(x, 1.001) but not against x == 1.001? Their semantics is exactly the same...

RalfJung commented 1 year ago

IMO we should just entirely remove the lint, without replacement.

cyqsimon commented 1 year ago

So how do we move this forward? Being stuck in a state of limbo is worse than either outcome.

RalfJung commented 1 year ago

There's a design meeting this week where the lang team will look at this issue and the surrounding issues related to pattern matching.

rdrpenguin04 commented 11 months ago

What was the outcome?

RalfJung commented 11 months ago

Unfortunately we didn't even get to floats, we were mainly discussing consts in patterns in general. Here is something of a summary.

NicoElbers commented 9 months ago

Has there been any discussion since?

RalfJung commented 9 months ago

Yes: https://github.com/rust-lang/rfcs/pull/3535