wpilibsuite / allwpilib

Official Repository of WPILibJ and WPILibC
https://wpilib.org/
Other
1.03k stars 607 forks source link

Rotation2d(0,0) should be some sort of NaN #6766

Open truher opened 1 week ago

truher commented 1 week ago

Describe the bug The specific thing we noticed was that if a single swerve module stops momentarily (e.g. imagine the drivetrain translating and rotating, so that each module takes a cycloidal path), then the steering control momentarily freaks out. The root cause was the swerve inverse kinematics use of Rotation2d to represent the steering angle, based on velocity. When both x and y module velocities are zero, the module angle is actually undefined, but Rotation2d chooses zero as the result (potentially very far from the current angle), and the consumers of this angle treat it as real. To work around this issue, we have to keep some other indicator of the validity of the returned rotation (e.g. the module speed or something else), and it's easy to forget.

Expected behavior It would be cool if the Rotation2d object had some intrinsic way to represent (0, 0) that couldn't be conflated with a real value. I realize that atan2(0,0) == 0 in Java, but I think that's also a tragedy, it should be NaN. :-)

calcmogul commented 1 week ago

Returning NaN instead would infect every other math calculation in the user's code base that uses that angle, and could potentially turn a not-that-serious edge case into more systemic breakage. That is, instead of the code doing something sane with zero, everything will become NaN, because any math done to NaN returns NaN.

Logging a fault in the constructor sounds like a good idea though.

truher commented 1 week ago

yeah, i agree that some folk might interpret this as "creating breakage" though i'd argue that it's already silently broken. :-) a compromise might be to add Rotation2d.isValid(), to avoid duplicating the validity checking elsewhere. users who care can check it, users who don't care can ignore it.

calcmogul commented 1 week ago

some folk might interpret this as "creating breakage"

In particular, it's creating more breakage where there wasn't before, because NaN propagates outward.

a compromise might be to add Rotation2d.isValid(), to avoid duplicating the validity checking elsewhere. users who care can check it, users who don't care can ignore it.

Dead objects are a code smell. It's better to restore the object to a sane state (i.e., recover from the fault) when possible and log the fault occurrence. In cases where the user gave bad constructor inputs and we can't recover, we typically just throw because constructors are usually invoked at robot code startup.

truher commented 1 week ago

Yeah, I get the software engineering issues. I think the crux of the issue here is whether silently coercing NaN to zero is "sane." IMHO this is an example of mixing metadata and data, and it would be good (and worth a little bit of "feel bad" time) for FRC students to learn that this sort of failure mode exists, and how to avoid it. (one of the steps, though admittedly not the most important step, in the chain of the Ariane 5 maiden failures was this sort of metadata/data mixing: http://sunnyday.mit.edu/nasa-class/Ariane5-report.html ) But I also understand if you'd rather allow incorrectness in the name of minimizing student feel-bad time -- since nobody has mentioned this issue before now, it's clearly not a big deal either way.

Anyway, your PR looks like you want Rotation2d to do the "reporting" part but not the "recovering" part, i.e. you want us to wrap Rotation2d constructors in our own zero-checkers? Is that the intended pattern?

calcmogul commented 1 week ago

But I also understand if you'd rather allow incorrectness in the name of minimizing student feel-bad time

Returning NaN isn't correct either. When x and y are both zero, a singularity occurs, so arbitrarily small input noise makes any angle equally likely. Like I said before, NaN makes the problem worse, not better due to its infectious nature on calculations. It's guaranteed to cause a cascading failure as opposed to only maybe causing one if zero is a bad default.

If this were Rust, I'd have Rotation2d::new() (Rust's version of a constructor) return an error result to force the user to handle the fault explicitly. Unfortunately, Java's only built-in method for mandatory fault management is checked exceptions, which have poor ergonomics, and Java's Optional type is kinda expensive to use. Maybe we could make a custom Result type?

Anyway, your PR looks like you want Rotation2d to do the "reporting" part but not the "recovering" part, i.e. you want us to wrap Rotation2d constructors in our own zero-checkers? Is that the intended pattern?

It does do fault recovery, just potentially not the recovery method the user wants in all cases: it replaces a NaN with a zero so the error doesn't propagate outward. The goal of that PR is to make the error more visible with a stacktrace so users know to fix it the next time they look at the console/logs. We want to make faults more visible with https://github.com/wpilibsuite/shuffleboard/pull/805 as well.

If you want to avoid the fault altogether, then yes, you'd need to wrap all your Rotation2d constructions with input validations.

truher commented 1 week ago

Is there like a recommended result type for this kind of thing? Our default seems to be Optional, which doesn't seem exactly right.

calcmogul commented 1 week ago

Yea, we need an actual result type instead of Optional for this, so we can communicate what the exact error is. Java doesn't have one, so we'd have to make one.

KangarooKoala commented 1 week ago

How would we use a different result type for the constructor? Are we talking about a factory method instead?

Also, are there concerns about the performance impact (from more allocations because of the wrapper object) outweighing the benefit from making users handle this edge case? From what I've heard the only known impacts from ignoring this case is the poor steering control in certain cases, which seems like something that might be better addressed in some other way.

calcmogul commented 1 week ago

How would we use a different result type for the constructor? Are we talking about a factory method instead?

Yes, we'd have to use factory methods everywhere and make the constructor private, since constructors don't have return types (a design flaw in classical OOP, imo).

are there concerns about the performance impact (from more allocations because of the wrapper object) outweighing the benefit from making users handle this edge case?

It would add another allocation every time a user makes a new object directly, but operations like plus() could skip the Result type internally and in the return value if there's no error conditions.

From what I've heard the only known impacts from ignoring this case is the poor steering control in certain cases, which seems like something that might be better addressed in some other way.

WPILib's SwerveDriveKinematics.toSwerveModuleStates() function returns the last known angle if the speeds go to zero. https://github.com/wpilibsuite/allwpilib/blob/main/wpimath/src/main/java/edu/wpi/first/math/kinematics/SwerveDriveKinematics.java#L122-L130

KangarooKoala commented 1 week ago

but operations like plus() could skip the Result type internally and in the return value if there's no error conditions.

I guess I don't have a concrete view of the proposed class structure- Is the thought A) add Result<ValueType, ErrorType> which is returned by the factory method and add Rotation2d.plus(Result<Rotation2d, Rotation2dError>) overloads or B) add a Rotation2d-specific result type to allow Rotation2dResult.plus(Rotation2dResult) (without having to unwrap the result on the left-hand side)?

WPILib's SwerveDriveKinematics.toSwerveModuleStates() function returns the last known angle if the speeds go to zero. https://github.com/wpilibsuite/allwpilib/blob/main/wpimath/src/main/java/edu/wpi/first/math/kinematics/SwerveDriveKinematics.java#L122-L130

Good to know! So then, what is the usecase that runs into poor behavior with new Rotation2d(0, 0)?

calcmogul commented 1 week ago

I guess I don't have a concrete view of the proposed class structure- Is the thought A) add Result<ValueType, ErrorType> which is returned by the factory method and add Rotation2d.plus(Result<Rotation2d, Rotation2dError>) overloads or B) add a Rotation2d-specific result type to allow Rotation2dResult.plus(Rotation2dResult) (without having to unwrap the result on the left-hand side)?

Neither of those is how Rust works. In Rust, you make any function that could produce an error return a Result<ValueType, ErrorType> or Optional<ValueType>. Exceptions don't exist. Users check the error condition then unwrap the result or optional so the value type can be used by other functions. If a function can't produce an error, it just returns ValueType like normal.

What this means for Java is we'd replace public constructors with public factory functions (e.g., Rotation2d::new()) that return either ValueType or Result<ValueType, ErrorType>/Optional<ValueType> depending on whether errors can occur. Function parameters that take Rotation2d do not have their type changed, because we want to force users to handle the error, or explicitly opt out by unwrapping the type without checking, and causing an exception to be thrown if there was an error. Rust calls panic!() in that case, but idk if Java has an equivalent for our use case.

KangarooKoala commented 1 week ago

Ah, thanks! It makes a lot more sense now. I think the part I was missing was that skipping the result type meant using the value type.

causing an exception to be thrown if there was an error (Rust calls panic!() in that case, but idk if Java has an equivalent for our use case)

Off the top of my head, how suitable would NoSuchElementException be? That's what [Optional.get()](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Optional.html#get()) throws if there isn't a stored value. (Or maybe it would be worth it to define our own exception class?)

calcmogul commented 1 week ago

That exception would work.

truher commented 1 week ago

the case I'm talking about is where the robot is moving (so the case in swervedrivekinematics is not triggered) but an individual module is not moving. imagine the drivetrain rolling along the side wall. in this case, the module twitches when it stops. the twitching is not that big a deal, except that we have logic to prevent the robot from driving the wheels the "wrong" way, and the steering profile also respects the steering velocity (which is very high in a bad direction), so the whole robot waits for the errant wheel to realign.

truher commented 1 week ago

@calcmogul any opinion about https://github.com/mrkloan/result-type ?

KangarooKoala commented 1 week ago

I'm not sure if it's worth it to bring this in from a library vs implement it ourselves given how small it is and that we might want to add, remove or rename the available methods. (It also seems that we use a different style checker) Just my two cents, though.

Also, at the moment nothing's jumping out to me as a reason to have separate Ok and Error classes (since we'd likely have all constructors be private and have users use factory methods), but maybe I'm missing something.

truher commented 1 week ago

oh, yeah, i wouldn't want to literally use it, but maybe borrow some ideas from it, if they're good ideas.

KangarooKoala commented 1 week ago

Based on my (possibly incorrect) understanding of what we're thinking of, it matches the same basic concept we're looking for. We probably want to have the error type be a type parameter, though, and assuming we lean into the Rust inspiration, the error type would probably be mostly with the intent of communicating to the program what happened rather being something to be thrown. (See https://doc.rust-lang.org/nightly/book/ch09-02-recoverable-errors-with-result.html)

The methods provided in the linked repository do seem to have reasonable similarity with C++'s std::expected and our shim (also see the doxygen page), with map being the equivalent to and_then, flatMap to transform, switchIfError to or_else, and mapError to transform_error. (I'm not sold on the names, but since we're just talking about concepts, that's fine.)

Alextopher commented 2 days ago

In my opinion I would keep the existing behavior (but add the warning as per #6767). I would add new factory method, something like newChecked and I would avoid using a custom error type and just stick to either throwing an IllegalArgumentException (that's what this is afterall) or return Optional<Rotation2d> for better ergonomics.

  public static Rotation2d newChecked(double x, double y) throws IllegalArgumentException {
    double magnitude = Math.hypot(x, y);
    if (magnitude < 1e-6) {
      throw new IllegalArgumentException("Vector magnitude too small to calculate direction.");    
    }
    return new Rotation2d(x, y);
  }

If running this in a hot loop I think the JIT would inline the constructor and optimize away the duplicate hypot call.