LaserCan crashes code if CAN requests time out

mjansen4857 commented 10 months ago

Attempting to configure a LaserCan while its not actually on the CAN bus will throw an exception and "crash" the code (it exits code with a 0 error code which is pretty confusing for unit tests since the test won't actually fail)

I'm assuming the same thing would happen if trying to get the current measurement, but we don't yet have our sensors in hand to test.

This is extremely problematic for a couple of reasons:

If we lose connection to the sensor during a match, robot code will crash
Sensor can not be used in simulation without crashing

2 can be worked around, but 1 is a major issue that would prevent us from using the sensor. Losing CAN connection to 1 sensor would make the robot completely non-functional for the entire match.

Timed out CAN requests should be handled gracefully, at the most printing a warning/error (preferably printing nothing in sim).

JaciBrunning commented 10 months ago

Assuming you're using Java - the timeout exceptions only occur during configuration (setRangingMode, setTimingBudget, setRoi). The getMeasurement() function is infallible and will just return null if there's no value to be read or the oldest measurement is too old, so unless your robot code has crashed anyway and it tries to reconfigure, there's no risk of crashing your code mid-match.

The main intention behind this is to make configuration set failures obvious and to avoid silent/near-silent configuration failures (a-la RevLib's burnFlash). Graceful handling of communications issues I think should happen in the user code since it's ultimately the user code that decides how the system will respond to the error. In an intake, it may be fine to discard the errors and go sensor-less, but for elevator height measurement it could be fatal and should be caught early.

I could annotate the configuration methods with a throws tag to force user code to either catch or re-throw the error. The other option is to return a flag similar to how we do on C++ to say whether it succeeded or not, but in Java that's a bit of an anti-pattern.

For simulation, I'm working on getting simulation harness support in the next libgrapplefrc release - which should fix reason 2.

As for the 0 exit code, that's curious. We throw the exception as would any other piece of Java code, so not sure why you're getting a 0 exit code. It may be an implicit catch by your unit testing framework.

mjansen4857 commented 10 months ago

OK. I think I'd argue that these configuration methods should return a status code enum instead of throwing an exception, simply to bring it in line with how config stuff like this works in Phoenix and RevLib. It's a lot easier for the students and for those teaching them if they can expect the different vendor libs to function similarly. Status codes are also a bit nicer/expandable for doing system check logic. For example, if we detect an error status code, does it indicate that the sensor is missing from the CAN bus, is it connected but malfunctioning, etc. vs one general exception type or needing to catch a bunch of different exceptions.

As for the 0 exit code, not sure how that's happening. It happens in normal sim as well so it's not just a unit test thing. This made me think the exception was happening in another thread that was calling System.exit or something which would prevent us from even catching the exception. But, we are able to catch it so... weird. We've had other exceptions thrown elsewhere that didn't do that. The mysteries of java.

JaciBrunning commented 10 months ago

I think I've found a kind of happy medium: Java: declare the function as throws ConfigurationFailedException, which the user can opt to catch or re-throw. If they catch it, the exception type includes a getErrorCode() function which allows you to distinguish between timeout, invalid parameters, etc.

C++: return a std::expected<> which contains either the ok value (in most cases, an empty struct), or an error (which similarly contains the error string and the error code). Since this isn't as verbose as a throw, C++ will also echo it to the console.

JaciBrunning commented 10 months ago

This should now be done in release 2024.2.1, just give it a minute to publish through CI. I'll keep working on simulation stuff in the background - the WPI CAN Sim is making it a little more difficult than I had planned.

GrappleRobotics / libgrapplefrc

LaserCan crashes code if CAN requests time out #11