Mechanical-Advantage / AdvantageKit

Logging & replay framework for FRC
GNU General Public License v3.0

AdvantageKit abnormal CAN spiking #71

Open Ne-k opened 7 months ago

Ne-k commented 7 months ago

Recently we've noticed that CAN usage on our robot spikes up to 100% with no explanation. We've run some tests isolating just AdvantageKit in a new project, and found that with only AdvantageKit installed, whether or not we enable logging inside our Robot.java file, the CAN bus still shows that abnormal spike.

More information about what I'm describing can be found here, in the Chief Delphi post I made yesterday.

jwbonner commented 7 months ago

Looking at the data from your post here, I'm not clear on what the issue is. The log from your base project shows a median CAN usage of 42%, and the log from your AdvantageKit-only project shows a median CAN usage of 38%. There are occasional jumps, but this is normal for CAN utilization measurements.

There aren't any conditions where AdvantageKit will write to the CAN bus such that it would increase utilization. The only interaction is the PDP/PDH logging, which reads frames that are already being transmitted. It seems likely that YAGSL is configuring devices in a way that changes CAN utilization (which would be expected while setting up devices), but I don't see a connection to AdvantageKit here.

Ne-k commented 7 months ago

The thing is, we're noticing CAN utilization jumps only when AKit is running on the robot. Blank projects or projects with only YAGSL don't exhibit the same CAN spikes, even when on and running code (i.e. driving). That would seem to suggest AKit is either changing the RIO's CAN utilization measurement system somehow or affecting CAN utilization itself. I'm not familiar with this area, so does AKit modify how the RIO logs CAN utilization in any way? Are the jumps to 100% and 0% CAN utilization really inconsequential? I'm hesitant to just brush them off, as our team has had a lot of issues related to CAN utilization, especially last season.

jwbonner commented 7 months ago

Regardless of the cause, the spikes in utilization are definitely issues with the measurement only and won't affect the operation of the CAN bus (there used to be a similar issue on the RIO when the sampling was incorrect).

Given that AdvantageKit is reading the CAN utilization periodically for logging, one possibility is that there is a bug in netcomm where polling for the CAN status affects the samples (e.g. it's sampling between calls to the read method, or something similar). You could test that by taking your blank robot project and adding RobotController.getCANStatus() in robotPeriodic and seeing if there's a similar effect.
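A minimal sketch of that test, in plain Java so it runs without a robot. `CanStatusProbe` and its methods are hypothetical names for illustration, not part of WPILib or AdvantageKit. On a real robot the supplier would be `() -> RobotController.getCANStatus().percentBusUtilization`, and `periodic()` would be called from `robotPeriodic()`.

```java
import java.util.function.DoubleSupplier;

/**
 * Hypothetical helper: reads a CAN utilization source once per call to
 * periodic() and tracks how far the readings spread apart.
 */
final class CanStatusProbe {
    private final DoubleSupplier utilizationSource;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private int samples = 0;

    /** The source is an assumption; on a roboRIO it would wrap getCANStatus(). */
    CanStatusProbe(DoubleSupplier utilizationSource) {
        this.utilizationSource = utilizationSource;
    }

    /** Call once per robot loop; each call performs exactly one read. */
    void periodic() {
        double value = utilizationSource.getAsDouble();
        min = Math.min(min, value);
        max = Math.max(max, value);
        samples++;
    }

    /** Number of reads performed so far. */
    int getSamples() {
        return samples;
    }

    /** Largest reading minus smallest reading; 0 if nothing was read yet. */
    double getSpread() {
        return samples == 0 ? 0.0 : max - min;
    }
}
```

If the hypothesis is right, the utilization shown on the Driver Station should become noisier once the probe is polling, even though nothing else in the blank project has changed.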

jwbonner commented 7 months ago

I just tested a blank WPILib project in three configurations: not calling getCANStatus() at all, calling it every 20ms, and calling it every 2ms. The CAN utilization from the DS was stable when not calling it, noisy like your AdvantageKit data when calling it every 20ms, and extremely noisy when calling it every 2ms. It's still nothing to worry about in terms of CAN errors (it only affects the measurement), but this looks like an issue with NI's software and not AdvantageKit.

jwbonner commented 7 months ago

I reduced the rate of polling CAN utilization, which should reduce the noise in the measurements.
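One way to reduce a polling rate like this is to cache the last reading and refresh it only every N loop cycles. The sketch below illustrates that general idea and is not the actual AdvantageKit change. `ThrottledReader` and `cyclesPerRead` are hypothetical names, and the supplier stands in for a call such as `RobotController.getCANStatus().percentBusUtilization`.

```java
import java.util.function.DoubleSupplier;

/**
 * Hypothetical helper: returns a cached value and only re-reads the
 * underlying source once every cyclesPerRead calls to get().
 */
final class ThrottledReader {
    private final DoubleSupplier source;
    private final int cyclesPerRead;
    private int cyclesSinceRead;
    private double cached = Double.NaN;

    ThrottledReader(DoubleSupplier source, int cyclesPerRead) {
        if (cyclesPerRead < 1) {
            throw new IllegalArgumentException("cyclesPerRead must be >= 1");
        }
        this.source = source;
        this.cyclesPerRead = cyclesPerRead;
        // Start "expired" so the very first get() performs a real read.
        this.cyclesSinceRead = cyclesPerRead;
    }

    /** Call once per robot loop; reads the source only when the cache has expired. */
    double get() {
        if (cyclesSinceRead >= cyclesPerRead) {
            cached = source.getAsDouble();
            cyclesSinceRead = 0;
        }
        cyclesSinceRead++;
        return cached;
    }
}
```

With a 20ms loop and `cyclesPerRead` set to 25, for example, the source would be read about twice per second while callers still get a value every cycle.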

Ne-k commented 7 months ago

Sorry for the late response; I haven't been in recently to work on the robot. I'll be sure to try out the changes when I get time today. Thank you for your help!