wpilibsuite / allwpilib

Official Repository of WPILibJ and WPILibC
https://wpilib.org/
Other
1.07k stars 614 forks source link

USB (joysticks) Not Correctly Recognized by Driverstation #3896

Closed Crossle86 closed 2 years ago

Crossle86 commented 2 years ago

Describe the bug 2022 Driver Station appears to recognize joysticks, green joystick indicator in center of DS display. Left side display of JS shows my 3 JS and launch pad. Selecting any of these items shows that they work, the DS displays stick deflection as expected and also displays any button presses. All good so far. However in the log window right side of DS are many repeating error messages from the DS code saying: Button n on port n is missing. Are your joysticks plugged in? (paraphrase). This just goes on an on and no JS data is sent to the robot. This happens on two different Windows 10 PCs. I have tried putting everything back to 2021 configuration and the joysticks and 2021 DS work correctly.

So far I am unable to operate the robot as no JS data or buttons are arriving at my robot code.

To Reproduce Steps to reproduce the behavior: See above

Expected behavior JS recognized with no errors in DS log window and JS data passed to the robot.

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

jasondaming commented 2 years ago

Are you getting these messages during teleop or all the time? Can you screenshot both the error message and the joystick working? Is your code publicly available where we can take a look at it? If you switch the port the joystick is expected on (and move the joystick to that port) does anything change?

Not saying this can't be a bug but sounds like a software issue.

Crossle86 commented 2 years ago

I hear what you are saying. The error message continues to appear in log at all times, even when disabled. Now later today I am going to recheck all this again, but:

If I revert the driver station back to 2021 the joystick issue goes away, and things appear normal but no joystick data  reaches the robot code so while the robot is enabled and sensor information is passed back (visible in outlineviewer and shuffleboard), with no joystick data we can't operate the robot. This may be because the RoboRio image is 2022. Now if I roll back the RoboRio image to 2021, everything works normally. Through all these tests, the robot code is the same and is our 2021 code updated for 2022, so is well tested.

I will get a screen shot and DS log file this afternoon. Our code is at: https://github.com/ORF-4450/Robot22B2

Are you getting these messages during teleop or all the time? Can you screenshot both the error message and the joystick working? Is your code publicly available where we can take a look at it? If you switch the port the joystick is expected on (and move the joystick to that port) does anything change?

Not saying this can't be a bug but sounds like a software issue.

--

Reply to this email directly or view it on GitHub:

https://github.com/wpilibsuite/allwpilib/issues/3896#issuecomment-1009552088

You are receiving this because you authored the thread.

Message ID:

Richard Corn

Software Devices LLC

P.O. Box 12299, Olympia WA. 98508

@.***   www.swdevs.com

Crossle86 commented 2 years ago

Belay that. I was reviewing things again and I do see an error in our team library build.gradle. Let me fix that and retest and before you spend time on this. 

Are you getting these messages during teleop or all the time? Can you screenshot both the error message and the joystick working? Is your code publicly available where we can take a look at it? If you switch the port the joystick is expected on (and move the joystick to that port) does anything change?

Not saying this can't be a bug but sounds like a software issue.

--

Reply to this email directly or view it on GitHub:

https://github.com/wpilibsuite/allwpilib/issues/3896#issuecomment-1009552088

You are receiving this because you authored the thread.

Message ID:

Richard Corn

Software Devices LLC

P.O. Box 12299, Olympia WA. 98508

@.***   www.swdevs.com

Crossle86 commented 2 years ago

I have fixed the issue with our library build.gradle but it made no difference to the problem. Let me correct my earlier description of the problem:

During beta I had problems with JS data not being received by my code. After kick off I installed all of the release versions of the control system software. When I first started testing with release software is when the USB problem appeared. With my earlier JS data issue, if I reverted everything back to 2021 things worked correctly. With this new issue, I have only tried reverting the DS to see how 2021 version recognized the JS (RoboRio image stayed 2022) and the problem with the JS is present with the 2021 DS. The DS appears to recognize and respond to the JS but the joystick not plugged error repeats constantly. I don't know how all this works under the covers, but it appears to me that the joystick not plugged in error is triggered from the Robot code when I tries to allocate a JS button. So it may be that the original problem of JS data not being passed to the robot may also be causing the JS not plugged in error.

I have attached screen shots of the log viewer and log files. The is also a picture of the DS with JS 0 trigger being held and deflection on the Y axis and the error message on the right JS Problem JS Problem2 2022_01_11 14_41_41 Tue.dslog.txt 2022_01_11 14_41_41 Tue.dsevents.txt .

jasondaming commented 2 years ago

If you use an example program do you experience the same issues? I am trying to determine if this is a problem with the way your are doing things specifically in your code or a computer / driver problem with the computer.

Crossle86 commented 2 years ago

I will organize a test. I will say this code is last years working code with minor adjustments for 2022 compatibility. I have reverted to 2021 (before Christmas) and it all works. I will revert again (when I have time) to re-verify the truth of that.

sciencewhiz commented 2 years ago

That driver station log looks horrible. CAN utilization all over the place and a ton of missed radio packets / high latency. The latter is likely the reason for joystick issues.

I like Jason's suggestion to use an example program. The CAN utilization would be resolved by that, and you can compare the driver station log. How are you connected to the robot? I would try a USB connection to remove any wireless/radio issues.

Crossle86 commented 2 years ago

Good idea on USB robot connection. Will try that. Example program works. So there is something in our code that worked up until the 2022 WPILib, just need to figure out what it might be.

When you say compare the DS logs, what would I be looking for?

I noticed the CAN/packet loss/latency as well. An ideas on what might cause that? I'm just a bit lost as as best I can tell we have changed nothing in our code from 2021 other than the minimum to get compatible with 2022.

Crossle86 commented 2 years ago

Here is an update on where things stand. Using the Example tank drive program created by VSC plugin, we see that the example program works as expected and the log looks normal (attached).

This result points at our code. Our code inherits a customized version of TimedRobot. The reasons for that we could get into later, but I removed that customization going back to the WPILib TimedRobot. This eliminated the problem with the joysticks not being recognized and throwing errors in the log. However, the problem with the JS data not being received by the robot code persists and the strange log data with high can%/packet loss/latency (and higher RR cpu usage) is observed (attached). I will say that the can% looks better but seems high, but I don't have the knowledge to judge at what level can usage becomes a problem. Same for RR cpu%, I don't know at what level that might interfere with communication.

I connected to the robot with USB cable and the problem persists so it does not seem to be the link. Any ideas on what to do next?

ExampleCode-WiFi OurCode-USB

OurCode-WiFI

.

Crossle86 commented 2 years ago

So first picture is example code. Second is our code over wifi and third is our code over usb.

Crossle86 commented 2 years ago

It is interesting that the can/loss/latency/cpu% are high even when robot is disabled. Suggests (at least for cpu%) something might be running during disabled period that should not be. I have reviewed the code looking for anything in a subsystem periodic function that might explain this but haven't found anything going on other than network table updates of some status values.

jasondaming commented 2 years ago

What all is on the CAN bus on this robot?

Crossle86 commented 2 years ago

PDB, PCM, TalonSRX(7), TalonFX(1), VictorSPX(6), SRXMagenticEncoder(2).

Crossle86 commented 2 years ago

A question: The code I am using is of beta-2. I modified the build.gradle to reflect 2022.1.1 release and compared my build.gradle to one created by making a new 2022 project to make sure they are the same. Now VSCode extension indicates this code is not compatible and wants me to import into a new project. I have not done that because our project is quite customized and it is a pain to copy everything over to a new project.

I have customized our build.gradle and have lots of comments for our students to understand what is going in that file. I do compare this customized file with the one created by the new project wizard in VSC to make sure the important items match exactly. I have assumed this is ok but it is possible I have overlooked something?

ThadHouse commented 2 years ago

The vscode message is toggled by the frcYear field in wpilib_properties.json. If you know your gradle files are good, you can change that to 2022 to stop that message.

Starlight220 commented 2 years ago

Regardless of whether CAN bus utilization is a cause of the problem here (I don't think that ~60% bus utilization would cause the issues you're seeing), I suggest reading up on CAN status frames and how to limit them to lower bus utilization. I found this page very informative and helpful. Setting followers instead of individually controlling each motor controller can also help decrease bus utilization. Running sensors and control loops on motor controllers can decrease RIO CPU load.

Crossle86 commented 2 years ago

build.gradle is ok, just wondering if there is anything else that is important.

Crossle86 commented 2 years ago

As to the comment about can bus, we do use follower on ganged motors and have reduced some status frames. This code worked perfectly fine 2020 and 2021. Problems only started with upgrade to WPILib 2022. Not trying to blame WPILib per se but it seems some incompatibility has occurred.

I also find it interesting that the JS data transmission is 100% not working instead if intermittently working which you might expect with load issues. Also, transmission of data from robot back to shuffleboard appears to work normally, though it is happening at a slower rate.

Crossle86 commented 2 years ago

This turning into a real nightmare. Previously when I ran the simple example code, the log viewer data looked normal as I understand it. Today I deployed the example code and while the code worked, joysticks controlling motors as expected, the log viewer data is shown in the first image. This is not at all what I expected for the example. It looks like the display our code produces...but the JS worked.

Next I deployed our code and got the second image. Again, with our code the JS are 100% non functional. However, networktables data sent from the robot appear as expected on the DS.

I am at my wits end on what to do about this. I have looked at everything I can think of and found nothing.

ExampleCode-WiFi-Thursday OurCode-WiFi-Thursday .

ThadHouse commented 2 years ago

Can you do something for me. Fire up 2 threads on start. Have 1 spin calls to DriverStation.waitForData, and the other spin HAL.waitForDSData? And when each of those returns, increment a counter and send that counter over NT. I want to make sure those events are still firing.

Also, try grabbing DS data with raw HAL calls. HAL.getJoystickAxes will let you grab raw data.

Crossle86 commented 2 years ago

Great idea. Unfortunately real life away from FRC will keep me busy until next week. Will do this as soon as I can.

Crossle86 commented 2 years ago

Sweet! Now I don't feel like I am hallucinating this whole thing. BTW, I also see spurious "stuck" inputs on one of the JS when I enable but mostly inputs are zero all the time. Another note, once in a while when I enable the JS seem to work for a few seconds (sluggishly) then nada. This has happened a couple times over many tests but it is interesting as those other reports seem to say they are operating and it quits, whereas my JS do not function at all. I haven't driven the robot since updating to 2022 beta program.

Crossle86 commented 2 years ago

Can you point me to an example or doc on how to make use of HAL.getJoystickAxes?

ThadHouse commented 2 years ago

In the getData function in driverstation.java

eohara0920 commented 2 years ago

I can see these functions but not sure what is the best way to check they are updated, could you help provide some sample code to do as you described above?

Would like to help wherever I can to solve this issue.

ThadHouse commented 2 years ago

Theyre wait calls, so just call them in an infinite loop inside of a thread (1 per thread), and then increment a counter, that you send over network tables each time they return.

void DSThreadFunc() {
   int count = 0;
   while(true) {
     DriverStation.waitForData();
     count++;
      SmartDashboard.PutNumber("DS Count", count);
   }
}

That would be the thread function for the DS call. Do the same thing for the HAL call, and then the joystick call you can just do from robotPeriodic.

float[] axes = new float[HAL.kMaxJoystickAxes];
int axesCount = HAL.getJoystickAxes(0, axes);
SmartDashboard.putNumber("Axis 0", axes[0]);
eohara0920 commented 2 years ago

Thank you so much! I will try this tomorrow morning and see what results I get.

eohara0920 commented 2 years ago

Hello,

I have not figured out where to put the DSThreadFunc() or the HAL one but i did place the joystick call in robotPeriodic() and the values do update when moving the joystick even when the robot stops responding but the following functions also in robotPeriodic() do not update values. This is using an XboxController.

SmartDashboard.putNumber("ControllerLeftY", m_controller.getLeftY()); SmartDashboard.putNumber("ControllerLeftX", m_controller.getLeftX());

image

These values were identical until the robot stopped responding and then only the Axis 0 value would update.

PeterJohnson commented 2 years ago

Add these two imports:

import edu.wpi.first.hal.HAL;
import edu.wpi.first.wpilibj.DriverStation;

Add two robot instance variables:

  private Thread m_dsThread;
  private Thread m_halThread;

Put the following code into your robotInit():

    m_dsThread = new Thread(() -> {
      int count = 0;
      while(true) {
        DriverStation.waitForData();
        count++;
         SmartDashboard.putNumber("DS Count", count);
      }
    });
    m_dsThread.setDaemon(true);
    m_dsThread.start();

    m_halThread = new Thread(() -> {
      int count = 0;
      while(true) {
        HAL.waitForDSData();
        count++;
         SmartDashboard.putNumber("HAL Count", count);
      }
    });
    m_halThread.setDaemon(true);
    m_halThread.start();
eohara0920 commented 2 years ago

Thanks,

After doing this the following situation happens when the robot will not respond. HAL Count continues to climb but DS Count stays stuck at the same value

image

After some time

image

sciencewhiz commented 2 years ago

When this occurs, can you capture a stack trace of all threads. See https://www.cleantutorials.com/jconsole/how-to-print-stack-trace-in-java-and-analyze-thread-states-with-example for an example of how to do this.

eohara0920 commented 2 years ago

Here you go

Thread Name:Thread-0   Thread State:RUNNABLE   app//edu.wpi.first.hal.NotifierJNI.waitForNotifierAlarm(Native Method)   app//edu.wpi.first.wpilibj.Watchdog.schedulerFunc(Watchdog.java:231)   app//edu.wpi.first.wpilibj.Watchdog$$Lambda$5/0xad917628.run(Unknown Source)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Thread Name:Signal Dispatcher   Thread State:RUNNABLE       Thread Name:Reference Handler   Thread State:RUNNABLE   java.base@11.0.12.5-frc/java.lang.ref.Reference.waitForReferencePendingList(Native Method)   java.base@11.0.12.5-frc/java.lang.ref.Reference.processPendingReferences(Unknown Source)   java.base@11.0.12.5-frc/java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)       Thread Name:Finalizer   Thread State:WAITING   java.base@11.0.12.5-frc/java.lang.Object.wait(Native Method)   java.base@11.0.12.5-frc/java.lang.ref.ReferenceQueue.remove(Unknown Source)  Warning  1  Loop time of 0.02s overrun  edu.wpi.first.wpilibj.IterativeRobotBase.printLoopOverrunMessage(IterativeRobotBase.java:359)   java.base@11.0.12.5-frc/java.lang.ref.ReferenceQueue.remove(Unknown Source)   java.base@11.0.12.5-frc/java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)       Thread Name:Common-Cleaner   Thread State:TIMED_WAITING   java.base@11.0.12.5-frc/java.lang.Object.wait(Native Method)   java.base@11.0.12.5-frc/java.lang.ref.ReferenceQueue.remove(Unknown Source)   java.base@11.0.12.5-frc/jdk.internal.ref.CleanerImpl.run(Unknown Source)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)   java.base@11.0.12.5-frc/jdk.internal.misc.InnocuousThread.run(Unknown Source)       Thread Name:navXIOThread   Thread State:TIMED_WAITING   java.base@11.0.12.5-frc/java.lang.Thread.sleep(Native Method)   app//edu.wpi.first.wpilibj.Timer.delay(Timer.java:48)   Warning at edu.wpi.first.wpilibj.IterativeRobotBase.printLoopOverrunMessage(IterativeRobotBase.java:359): Loop time of 0.02s overrun     app//com.kauailabs.navx.frc.RegisterIO.run(RegisterIO.java:82)   app//com.kauailabs.navx.frc.AHRS$IOThread.run(AHRS.java:1389)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Thread Name:NTEntryListener   Thread State:RUNNABLE   app//edu.wpi.first.networktables.NetworkTablesJNI.pollEntryListener(Native Method)   app//edu.wpi.first.networktables.NetworkTableInstance.lambda$startEntryListenerThread$0(NetworkTableInstance.java:218)   app//edu.wpi.first.networktables.NetworkTableInstance$$Lambda$36/0xaa8c7628.run(Unknown Source)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Thread Name:Thread-1   Thread State:WAITING   java.base@11.0.12.5-frc/jdk.internal.misc.Unsafe.park(Native Method)   java.base@11.0.12.5-frc/java.util.concurrent.locks.LockSupport.park(Unknown Source)   java.base@11.0.12.5-frc/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)   app//edu.wpi.first.wpilibj.DriverStation.waitForData(DriverStation.java:1102)   app//edu.wpi.first.wpilibj.DriverStation.waitForData(DriverStation.java:1062)   app//frc.robot.Robot.lambda$robotInit$0(Robot.java:55)   app//frc.robot.Robot$$Lambda$43/0xaa8d4428.run(Unknown Source)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Thread Name:FRCDriverStation   Thread State:BLOCKED   app//edu.wpi.first.wpilibj.DriverStation.updateControlWord(DriverStation.java:1348)   app//edu.wpi.first.wpilibj.DriverStation.getData(DriverStation.java:1232)   app//edu.wpi.first.wpilibj.DriverStation.run(DriverStation.java:1302)   app//edu.wpi.first.wpilibj.DriverStation$DriverStationTask.run(DriverStation.java:73)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Thread Name:main   Thread State:RUNNABLE   java.base/java.lang.Thread.getStackTrace(Unknown Source)   frc.robot.Robot.printStackTrace(Robot.java:168)   frc.robot.Robot.robotPeriodic(Robot.java:103)   edu.wpi.first.wpilibj.IterativeRobotBase.loopFunc(IterativeRobotBase.java:328)   edu.wpi.first.wpilibj.TimedRobot.startCompetition(TimedRobot.java:131)  Warning  1  teleopPeriodic(): 0.000114s SmartDashboard.updateValues(): 0.000154s robotPeriodic(): 0.067555s LiveWindow.updateValues(): 0.002625s Shuffleboard.update(): 0.000047s  edu.wpi.first.wpilibj.Tracer.lambda$printEpochs$0(Tracer.java:63)   edu.wpi.first.wpilibj.RobotBase.runRobot(RobotBase.java:373)   edu.wpi.first.wpilibj.RobotBase.startRobot(RobotBase.java:463)   frc.robot.Main.main(Main.java:25)       Thread Name:Thread-2   Thread State:RUNNABLE   app//edu.wpi.first.hal.HAL.waitForDSData(Native Method)   app//frc.robot.Robot.lambda$robotInit$1(Robot.java:66)   app//frc.robot.Robot$$Lambda$44/0xaa8d3828.run(Unknown Source)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)       Warning at edu.wpi.first.wpilibj.Tracer.lambda$printEpochs$0(Tracer.java:63): teleopPeriodic(): 0.000114s   SmartDashboard.updateValues(): 0.000154s   robotPeriodic(): 0.067555s   LiveWindow.updateValues(): 0.002625s   Shuffleboard.update(): 0.000047s    Warning  44000  DS Disable  Driver Station 

ThadHouse commented 2 years ago

Just to double check, that stack is from after when the joysticks stop responding, right?

eohara0920 commented 2 years ago

Yes, that is correct.

eohara0920 commented 2 years ago

This is when the HAL count number is doing fine and climbing but the DS count number is stuck

ThadHouse commented 2 years ago

So. That group of stacks is EXTREMELY interesting. Specifically this one

 Thread Name:FRCDriverStation   Thread State:BLOCKED   app//edu.wpi.first.wpilibj.DriverStation.updateControlWord(DriverStation.java:1348)   app//edu.wpi.first.wpilibj.DriverStation.getData(DriverStation.java:1232)   app//edu.wpi.first.wpilibj.DriverStation.run(DriverStation.java:1302)   app//edu.wpi.first.wpilibj.DriverStation$DriverStationTask.run(DriverStation.java:73)   java.base@11.0.12.5-frc/java.lang.Thread.run(Unknown Source)    That is waiting on a mutex, which no other thread is waiting on or even near, based on the other stacks. Digging around, it looks like this actually might be a JVM bug with recursive mutexes and synchronized blocks. The code was not recursive in prior years, but was updated in https://github.com/wpilibsuite/allwpilib/pull/3748 to be recursive. We can't be certain, as we don't have a local repro, but based on everything else we've seen it seems likely. I have a PR out (#3922) to change to a known good behavior. Once we get that in, we can provide instructions how to use development builds of WPILib (Its easy), and we can see if that does actually solve the issue.

eohara0920 commented 2 years ago

Thank you for the quick response on this, I'll be ready to test the development build once it's up. Thanks again!

ThadHouse commented 2 years ago

I removed the workaround originally posted here, as 2022.2.1 is officially released. Please use that to fix the issue.

Crossle86 commented 2 years ago

A quick update: installed the development build and joysticks work, robot appears to function normal on a quick test. However, the driver station log still shows a lot of packet loss, though lot less than before. OurCode-devbuild

ThadHouse commented 2 years ago

This bug was completely on the WPILib side, so packet loss wouldn't have been affected. At this point, I wonder if there is something else running on your driver station causing that much packet loss. But glad to hear joysticks are working again. That was very much not what I would have expected the bug to be, especially being a runtime bug.

eohara0920 commented 2 years ago

I will be checking the development build hopefully tomorrow morning, sounds promising. Thanks again!

eohara0920 commented 2 years ago

image

I believe the issue is fixed,been using the robot with a controller for a few restarts and code deploys on the new development build and am not experiencing the issue yet. Thanks again!

sciencewhiz commented 2 years ago

This fix was included in WPILib 2022.2.1, released last night

https://github.com/wpilibsuite/allwpilib/releases/tag/v2022.2.1

PeterJohnson commented 2 years ago

Closing following merge of 3922 and release of 2022.2.1. If you are still seeing this behavior after updating, please reopen the issue.