SwerveRobotics / ftc_app

FTC robot controllers ++
15 stars 38 forks source link

OpMode fails to start #77

Closed nitrocalcite closed 8 years ago

nitrocalcite commented 8 years ago

For some reason, on occasion our OpMode does not start running. After pressing init, waiting up to 1 minute, and then pressing the play button, the app simply sits there, not loading telemetry or performing any other actions. If a stop is attempted during this period, the phones go into the condition where there is a WiFi Direct connection, but the Driver Station is completely grayed-out, and refuses to obey any other commands.

The only solution we know of is to restart both the driver station and robot controller apps. It generally takes 2 to 3 tries to get the OpMode running. Once that happens, the OpMode never shows any problems, so we do not think the problem is a loose wire.

rgatkinson commented 8 years ago

Thanks for the report. Can I ask what release you're currently using? And might it be possible to have a look somewhere at the OpMode you're using?

nitrocalcite commented 8 years ago

We are using the code currently on the master branch here. We tried updating the FTC SDK to 1.6 Beta and that caused it to almost never work - in fact, we didn't move at all at last week's competition because of it.

The OpMode we are using relies entirely on other classes we've written. Under the OpModes folder, it's the one in the TeleOp folder named "TeleOpMaster", using the config file "Nothing" (found in the config folder). It pretty much just executes our TeleOp Helper class. We've added you as a collaborator to our repository so you can take a look.

I did a test with other, vastly simpler OpModes. The others seem to launch fine. Waiting longer for the TeleOpMaster OpMode to start didn't work; it starts quickly or not at all. Another symptom is that even when the OpMode launches, telemetry does not update, and we don't know why.

Thank you!

nitrocalcite commented 8 years ago

This afternoon, this problem randomly disappeared, only to reappear tonight more severely. No amount of restarts gets the OpMode to run. The logs revealed a little information - adb isn't working, sorry there isn't an actual log. After INIT finishes, the robot goes into "running" status. After processing "CMD_REQUEST_OP_MODE_LIST", the "RobotSetupRunnable.run()" thread is terminated. The "EventLoopRunnable.run()" thread then starts. However, then this entry happens: "V/RobotCore( 9984): Swerve: OpModeManager" $Stop$Robot$: stop ..." Some messages about switching to OpMode "Stop Robot" then follow. But, to all outer appearances the OpMode is still TeleOpMaster. Either way, our OpMode code is never executed.

We have a competition this Saturday, so any timely help would be greatly appreciated.

Thank you!

rgatkinson commented 8 years ago

I'd love to have a look, but I didn't see in you post where your code might be. Perhaps a hyperlink got missed somewhere?

rgatkinson commented 8 years ago

Hold that: i think i found it. Team 9791, right?

rgatkinson commented 8 years ago

Having no ADB makes this tricky, as it's really hard to get information off the phone without it.

I assume the neither Android Studio debugger nor the Android Studio "Android Monitor" are functional either (the latter has the log written to it in real time), as they just sit on top of ADB. They're non-functional, correct?

Is there an ADB.EXE in your windows (I'm assuming you're running windows) system folder? If so, delete it (or rename it to adb.old.exe if you are paranoid): the ZTE phone driver installs an old version there; was wondering if that's messing things up.

If ADB still isn't working, let's look in your Android SDK. ADB usually lives in (sdkroot)\platform-tools. It's comprised of adb.exe and two dlls, adbwinapi.dll and adbwinusbapi.dll. Are they both there? If you open a command prompt and go to that directory, and type 'adb devices' while your phone is plugged in over USB, what do you see?

rgatkinson commented 8 years ago

Another thought: is the phone configured for USB debugging correctly? Worth a look-see. Check Settings / Developer Options and look that 'USB debugging' is checked. In Settings / Connect to PC, is the 'Charge only' option selected (and also 'Enable USB debugging')?

rgatkinson commented 8 years ago

If none of that works, then we can try an alternate way of getting things off the phone. Go back to Settings / Connect to PC, and switch from 'Charge Only' to 'Media device'. When you do, Windows should alert you that there's a new device, and might prompt you what you want to do with it. In any case, you should be able to open an Windows File Explorer window and see the phone. If you drill into that, you should be able to see log files in the root. Those would be REALLY useful to have a look at.

rgatkinson commented 8 years ago

I should mention that the log files have names like com.qualcomm.ftcrobotcontroller.*. Just grab all you see.

While you're in there, in the FIRST folder are all the XML files that have your robot configurations. Yours are sufficiently complex that for me to try to reproduce the problem here (in mocked up form) it would be very handy if you could send all those XML files along as well.

rgatkinson commented 8 years ago

Hold the phone: you're using the Swerve Library, but you then went and did what, exactly, involving the FTC HQ 1.6 beta? That's a strong candidate as the root of the problem. The bits currently in the Swerve master branch (Swerve.Library.1.3.1) includes the functionality in the FTC HQ 1.6 beta; if you're using the former, won't need the latter.

rgatkinson commented 8 years ago

I've sync'd to your code, and I've run your TeleOpMaster OpMode. It's definitely starting. My hunch is that there's an exception being thrown somewhere that's causing it to shutdown almost immediately. Without the configuration XML, it's hard for me to track down where, but you might be able to better (isn't debugging fun?).

If you're debugger isn't working (I'll assume not, because you said that ADB was down (but then how are you downloading your code to the phone ???)), then one way to tackle it is a divide and conquer thing. You temporarily go about commenting out more and more parts of your code until the problem disappears, then when that happens, you look at the last chunk you commented more carefully, and move down to a finer grained level.

If you have a functioning debugger, you can find the root exception throwing location more quickly by setting a 'java exception breakpoint' for the kind of exception you're seeing (or just RuntimeException to start, to get most everything that's unexpected), then running your opmode under the debugger and see where it fires.

rgatkinson commented 8 years ago

That's at least a few ideas. Hopefully that will help at least a little more. Let me know what you find out.

nitrocalcite commented 8 years ago

Ok, thanks so much! I'll try those things as soon as possible. The XML config files are included in the repo under /YourCodeHere/config - the correct one is called "Nothing". We downloaded them from a time when ADB worked.

The time when we manually upgraded was a month ago or so. Since then, we have downloaded a zip of this repository, pasted the "YourCodeHere" folder into it, and that is our code currently.

nitrocalcite commented 8 years ago

There is an adb.exe in the windows folder. Renaming it causes adb to not work from the command prompt - adb connect [IP] says that adb is not a valid command.
adb connect is the command that hasn't been working. We download code using the USB cable. All the settings you mentioned regarding debugging are in order. I don't see the platform-tools folder in Android Studio - only bin, gradle, lib, license, and plugins.

Also, when you say "starting" the OpMode, does that mean the telemetry loads? When I say "not starting", I mean the telemetry doesn't load, presumably because as soon as the robot enters the "running" state, the robot controller automatically switches to "Stop Robot" for an unknown reason, without providing any external clues

nitrocalcite commented 8 years ago

Without an actual robot attached, the OpMode is now starting consistently. I don't have the robot with me right now. It's still running the exact same code that wasn't working last night. Yesterday morning, it worked attached to a robot as well as not attached. I haven't changed anything, so I'm _very _confused now.

Looking through the log on the phone, it looks exactly the same as when it didn't work, up until the "Swerve: OpModeManager" message that switches to "Stop Robot" - that message and the ones that follow (detailing the switch to Stop Robot) don't exist.

nitrocalcite commented 8 years ago

Connecting the phone to the computer and running "adb devices" will show the phone as a short alphanumeric string. When adb is working properly, the phone is also shown as an IP address, but that is not happening now.

cmacfarl commented 8 years ago

If you run adb tcpip 5555 followed by adb connect does wireless adb work for you? See http://developer.android.com/tools/help/adb.html#wireless

nitrocalcite commented 8 years ago

The internet here at my school is weird - my computer can connect to it but the phones won't. I'll test that when I get home.
(I'm not cutting class, I'm working in the passing time between classes/my study halls XD)

nitrocalcite commented 8 years ago

The stop message that is being generated is coming from the function on line 23 of SwerveOpModeManager.java, callActiveOpModeStop(). However, Android Studio is very confused. The function is written in yellow text, indicating that it is being used, but if I Alt+F7 it, no usages are returned. I also searched for instances of SwerveOpModeManager, which appears to never call that function.

What triggers cause callActiveOpModeStop() to be called? Clearly it is (or was) being called, but why are there no usages?

Thank you!

cmacfarl commented 8 years ago

Our team brought in an old surplus wifi router someone had sitting at home. We plugged it in and used it as a standalone network. e.g. No connectivity to the wider internet, but a nice little isolated network for our phones. Works well in situations where a school may have some odd policies/policers built into their network.

nitrocalcite commented 8 years ago

Hmm, yea. We run out of my garage - literally, since we are not affiliated with the school.

I can't seem to find any log files on the robot controller. I see the FIRST folder in root, which contains our config files, and an empty subfolder called "OpModeFolder". On the driver station, the only thing in root/FIRST/ is a blank config file. I have hidden files viewing turned on. Can't find anything under /Andriod/Data either

cmacfarl commented 8 years ago

Do you have a /sdcard directory?

nitrocalcite commented 8 years ago

No... Oh duh. Sorry, it's sitting right in root - a 4MB .logcat file. It's attached. On line 35980, there is an instance of the issue I was talking about above. INIT completes, and just as the robot starts running, callActiveOpModeStop() is called by something...

EDIT Here is the excerpt, first line here is 35980:

02-18 21:55:02.013 I/FIRST   ( 9269): ======= INIT FINISH =======
02-18 21:55:02.013 V/RobotCore( 9269): EventLoopManager state is RUNNING
02-18 21:55:02.023 I/FIRST   ( 9269): Robot Status: running
02-18 21:55:02.023 D/Swerve  ( 9269): state xtion: state=RUNNING
02-18 21:55:02.103 V/RobotCore( 9269): thread: ...terminating 'RobotSetupRunnable.run()'
02-18 21:55:02.103 V/RobotCore( 9269): thread: 'EventLoopRunnable.run()' starting...
02-18 21:55:02.163 I/FIRST   ( 9269): Processing Command: CMD_REQUEST_OP_MODE_LIST(10001) 
02-18 21:55:02.223 V/RobotCore( 9269): sending CMD_REQUEST_OP_MODE_LIST_RESP(12), attempt 0
02-18 21:55:02.243 I/FIRST   ( 9269): Processing Command: CMD_INIT_OP_MODE(10005) $Stop$Robot$
02-18 21:55:02.323 V/RobotCore( 9269): sending CMD_INIT_OP_MODE_RESP(34), attempt 0
02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: stop ...
02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: ... stop complete
02-18 21:55:02.613 I/RobotCore( 9269): Attempting to switch to op mode $Stop$Robot$
02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: init...
02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: ... init complete
02-18 21:55:02.953 D/dalvikvm( 9269): GC_CONCURRENT freed 1867K, 48% free 5769K/11092K, paused 3ms+4ms, total 43ms

After that, a slew of garbage collection messages similar to the last one here follow.

com.qualcomm.ftcrobotcontroller.zip

rgatkinson commented 8 years ago

callActiveOpModeStop() is a method in OpModeManager that OpModeManager uses to invoke the stop() method on an OpMode. It’s called when the system determines that the current OpMode needs to stop ☺, such as when the robot controller receives a command from the driver station instructing it to do so. The Swerve code overrides this method to do some internal bookkeeping, then stops the OpMode as before.

rgatkinson commented 8 years ago

Line 35980 has the sequence below (right?). That’s just the normal default opmode (called $Stop$Robot$ internally) being started up following the completion of the hardware scanning.

02-18 21:55:02.013 I/FIRST ( 9269): ======= INIT FINISH ======= 02-18 21:55:02.013 V/RobotCore( 9269): EventLoopManager state is RUNNING 02-18 21:55:02.023 I/FIRST ( 9269): Robot Status: running 02-18 21:55:02.023 D/Swerve ( 9269): state xtion: state=RUNNING 02-18 21:55:02.103 V/RobotCore( 9269): thread: ...terminating 'RobotSetupRunnable.run()' 02-18 21:55:02.103 V/RobotCore( 9269): thread: 'EventLoopRunnable.run()' starting... 02-18 21:55:02.163 I/FIRST ( 9269): Processing Command: CMD_REQUEST_OP_MODE_LIST(10001) 02-18 21:55:02.223 V/RobotCore( 9269): sending CMD_REQUEST_OP_MODE_LIST_RESP(12), attempt 0 02-18 21:55:02.243 I/FIRST ( 9269): Processing Command: CMD_INIT_OP_MODE(10005) $Stop$Robot$ 02-18 21:55:02.323 V/RobotCore( 9269): sending CMD_INIT_OP_MODE_RESP(34), attempt 0 02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: stop ... 02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: ... stop complete 02-18 21:55:02.613 I/RobotCore( 9269): Attempting to switch to op mode $Stop$Robot$ 02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: init... 02-18 21:55:02.613 V/RobotCore( 9269): Swerve: OpModeManager: $Stop$Robot$: ... init complete

rgatkinson commented 8 years ago

The below is really interesting. What it's showing is TeleOpMaster finishing the result of the INIT button being pressed, then being told to RUN, then, almost immediately, it gets another command, necessarily sent by the driver station (that's the only way they come in), to stop the robot by INIT ializing the $Stop$Robot$ OpMode.

02-19 09:12:18.611 V/RobotCore( 3756): Swerve: OpModeManager: Full TeleOp Control: ... init complete
02-19 09:12:18.611 V/Swerve  ( 3756): gamepad state: #0
02-19 09:12:18.621 D/Swerve  ( 3756): starting OpMode {TeleOpMaster}
02-19 09:12:18.621 D/FTC-9791( 3756): RobotHardware successfully got all motors
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): doPortIsReadyCallbackBeginWork ...
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): doModuleIsArmedWork ...
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): ... doModuleIsArmedWork complete
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): ... doPortIsReadyCallbackBeginWork complete
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): hooking ...
02-19 09:12:18.621 V/Swerve:i2cClient(USB AL00VCEI; port: 4)( 3756): ... hooking complete

02-19 09:12:19.321 E/FTC-9791( 3756): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 09:12:19.321 E/FTC-9791( 3756): ExtendedIMU chassisIMU now operating under soft failure...
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): doPortIsReadyCallbackBeginWork ...
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): doModuleIsArmedWork ...
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): ... doModuleIsArmedWork complete
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): ... doPortIsReadyCallbackBeginWork complete
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): hooking ...
02-19 09:12:19.321 V/Swerve:i2cClient(USB AL00VCEI; port: 3)( 3756): ... hooking complete

02-19 09:12:19.511 I/FIRST   ( 3756): Processing Command: CMD_RUN_OP_MODE(10239) Full TeleOp Control
02-19 09:12:19.511 V/RobotCore( 3756): Swerve: OpModeManager: Full TeleOp Control: start...
02-19 09:12:19.531 V/RobotCore( 3756): Swerve: OpModeManager: Full TeleOp Control: ... start complete
02-19 09:12:19.601 V/RobotCore( 3756): sending CMD_RUN_OP_MODE_RESP(265), attempt 0
02-19 09:12:20.021 E/FTC-9791( 3756): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 09:12:20.021 E/FTC-9791( 3756): ExtendedIMU cannonIMU now operating under soft failure...
02-19 09:12:21.101 I/FIRST   ( 3756): Processing Command: CMD_INIT_OP_MODE(10259) $Stop$Robot$
rgatkinson commented 8 years ago

Curious: aside from hitting the run button, nothing is happening on the driver station when the failures occur, right?

nitrocalcite commented 8 years ago

Yes! The problem you just described in that log is EXACTLY what we are experiencing intermittently!

Yes - to all appearances, the OpMode is running, but nothing actually happens.

rgatkinson commented 8 years ago

Sorry, by 'nothing actually happening' I meant that no one is pushing buttons / bringing up an OpMode list, etc. No USER input, other than joysticks.

nitrocalcite commented 8 years ago

Yes. In the past (a week or so), we looked at the logs and found a shutdown command from the Driver Station appear on the Robot Controller, even though no buttons were pushed on anything, including the joysticks. We just let it sit there. This caused the OpMode to be non-functional, as the robot controller would just shut down the robot, and not move any motors or servos EDIT: even if the joysticks were moved later(nothing actually happening). However, no errors or warnings were produced.

rgatkinson commented 8 years ago

My suspicions are tending towards the driver station code, though I admit to being baffled, as control flow logic really hasn't change there recently. Now i find myself wanting to look at logs from the driver station. Any chance that they might still be available?

nitrocalcite commented 8 years ago

I plugged in my Robot Controller phone to the robot, and the problem reappeared. I'm going to zip my logs and post them, as well as anything I can get off the Driver Station. I'll retry adb (school's over).

nitrocalcite commented 8 years ago

Here is a zip of both logs. The DS log was grabbed ~2 minutes before the RC one was. logs.zip

rgatkinson commented 8 years ago

I'll have a look, though my time today is running short, unfortunately.

Crazy thought: do you have a spare phone you can swap in as your driver station. If not, can you perhaps swap your robot controller and your driver station phones?

If some crazy-@ss hardware issue with the driver station is at the root of this (maybe its touch screen), that might help.

nitrocalcite commented 8 years ago

Ok...We have 3 phones - 2 from the KOP and one the organizers at the first competition graciously lent us trying to troubleshoot the problem. We are using, currently, their phone as the Driver Station and one of our phones as the robot controller.
Here's something perplexing - line 9103 in the DS log, App Stopped:

02-19 15:11:46.629 I/FIRST   ( 3697): Assuming client disconnected
02-19 15:11:46.629 I/FIRST   ( 3697): ui:uiRobotControllerIsDisconnected
02-19 15:11:47.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:48.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:49.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:50.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:51.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:52.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:53.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:54.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:55.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:56.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:57.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:58.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:59.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:11:59.449 V/RobotCore( 3697): There are 0 Wifi Direct Assistant Clients (-)
02-19 15:11:59.449 V/RobotCore( 3697): Disabling Wifi Direct Assistant
02-19 15:11:59.459 V/RobotCore( 3697): Stopping peer discovery
02-19 15:11:59.459 D/RobotCore( 3697): RobocolDatagramSocket is closed
02-19 15:11:59.459 D/RobotCore( 3697): RobocolDatagramSocket receive error: java.net.SocketException: Socket closed
02-19 15:11:59.459 V/RobotCore( 3697): thread: ...terminating 'RecvLoopRunnable.run()'
02-19 15:11:59.459 I/FIRST   ( 3697): App Stopped
02-19 15:12:00.219 E/ActivityThread( 3697): Performing stop of activity that is not resumed: {com.qualcomm.ftcdriverstation/com.qualcomm.ftcdriverstation.FtcDriverStationActivity}
02-19 15:12:00.219 E/ActivityThread( 3697): java.lang.RuntimeException: Performing stop of activity that is not resumed: {com.qualcomm.ftcdriverstation/com.qualcomm.ftcdriverstation.FtcDriverStationActivity}
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.app.ActivityThread.performStopActivityInner(ActivityThread.java:3136)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3223)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.app.ActivityThread.access$1100(ActivityThread.java:138)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1226)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.os.Handler.dispatchMessage(Handler.java:102)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.os.Looper.loop(Looper.java:136)
02-19 15:12:00.219 E/ActivityThread( 3697):     at android.app.ActivityThread.main(ActivityThread.java:5095)
02-19 15:12:00.219 E/ActivityThread( 3697):     at java.lang.reflect.Method.invokeNative(Native Method)
02-19 15:12:00.219 E/ActivityThread( 3697):     at java.lang.reflect.Method.invoke(Method.java:515)
02-19 15:12:00.219 E/ActivityThread( 3697):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:786)
02-19 15:12:00.219 E/ActivityThread( 3697):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:602)
02-19 15:12:00.219 E/ActivityThread( 3697):     at dalvik.system.NativeStart.main(Native Method)
02-19 15:12:00.459 V/RobotCore( 3697): closing logcat file /storage/emulated/0/com.qualcomm.ftcdriverstation
02-19 15:12:00.479 D/dalvikvm( 3697): GC_FOR_ALLOC freed 1050K, 20% free 9847K/12284K, paused 24ms, total 24ms
02-19 15:12:00.489 I/dalvikvm-heap( 3697): Grow heap (frag case) to 12.185MB for 524304-byte allocation
02-19 15:12:00.699 V/RobotCore( 3697): Killing PID 4996
02-19 15:12:00.719 D/dalvikvm( 3697): GC_FOR_ALLOC freed 625K, 24% free 9808K/12800K, paused 15ms, total 15ms
nitrocalcite commented 8 years ago

Thank you for all of your help!! Here is the corresponding log on the RC phone:

02-19 15:11:44.475 V/RobotCore(11806): sending CMD_INIT_OP_MODE_RESP(1753), attempt 10
02-19 15:11:44.585 W/RobotCore(11806): giving up on command CMD_INIT_OP_MODE_RESP(1753) after 11 attempts
02-19 15:11:46.525 D/dalvikvm(11806): GC_CONCURRENT freed 1703K, 42% free 6557K/11136K, paused 4ms+3ms, total 48ms
02-19 15:11:49.715 D/dalvikvm(11806): GC_CONCURRENT freed 1950K, 42% free 6553K/11136K, paused 2ms+4ms, total 56ms
02-19 15:11:52.765 D/dalvikvm(11806): GC_CONCURRENT freed 1937K, 42% free 6557K/11136K, paused 4ms+5ms, total 53ms
02-19 15:11:55.715 D/dalvikvm(11806): GC_CONCURRENT freed 1943K, 42% free 6555K/11136K, paused 3ms+4ms, total 49ms
02-19 15:11:58.835 D/dalvikvm(11806): GC_CONCURRENT freed 1939K, 42% free 6558K/11136K, paused 5ms+4ms, total 48ms
02-19 15:12:02.015 D/dalvikvm(11806): GC_CONCURRENT freed 1941K, 42% free 6553K/11136K, paused 3ms+4ms, total 41ms
02-19 15:12:05.265 D/dalvikvm(11806): GC_CONCURRENT freed 1939K, 42% free 6557K/11136K, paused 3ms+4ms, total 46ms
02-19 15:12:08.565 D/dalvikvm(11806): GC_CONCURRENT freed 1945K, 42% free 6554K/11136K, paused 3ms+2ms, total 46ms
02-19 15:12:11.765 D/dalvikvm(11806): GC_CONCURRENT freed 1940K, 42% free 6556K/11136K, paused 4ms+5ms, total 50ms
02-19 15:12:15.015 D/dalvikvm(11806): GC_CONCURRENT freed 1947K, 42% free 6555K/11136K, paused 4ms+4ms, total 56ms
02-19 15:12:18.115 D/dalvikvm(11806): GC_CONCURRENT freed 1942K, 42% free 6554K/11136K, paused 3ms+5ms, total 47ms
02-19 15:12:21.135 D/dalvikvm(11806): GC_CONCURRENT freed 1940K, 42% free 6555K/11136K, paused 4ms+4ms, total 48ms
02-19 15:12:24.195 D/dalvikvm(11806): GC_CONCURRENT freed 1939K, 42% free 6553K/11136K, paused 3ms+4ms, total 44ms
02-19 15:12:26.795 W/RobotCore(11806): Unable to send RobocolDatagram: java.net.SocketException: Socket is closed
02-19 15:12:26.795 W/RobotCore(11806):                RobocolDatagram - type:NONE, addr:null, size:0
02-19 15:12:26.795 I/RobotCore(11806): Battery Checker, Level Remaining: 91.0
rgatkinson commented 8 years ago

That error, while unfortunate to be there, is long standing and, I believe, unrelated.

nitrocalcite commented 8 years ago

The "App Stopped" message, or the other problem, which is that the RC gave up attempting CMD_INIT_OP_MODE_RESP at 15:11:44, and then at 15:11:46 the DS declared the RC dead?

rgatkinson commented 8 years ago

Here's one of the incidents on the robot controller. Command CMD_INIT_OP_MODE(10193) init'd the opMode (that's earlier in the log); CMD_RUN_OP_MODE(10212) ran it, and the inexplicable CMD_INIT_OP_MODE(10305) caused it to stop. The numbers here in parens are packet serial numbers. Numbers greater than 10000 originate from the driver station.

02-19 13:40:45.658 D/Swerve  (11806): starting OpMode {TeleOpMaster}
02-19 13:40:45.658 V/Swerve  (11806): gamepad state: #0
02-19 13:40:45.658 D/FTC-9791(11806): RobotHardware successfully got all motors
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): doPortIsReadyCallbackBeginWork ...
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): doModuleIsArmedWork ...
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... doModuleIsArmedWork complete
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... doPortIsReadyCallbackBeginWork complete
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): hooking ...
02-19 13:40:45.658 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... hooking complete
02-19 13:40:46.208 D/dalvikvm(11806): GC_CONCURRENT freed 1927K, 56% free 4941K/11136K, paused 3ms+3ms, total 40ms
02-19 13:40:46.358 E/FTC-9791(11806): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 13:40:46.358 E/FTC-9791(11806): ExtendedIMU chassisIMU now operating under soft failure...
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): doPortIsReadyCallbackBeginWork ...
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): doModuleIsArmedWork ...
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... doModuleIsArmedWork complete
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... doPortIsReadyCallbackBeginWork complete
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): hooking ...
02-19 13:40:46.358 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... hooking complete
02-19 13:40:46.808 I/FIRST   (11806): Processing Command: CMD_RUN_OP_MODE(10212) Full TeleOp Control
02-19 13:40:46.818 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: start...
02-19 13:40:46.818 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: ... start complete
02-19 13:40:46.818 V/RobotCore(11806): sending CMD_RUN_OP_MODE_RESP(842), attempt 0
02-19 13:40:47.058 E/FTC-9791(11806): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 13:40:47.058 E/FTC-9791(11806): ExtendedIMU cannonIMU now operating under soft failure...
02-19 13:40:49.068 D/dalvikvm(11806): GC_CONCURRENT freed 1945K, 56% free 4928K/11136K, paused 2ms+1ms, total 16ms
02-19 13:40:51.908 D/dalvikvm(11806): GC_CONCURRENT freed 1933K, 56% free 4932K/11136K, paused 2ms+3ms, total 20ms
02-19 13:40:54.758 D/dalvikvm(11806): GC_CONCURRENT freed 1937K, 56% free 4928K/11136K, paused 3ms+2ms, total 27ms
02-19 13:40:56.408 I/FIRST   (11806): Processing Command: CMD_INIT_OP_MODE(10305) $Stop$Robot$
02-19 13:40:56.408 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: stop ...
02-19 13:40:56.408 D/Swerve  (11806): stopping OpMode {TeleOpMaster}...
rgatkinson commented 8 years ago

The corresponding part of the driver station log is the following:

02-19 13:40:45.155 I/FIRST   ( 3697):     sending CMD_INIT_OP_MODE(10193), attempt: 0
02-19 13:40:45.165 I/FIRST   ( 3697): RC battery Telemetry event: 98.0
02-19 13:40:45.165 I/FIRST   ( 3697): Robot Battery Telemetry event: $no$voltage$sensor$
02-19 13:40:45.195 I/FIRST   ( 3697):  processing command: CMD_INIT_OP_MODE_RESP(815)
02-19 13:40:45.195 I/FIRST   ( 3697): Robot Controller initializing op mode: Full TeleOp Control
02-19 13:40:45.195 I/FIRST   ( 3697): ui:uiWaitingForStartEvent
02-19 13:40:45.195 E/FIRST   ( 3697): ### ERROR: System telemetry warning: unable to open Modern Robotics USB Servo Controller [AL00XLBY]; unable to open Modern Robotics USB DC Motor Controller [AL00VXRD]; unable to open Modern Robotics USB DC Motor Controller [AL00VXQQ]; unable to open Modern Robotics USB DC Motor Controller [AL00YCG8]; unable to open Modern Robotics USB Core Device Interface Module [AL00VCEI]; unable to open Modern Robotics USB DC Motor Controller [AL00VW3U]
02-19 13:40:45.315 E/FIRST   ( 3697): ### ERROR: System telemetry warning: unable to open Modern Robotics USB Servo Controller [AL00XLBY]; unable to open Modern Robotics USB DC Motor Controller [AL00YCG8]; unable to open Modern Robotics USB DC Motor Controller [AL00VXQQ]; unable to open Modern Robotics USB DC Motor Controller [AL00VXRD]; unable to open Modern Robotics USB DC Motor Controller [AL00VW3U]; unable to open Modern Robotics USB Core Device Interface Module [AL00VCEI]
02-19 13:40:46.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:46.475 I/FIRST   ( 3697): ui:uiWaitingForStopEvent
02-19 13:40:46.475 I/FIRST   ( 3697):     sending CMD_RUN_OP_MODE(10212), attempt: 0
02-19 13:40:46.485 I/FIRST   ( 3697): RC battery Telemetry event: 98.0
02-19 13:40:46.485 I/FIRST   ( 3697): Robot Battery Telemetry event: $no$voltage$sensor$
02-19 13:40:46.495 I/FIRST   ( 3697):  processing command: CMD_RUN_OP_MODE_RESP(842)
02-19 13:40:46.495 I/FIRST   ( 3697): Robot Controller starting op mode: Full TeleOp Control
02-19 13:40:46.495 I/FIRST   ( 3697): ui:uiWaitingForStopEvent
02-19 13:40:47.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:48.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:49.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:50.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:51.135 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:52.145 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:53.145 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:54.145 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:55.145 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 13:40:56.045 I/FIRST   ( 3697): ui:uiWaitingForInitEvent
02-19 13:40:56.075 I/FIRST   ( 3697):     sending CMD_INIT_OP_MODE(10305), attempt: 0
rgatkinson commented 8 years ago

Hmmm...that was a lengthy one, no immediate. Maybe it was a successful one that was manually stopped? But you see how I'm trying to track things down, I hope.

rgatkinson commented 8 years ago

Here's another pair:

02-19 15:10:40.575 D/Swerve  (11806): starting OpMode {TeleOpMaster}
02-19 15:10:40.575 D/FTC-9791(11806): RobotHardware successfully got all motors
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): doPortIsReadyCallbackBeginWork ...
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): doModuleIsArmedWork ...
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... doModuleIsArmedWork complete
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... doPortIsReadyCallbackBeginWork complete
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): hooking ...
02-19 15:10:40.575 V/Swerve:i2cClient(USB AL00VCEI; port: 4)(11806): ... hooking complete
02-19 15:10:40.855 D/dalvikvm(11806): GC_CONCURRENT freed 1926K, 54% free 5131K/11136K, paused 4ms+3ms, total 37ms
02-19 15:10:41.275 E/FTC-9791(11806): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 15:10:41.275 E/FTC-9791(11806): ExtendedIMU chassisIMU now operating under soft failure...
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): doPortIsReadyCallbackBeginWork ...
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): doModuleIsArmedWork ...
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... doModuleIsArmedWork complete
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... doPortIsReadyCallbackBeginWork complete
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): hooking ...
02-19 15:10:41.275 V/Swerve:i2cClient(USB AL00VCEI; port: 3)(11806): ... hooking complete
02-19 15:10:41.975 E/FTC-9791(11806): Error initializing ExtendedIMU: unexpected i2c device id: 0x00
02-19 15:10:41.975 E/FTC-9791(11806): ExtendedIMU cannonIMU now operating under soft failure...
02-19 15:10:42.565 I/FIRST   (11806): Processing Command: CMD_RUN_OP_MODE(10110) Full TeleOp Control
02-19 15:10:42.565 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: start...
02-19 15:10:42.565 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: ... start complete
02-19 15:10:42.595 V/RobotCore(11806): sending CMD_RUN_OP_MODE_RESP(1198), attempt 0
02-19 15:10:43.445 D/dalvikvm(11806): GC_CONCURRENT freed 1951K, 55% free 5119K/11136K, paused 3ms+1ms, total 29ms
02-19 15:10:45.275 W/RobotCore(11806): Unable to send RobocolDatagram: java.net.SocketException: Socket is closed
02-19 15:10:45.275 W/RobotCore(11806):                RobocolDatagram - type:NONE, addr:null, size:0
02-19 15:10:45.275 I/RobotCore(11806): Battery Checker, Level Remaining: 91.0
02-19 15:10:45.535 I/FIRST   (11806): Processing Command: CMD_INIT_OP_MODE(10142) $Stop$Robot$
02-19 15:10:45.535 V/RobotCore(11806): Swerve: OpModeManager: Full TeleOp Control: stop ...
02-19 15:10:45.535 D/Swerve  (11806): stopping OpMode {TeleOpMaster}...
02-19 15:10:45.535 D/Swerve  (11806): ...stopped
02-19 15:10:42.569 I/FIRST   ( 3697): ui:uiWaitingForStopEvent
02-19 15:10:42.589 I/FIRST   ( 3697):     sending CMD_RUN_OP_MODE(10110), attempt: 0
02-19 15:10:42.659 I/FIRST   ( 3697): RC battery Telemetry event: 91.0
02-19 15:10:42.659 I/FIRST   ( 3697): Robot Battery Telemetry event: $no$voltage$sensor$
02-19 15:10:42.759 I/FIRST   ( 3697):  processing command: CMD_RUN_OP_MODE_RESP(1198)
02-19 15:10:42.759 I/FIRST   ( 3697): Robot Controller starting op mode: Full TeleOp Control
02-19 15:10:42.759 I/FIRST   ( 3697): ui:uiWaitingForStopEvent
02-19 15:10:43.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:10:44.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:10:45.139 V/RobotCore( 3697): sending peer discovery packet(10000)
02-19 15:10:45.539 I/FIRST   ( 3697): ui:uiWaitingForInitEvent
02-19 15:10:45.549 I/FIRST   ( 3697):     sending CMD_INIT_OP_MODE(10142), attempt: 0
nitrocalcite commented 8 years ago

I think I understand. You are finding the log line detailing the spurious command, and then looking at the context in both DS and RC logs, correct?

rgatkinson commented 8 years ago

Yes. And I'm identifying in the robot controller log which packet number of a CMD_INIT_OP_MODE(nnnn) $Stop$Robot$ is the culprit, then looking for that being sent in the driver station log.

rgatkinson commented 8 years ago

Unfortunately, and I am sorry, but I'm REALLY running out of time here for the next six hours or so.

rgatkinson commented 8 years ago

Another thought: can you take your code back to the last system that you didn't have problems with and just go with that?

nitrocalcite commented 8 years ago

Ok, thank you for all your help! I'm talking to the other programmer on my team, we'll try to continue.

We aren't sure when this problem started. Previous versions going back a month or so are even worse, with TONS of random errors that disappeared when we got a fresh copy of this repo. We could try, at loss of functionality, or we could start creating simple test opmodes, and see if they have the same issues. If the issue is a random command though, that shouldn't matter...

Is this an underlying issue with the SDK? Should I open it over there?

rgatkinson commented 8 years ago

No need to open it over there; they already know about it :-) Thanks though.

nitrocalcite commented 8 years ago

We found an issue with initializing our IMUs, as well as inconsistencies within the .aars between our repo and this repo. For some reason the hashes of our .aars and the .aars here didn't match, so we fixed that. Then, there was something hanging inside the IMU initialization code: when we removed the IMUs, the problem disappeared. I'm not entirely sure of the details; I'll have to defer to @nrizzio for that one. It still doesn't explain why there was a shutdown command from the driver station, or the similarly inexplicable motor failure that happened at our competition. Several of our motors just randomly did not move, until they equally randomly turned on again. We don't have any logs of that, however, we will address these reliability issues in the offseason (now).

nrizzio commented 8 years ago

As far as I could tell, the issue with the driver station sending a stop command for no reason was caused by the mismatch of the .aar files, so it clearly isn't a problem with the Swerve library. I haven't been able to isolate the possible IMU initialization issue, so I'm refraining from opening any bug about it until I'm sure I know exactly where the issue lives.