commaai / openpilot

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system in 275+ supported cars.
https://comma.ai/openpilot
MIT License

Honda Nidec: camera fault caused PCM to fault during AEB #27961

Open rjsmith1999 opened 1 year ago

rjsmith1999 commented 1 year ago

Describe the bug

Issue seemed to start after I let openpilot brake to zero. I temporarily saw a warning on my comma 3, not sure what it was as I took over pretty quickly.

Everything seemed fine after that, but the “acc” indicator in my dash turned orange.

I drove for a while and then re-engaged openpilot. All seemed fine for a while. At some point later, openpilot gave an emergency warning: “cruise not enabled”, I think.

I took over, and continued driving. acc indicator was still orange, but I could no longer engage openpilot (nothing happened when I pressed set) (pcm stopped sending enable messages?)

At a stoplight, I restarted the car a few times. acc remained orange. A few other messages showed up but went away quickly, I think after my comma 3 went onroad. Those messages were “RDM unavailable” or something, “acc unavailable”, and an orange “brake system” indicator.

After driving for a while, the acc indicator turned green and I could use openpilot like normal.

Which car does this affect?

Honda Passport Elite 2021

Provide a route where the issue occurs

Will add routes when they finish uploading…

Relevant routes are:

3848aeae9b19e2c9|2023-04-18--17-45-10

3848aeae9b19e2c9|2023-04-18--18-00-40

3848aeae9b19e2c9|2023-04-18--18-01-00

openpilot version

6f35d23c062ab000af90f5a0d7704eab8682479d

Additional info

This drive was on my own repo that’s one (personal) commit ahead of the hash I pasted above. Sorry if that complicates things. Can try to reproduce on master.


My take on the situation, which could be way off: it seems like braking to zero caused a fault related to the ABS brake request. Maybe we can fix this, maybe it’s unavoidable…

At the very least, I think openpilot’s warning behavior could be improved in this situation.

Also, maybe if we’re in a fault situation like this where the pcm isn’t sending enable messages, we could read the button messages and send an openpilot unavailable alert (I think that’s possible?)
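A minimal sketch of that idea, assuming openpilot can see both the SET button press and the PCM's cruise-enabled state. The class name, field names, and the one-second timeout are hypothetical and are not taken from openpilot's code:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical timeout: how long to wait for the PCM to report cruise enabled
# after a SET press before concluding that it is refusing to engage.
ENABLE_TIMEOUT_S = 1.0


@dataclass
class EngageWatchdog:
    """Flags SET presses that the PCM never answers with an enable."""
    timeout_s: float = ENABLE_TIMEOUT_S
    pending_since: Optional[float] = None  # time of the unanswered SET press

    def update(self, t: float, set_pressed: bool, pcm_enabled: bool) -> bool:
        """Return True once when an 'openpilot unavailable' alert should fire."""
        if pcm_enabled:
            # The PCM engaged, so any pending press was answered.
            self.pending_since = None
            return False
        if set_pressed and self.pending_since is None:
            self.pending_since = t
        if self.pending_since is not None and t - self.pending_since >= self.timeout_s:
            # One-shot: clear the pending press so the alert does not repeat.
            self.pending_since = None
            return True
        return False
```

Calling `update` on every control-loop step would raise the alert once per unanswered press, instead of leaving the driver with no feedback.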

also.. no idea why I could engage once after the acc fault, but not subsequent times. 🤷

sshane commented 1 year ago

Stock AEB activated, so we blocked our own braking messages and started forwarding the camera's. I searched for a fault bit when you weren't able to enable after this, but I found nothing that matched very well. There's a bug, however: openpilot didn't consider it AEB, but panda was blocking our messages.

image

sshane commented 1 year ago

I found routes where similar events have happened with other cars and they did not fault like you describe, so this could be because when the AEB stopped, openpilot was requesting some brake and that could have violated some rate limit. There's a bit that goes high after this in 0x335, but it doesn't exactly match, so I'll try to reproduce the bit on our Nidec and get back to you.

Do you remember any alerts that started almost exactly 10 seconds later? That's when this bit rose.
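For anyone wanting to check this kind of timing on their own logs, here is a rough sketch of finding when a single bit first goes high after a reference time. The payloads, timestamps, and bit position below are synthetic and purely illustrative; they are not taken from the routes in this issue, and the actual bit in 0x335 is not identified in this thread:

```python
from typing import Iterable, Optional, Tuple


def first_rising_edge(samples: Iterable[Tuple[float, bytes]],
                      byte_idx: int, bit_idx: int,
                      after: float = 0.0) -> Optional[float]:
    """Return the timestamp of the first 0 -> 1 transition of one bit.

    samples: (timestamp_s, payload) pairs for a single CAN address, in time order.
    bit_idx: bit position within the byte, with 0 as the least significant bit.
    after:   ignore transitions that happen before this timestamp.
    """
    prev = None
    for t, data in samples:
        cur = (data[byte_idx] >> bit_idx) & 1
        if t >= after and prev == 0 and cur == 1:
            return t
        prev = cur
    return None


# Synthetic example: the bit is low until t = 110.0 s, then stays high.
low = bytes(8)
high = bytes([0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00])
samples = [(99.0, low), (100.0, low), (105.0, low), (110.0, high), (111.0, high)]

aeb_end_t = 100.0  # hypothetical time at which the stock AEB event ended
edge_t = first_rising_edge(samples, byte_idx=2, bit_idx=4, after=aeb_end_t)
delay_s = None if edge_t is None else edge_t - aeb_end_t
```

With real data, `samples` would come from the logged CAN messages for the address of interest.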

rjsmith1999 commented 1 year ago

> Stock AEB activated, so we blocked our own braking messages and started forwarding the camera's. I searched for a fault bit when you weren't able to enable after this, but I found nothing that matched very well. There's a bug, however: openpilot didn't consider it AEB, but panda was blocking our messages.

Got it. So it sounds like the stock AEB canceled cruise with the PCM, hence the "CRUISE NOT ENABLED" message? But AEB was short enough that openpilot didn't recognize it?

Does AEB typically cancel cruise? If I'm reading your plots right it looks like cruiseState/enabled goes low at the same time AEB begins.

> I found routes where similar events have happened with other cars and they did not fault like you describe, so this could be because when the AEB stopped, openpilot was requesting some brake and that could have violated some rate limit. There's a bit that goes high after this in 0x335, but it doesn't exactly match, so I'll try to reproduce the bit on our Nidec and get back to you.

Do we know what ECU/Message is associated with 0x335? I guess the answer is no, because I don't see this id in opendbc.

Does anything interesting happen with this bit in the final drive, when I can enable again? (I would try to take a look, but openpilot doesn't build on my Mac right now :)

> Do you remember any alerts that started almost exactly 10 seconds later? That's when this bit rose.

You're asking about 10 seconds after the stock AEB/disengage? I don't remember anything changing on the dash.

Everything was normal besides the orange acc indicator and those three warnings that showed up when I restarted.

sshane commented 1 year ago

Oh I see what could have caused this! The camera got into a faulted state and was setting its fault bits, but since we control longitudinal, we block those bits to the car. Once the AEB event happened, panda blocked openpilot's message and started forwarding the camera's message with the faulted signals, which the PCM or brake controller might've seen and started blocking subsequent engagement. It's just strange that the PCM doesn't have any signals of its own for this, but not unlikely, if it's a camera-driven state.

image

PR to start logging these cases where the camera is faulted: https://github.com/commaai/openpilot/pull/28338
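The forwarding behavior described above can be illustrated with a toy model. This is only a sketch of the explanation in this thread, not panda's or openpilot's actual implementation, and it is not what the linked PR does; every name here is hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrakeCommand:
    source: str   # "openpilot" or "camera"
    brake: float  # requested brake, arbitrary units
    fault: bool   # fault signal carried in the message


def forwarded_to_car(op_cmd: BrakeCommand, cam_cmd: BrakeCommand,
                     stock_aeb_active: bool) -> BrakeCommand:
    """Pick the brake message that reaches the car in this toy model.

    Normally openpilot's message replaces the camera's, so the camera's
    fault signal never reaches the car. During stock AEB, the camera's
    message is forwarded unmodified, fault signal included.
    """
    return cam_cmd if stock_aeb_active else op_cmd


def camera_fault_reaches_car(cam_cmd: BrakeCommand, stock_aeb_active: bool) -> bool:
    """True in the case worth logging: a faulted camera message is forwarded."""
    return stock_aeb_active and cam_cmd.fault


op = BrakeCommand(source="openpilot", brake=0.2, fault=False)
cam = BrakeCommand(source="camera", brake=1.0, fault=True)

normal = forwarded_to_car(op, cam, stock_aeb_active=False)
during_aeb = forwarded_to_car(op, cam, stock_aeb_active=True)
```

In this model the faulted signal stays hidden for as long as openpilot's message is the one being sent, which would be consistent with the fault only taking effect after the AEB event.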