KevinOConnor / can2040

Software CAN bus implementation for rp2040 micro-controllers
GNU General Public License v3.0
Recovery when an error occurs #35

Closed. furusawata closed this issue 1 year ago

furusawata commented 1 year ago

Hello Kevin

I am writing a program that uses can2040. I intentionally generate errors during transmission and reception and check the resulting behavior. If an error occurs during transmission, can2040 does not recover.

I also tested it with Klipper, using the following command:

cangen can0 -g 2 -e -I 00EF0102 -L 8 -D 0000000000000000

Transmission stops when an error is generated on purpose.

[Image: scope_17]

The command stops and prints the following message:

write: No buffer space available

After this, I cannot recover without unplugging and replugging the USB cable.

Is there a way to restart can2040 from software?
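For readers unfamiliar with the command above: `-e` selects extended (29-bit) identifiers, `-I` sets the identifier, `-L` the data length, `-D` the payload, and `-g` the gap between frames in milliseconds. "No buffer space available" is the Linux error text for `ENOBUFS`, which a SocketCAN sender typically sees when the interface's transmit queue is full because frames are no longer leaving it. The following is a minimal sketch of the same frame being built and sent from Python, assuming a Linux host with a `can0` interface. The helper names `pack_extended_frame` and `send_with_retry` are hypothetical and are not part of can2040, Klipper, or can-utils.

```python
import errno
import struct
import time

CAN_EFF_FLAG = 0x80000000   # SocketCAN flag marking a 29-bit (extended) identifier
CAN_FRAME_FMT = "=IB3x8s"   # struct can_frame: can_id, length, 3 pad bytes, 8 data bytes


def pack_extended_frame(can_id, data):
    """Pack a classic CAN frame with an extended ID in SocketCAN's 16-byte layout."""
    if len(data) > 8:
        raise ValueError("classic CAN frames carry at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT,
                       (can_id & 0x1FFFFFFF) | CAN_EFF_FLAG,
                       len(data),
                       data.ljust(8, b"\x00"))


def send_with_retry(send, frame, retries=5, delay=0.01):
    """Call send(frame), retrying while the kernel reports ENOBUFS.

    Returns True once the frame is accepted, False if the queue stayed full.
    Any other OSError is re-raised unchanged.
    """
    for _ in range(retries):
        try:
            send(frame)
            return True
        except OSError as exc:
            if exc.errno != errno.ENOBUFS:
                raise
            time.sleep(delay)  # give the queue a chance to drain
    return False


# Usage on a Linux host (assumes an interface named "can0" exists):
#   import socket
#   sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
#   sock.bind(("can0",))
#   frame = pack_extended_frame(0x00EF0102, bytes(8))
#   ok = send_with_retry(sock.send, frame)
```

Retrying only helps when the queue drains on its own. If the adapter has stopped transmitting altogether, as reported here, `send_with_retry` keeps returning `False` and the cause has to be fixed on the adapter side.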

KevinOConnor commented 1 year ago

I'm not sure what you are reporting. If there is a situation where can2040 is not doing the right thing, I'd need to be able to set up a test environment here to reproduce the problem. Can you describe your test hardware and the steps needed to reproduce the problem?

-Kevin

furusawata commented 1 year ago

The following configurations have been tested.

[Image: system]

Packets are sent from A to C or from B to C.

The following is the waveform under normal conditions.

[Image: scope_20]

The PCAN-USB can generate an error on any bit.

[Image: PCAN]

When an error is generated on a frame from B to C, the frame is resent.

[Image: scope_21]

However, the frame is not resent when an error is generated on a frame from A to C.

[Image: scope_19]

Is there a problem with my can2040 usage? Is there a way for you to raise the error? Is there anything else to check?

KevinOConnor commented 1 year ago

Okay, I think I understand. You are reporting that if another node sends an error frame while can2040 is transmitting, then can2040 stops functioning.

I don't have a hardware setup readily available to reproduce this. However, I've seen lots of cases where can2040 correctly responds to an error frame. So, this sounds like a regression or something specific to the timing of the error frame.

What version of can2040 are you using?

Have you tried running an older version of can2040 to see if it has the same behavior (for example the v1.3.0 tag or v0.2.0 tag of can2040)?

Does injecting the error frame at a different bit location (e.g., bit 20 instead of 60) alter the outcome of the test?

I'll see if I can set up a test environment locally.

-Kevin
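To make the bit positions in this discussion easier to picture, here is a rough sketch that lays out the on-wire bits of a classic CAN extended data frame, including the CRC-15 sequence and stuff bits, and labels each bit with its field. It follows the standard frame layout and is not taken from can2040. The function names are hypothetical, and the thread does not say how the test tool numbers bits (for example, whether stuff bits are counted or whether counting starts at 0 or 1), so the indices here may not line up with the tool's.

```python
CAN_CRC15_POLY = 0x4599  # generator polynomial of the classic CAN CRC-15


def int_bits(value, width):
    """Return value as a list of bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def crc15(bits):
    """Bitwise CAN CRC-15 over the unstuffed bits from SOF through the data field."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc


def extended_frame_fields(can_id, data):
    """Return (field name, bits) pairs for SOF through the CRC sequence."""
    can_id &= 0x1FFFFFFF
    fields = [
        ("SOF", [0]),
        ("base ID", int_bits(can_id >> 18, 11)),
        ("SRR", [1]),
        ("IDE", [1]),
        ("extended ID", int_bits(can_id & 0x3FFFF, 18)),
        ("RTR", [0]),
        ("r1/r0", [0, 0]),
        ("DLC", int_bits(len(data), 4)),
        ("data", [b for byte in data for b in int_bits(byte, 8)]),
    ]
    covered = [b for _, bits in fields for b in bits]
    fields.append(("CRC", int_bits(crc15(covered), 15)))
    return fields


def wire_bits(fields):
    """Return (field name, bit) per on-wire bit, with stuff bits inserted."""
    out = []
    run_bit, run_len = None, 0
    for name, bits in fields:
        for bit in bits:
            out.append((name, bit))
            if bit == run_bit:
                run_len += 1
            else:
                run_bit, run_len = bit, 1
            if run_len == 5:
                # After five equal bits the sender inserts one opposite bit,
                # which itself starts the next run.
                run_bit, run_len = bit ^ 1, 1
                out.append(("stuff", run_bit))
    # The tail is sent recessive by the transmitter and is never stuffed.
    tail = [("CRC delimiter", 1), ("ACK slot", 1), ("ACK delimiter", 1)]
    tail += [("EOF", 1)] * 7
    return out + tail


if __name__ == "__main__":
    wire = wire_bits(extended_frame_fields(0x00EF0102, bytes(8)))
    for index in (20, 60):
        print(index, wire[index][0])  # 0-based index into the stuffed bit stream
```

With an all-zero payload, as in the `cangen` command earlier in the thread, the data field contains many stuff bits, so the stuffed frame is noticeably longer than the 128 bits of the unstuffed layout.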

furusawata commented 1 year ago

I tried changing the bit location. With the error at bit 101 or lower, the frame was not resent; with the error at bit 102 or higher, it was resent.

[Image: bit_101_102]

There was no difference between versions.

Regards,

KevinOConnor commented 1 year ago

You have very nice test equipment.

I was able to reproduce this problem locally. It appears the can2040 code has a defect in how it detects "tx conflict" conditions.

I have a proposed fix at #36.

Thank you for reporting this issue.

-Kevin
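As background on why a transmitter has to watch the bus while sending, here is a toy model of the general CAN rule: the bus behaves as a wired-AND, a transmitter reads back every bit it sends, and a mismatch outside arbitration means the frame must be abandoned and sent again. This is a generic illustration only. It is not can2040's code, it does not describe the change in #36, and it ignores arbitration, error flags, and stuff-bit checks. All names are hypothetical.

```python
def bus_level(*drivers):
    """Wired-AND bus: a dominant bit (0) from any node overrides recessive (1)."""
    return min(drivers)


def first_conflict(frame_bits, injected):
    """Return the index of the first bit whose read-back differs from what was sent.

    `injected` is the set of bit indices at which another node drives dominant.
    Returns None if the transmitter reads back exactly what it sent.
    """
    for index, bit in enumerate(frame_bits):
        other = 0 if index in injected else 1
        if bus_level(bit, other) != bit:
            return index
    return None


def transmit_with_retry(frame_bits, injected_per_attempt, max_attempts=3):
    """Resend the frame until one attempt completes without a conflict.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(max_attempts):
        if attempt < len(injected_per_attempt):
            injected = injected_per_attempt[attempt]
        else:
            injected = set()
        if first_conflict(frame_bits, injected) is None:
            return attempt + 1
    return None
```

In this model a dominant bit injected while the transmitter is itself sending dominant goes unnoticed at that bit, since the read-back matches. Only a recessive bit that reads back dominant counts as a conflict. On a real bus, the stuff rule catches the remaining cases.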

furusawata commented 1 year ago

Thanks for the fix. I confirmed that the frame is now resent.

Could you also fix work-txtiming-20230313 (#32)?

KevinOConnor commented 1 year ago

Great.

Sure, I rebased #32 on top of #36.

-Kevin

furusawata commented 1 year ago

Thank you!

KevinOConnor commented 1 year ago

I committed PR #36.

-Kevin