Expanding nisprog to CAN?

fenugrec / nisprog

nisprog Nissan ECU reflashing tool

GNU General Public License v3.0

76 stars 17 forks source link

Expanding nisprog to CAN? #20

Open rimwall opened 2 years ago

rimwall commented 2 years ago

If you ever find cheap J2534 hardware that supports K+CAN, that might motivate me to look at j2534 support, but it's still a stretch. I find it difficult to justify spending tons of free time on this project.

I have been working through the ROM on the CAN comms for kernel upload and RAM Jump, so am ready to see if I can communicate with the ECU over CAN. I've been looking into options to expand nisprog to CAN. It will be a steep learning curve, but I'm hoping I can do the bulk of the work. Interested in your thoughts on these options...?

Option 1: Expand on the ELM327 capability already in freediag. The company ELM shuts down end of this month, so I imagine only clones will be available. Advantages: cheap (clone cables are ~$20), nisprog already has some elm code. Disadvantages: variability of ELM clones, CAN compliant only, not full J2534 (although only CAN is required to communicate with the ECU)

Option 2: Expand on the ELM327 capability already in freediag with the target hardware being STTN2230 based https://www.scantool.net/scantool/downloads/98/stn1100-frpm.pdf. Apparently compatible with ELM327. Advantages: relatively cheap (OBDLink SX USB to OBD cable is ~$50), possibly the same coding in nisprog as required for Option 1 meaning it could work for ELM clones (unreliably) and STTN2230 (reliably). Disadvantages: CAN compliant only, not full J2534 (although only CAN is required to communicate with the ECU)

Option 3: Utilise J2534 code from OpenVehicleDiag https://github.com/rnd-ash/Macchina-J2534 in combination with an open-source M2 https://www.macchina.cc/catalog. Advantages: J2534 compliant (mostly?). Disadvantages: no familiarity with rust (for me), CAN compliance is "partial", more expensive than above options but cheaper than most J2534 solutions, integration with nisprog may be difficult(?)

fenugrec commented 2 years ago

Looks like again, things would be easier on linux, thanks to socketcan, and this recently merged ISO-TP support : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e057dd3fc20ffb3d7f150af46542a51b59b90127

socketcan already supports a lot of different CAN hardware natively. But there is nothing even remotely similar on win* .

Interesting that the ELM327 would be discontinued. That's another nail in the coffin as far as I'm concerned; you would be entirely on your own to work and test that. For fun, go look at what was required just for iso9141/14230 : https://github.com/fenugrec/freediag/blob/master/scantool/diag_l0_elm.c Anything with a text-based interface like that is absolutely horrible to program for, especially when input/output formatting varies (and it does). If you aren't disgusted, then you're braver than me.

I'm vaguely familiar with macchina; if it does provide a J2534 API, then one doesn't need to know rust at all ? (assuming they produce a DLL or some kind of C bindings)

When I was working on nisprog more intensively, what I had in mind for future development was for part of freediag to become a J2534 driver (i.e. use a cheap USB-K line cable or what have you, and provide a J2534 API), and rewrite nisprog to use J2534 as a backend - either provided by this new freediag-j2534 layer, or any other commercial j2534 hardware and their driver.

As I understand, reflashing a CAN ECU (whether Nis or Sub) is quite a different process vs K-line , then there may be no advantage to reusing nisprog ? i.e. on Nissan CAN doesn't need a kernel, and Romraider apparently now can do it (with J2534 hardware). Plus, it's not like the CLI interface is a very popular feature : )

rimwall commented 2 years ago

reflashing a CAN ECU (whether Nis or Sub) is quite a different process vs K-line

After a brief glance, it looks similar. Here is the key part of the RR code from J2534NCS (for Nissan) which, as you say, relies on a J2534 cable. There appears to be no reliance on a kernel.

                                // Start programming
                                try {
                                    uds1085(channelId);
                                    uds2781(channelId);
                                    uds2782(channelId);
                                    if (!test) {
                                        uds3181(channelId);
                                        for (byte[] chunk : prgChunks) {
                                            uds34(channelId, chunk);
                                        }
                                        uds3182(channelId);
                                    }
                                    else {
                                        System.out.println("If not a Test, programming messages would appear here ...");
                                    }
                                    //String uds3483_t = "3132333435014106043954524B454D564E32000000000080010203040544414243444E4F4445780000010110115CFFFF";
                                    //String uds3483_t = "30303030303041060431434D43375150443430310000000000000000005230303030434F4E332B0619022000005CAD04"; // CRC = 312C
                                    String uds3483_t = "ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"; // !~CRC = B9BC
                                    uds3483_d = HexUtil.asBytes(uds3483_t);
                                    final ByteBuffer uds3483_bb = ByteBuffer.allocate(uds3483_d.length + 6).order(ByteOrder.BIG_ENDIAN);
                                    uds3483_bb.putShort((short) 0x3483);
                                    uds3483_bb.putInt(0x00000030);
                                    uds3483_bb.put(uds3483_d);
                                    //System.out.println(HexUtil.asHex(uds3483_bb.array()));
                                    short crc = codec.calcCrc(uds3483_bb.array());
                                    short invert = (short) ~crc;
                                    short flip = (short) (((invert << 8) | ((invert & 0xffff) >>> 8)) & 0xffff);
                                    final ByteBuffer uds3483_data = ByteBuffer.allocate(uds3483_bb.capacity() + 2).order(ByteOrder.BIG_ENDIAN);
                                    uds3483_data.put(uds3483_bb.array());
                                    uds3483_data.putShort(flip);
                                    if (test) System.out.println(String.format("UDS 3483 data: %s",
                                            HexUtil.asHex(uds3483_data.array())));
                                    if (!test) uds3483(channelId, uds3483_data.array());

The Subaru ROM I'm looking at has a CAN method that appears to be very similar to the K-Line approach, it's just that data goes in/out via CAN MBs. And a kernel is still required.

I stumbled across WSL. Maybe that is a way to get a SocketCAN based solution working on Windows... ? https://www.reddit.com/r/CarHacking/comments/ot3gjf/socketcancanutils_on_windows/ and https://github.com/microsoft/WSL/issues/5533

i.e. use a cheap USB-K line cable or what have you, and provide a J2534 API

My knowledge in this space is pretty thin, but is it actually possible to use a USB-K Line cable to do CAN comms? I had assumed the physical differences (eg) voltage levels would make it 'impossible'?

fenugrec commented 2 years ago

The Subaru ROM I'm looking at has a CAN method that appears to be very similar to the K-Line approach [...] And a kernel is still required.

Hm interesting. You're certain it's not just copying some of its builtin funcs to RAM then jumping to them ?

I stumbled across WSL. Maybe that is a way to get a SocketCAN based solution working on Windows... ?

It sounded like it was incomplete (no USB-CAN support) and/or very experimental ? I wouldn't know, I'm not on win* anymore. Either that or a regular VM and minimal linux distro might work... that means a lot of downloading for the initial install, but most people maybe wouldn't care.

i.e. use a cheap USB-K line cable or what have you, and provide a J2534 API My knowledge in this space is pretty thin, but is it actually possible to use a USB-K Line cable to do CAN comms?

No sorry, I was just talking about what I would have used for regular K-line ECUs . Until recently I never really considered adding CAN reflash to nisprog. But there is some J2534 hardware that does both K and CAN (same API). "Like the one I never finished designing"... sigh

I guess my points are :

for Nissan CAN ECUs , there's not much reason : -- can be done from RR now -- no kernel required and steps are very different : very little code would be shared
- for Subaru CAN ECUs : ok, if somehow we can use mostly the same kernel (doubtful), still : -- either need to add CAN support in freediag, if we want to use mostly the same diagl2* API (not sure if realistic) -- a lot of code in nisprog still assumes a K-line connection, like speeds, timeouts, fastdump, etc.

None of this is insurmountable but there are no easy solutions. And for some of them, lke I said, you would be on your own - ELM I am not touching; anything on WSL / native win* I am unable to test.

rimwall commented 2 years ago

After some digging and thinking, I think I will have a go using this fully open source option: https://fischl.de/usbtin/. It's nice that it's fully open source... USBtin support might also be a future development for RR which would then allow Nissan CAN flashing without an (expensive) J2534 cable.

You're certain it's not just copying some of its builtin funcs to RAM then jumping to them ?

Yes, it still requires a kernel. There is none of the encryption / decryption, but it retains the basic process of load a kernel into RAM, check the checksum, and then jump into the kernel.

Step 1 will be to write java code for USBtin to establish CAN comms with the ECU and enter the CAN bootloader. Step 2 will be to rejig npkern RX/TX for CAN comms. Pretty sure I'll be doing lots of recycle in Step 1 and 2!

fenugrec commented 2 years ago

Step 1 will be to write java code for USBtin to establish CAN comms with the ECU

I admire your bravery - not much of a java guy myself.

rejig npkern RX/TX for CAN comms

The main challenge IMO will be either implementing a minimal subset of iso15765 / iso-TP to be able to transfer multi-frame packets. Either that, or at least hopefully another existing simple protocol; else inventing a custom protocol with sequencing+reconstruction etc (gross).

cmd_parser.c is fairly hardcoded for serial comms but some of it could be split out - probably move all the command handlers and message / checksum builders to common code, and have entirely separate "comms parsers" for SCI or CAN - there's just too many differences to handle in one generic tree.

rimwall commented 2 years ago

not much of a java guy

Sadly neither am I. There are some open-source C# or python libraries shown on the USBtin webpage that others have done. But I thought I'd give java a go seeing as it might help with any future integration with RR. I'm open to other approaches.... bravery and stupidity are often two sides of the same coin!

implementing a minimal subset of iso15765 / iso-TP

It looks like the SH705x does most of the low level protocol stuff itself. The ECU code simply copies data to the relevant CAN mailbox and sets a register to TX the message (and the converse to RX a message). But, yes, a multi-frame outer loop of some kind will be required.

fenugrec commented 2 years ago

it might help with any future integration with RR

Talk to dschultz before writing any java ... he should be able to direct you to the right areas of RR to look at - if he hasn't already secretely started working on usbtin support...

It looks like the SH705x does most of the low level protocol stuff itself.

Well of course it gives you mailboxes, hardware filters etc, but you still need to manage frame ordering; bus off , no-ack and other error conditions ... the HCAN peripheral is fairly complex but all you're going to get from it is still just raw 8-byte frames.

rimwall commented 1 year ago

A little while in the making, but fully open-source CAN access to Subaru ECUs should now be possible using the open-source USBtin. See https://github.com/rimwall/USBtinSubProg/tree/USBtinSubProg_test and https://github.com/rimwall/npkern/tree/ssm_can_test

I have successfully done test dumps and writes to my bench ECU (Subaru 7058). Looking at ROM dumps from 2005 through to 2019, this should work for most (all?) Subaru CAN ECUs.

The kernel is derived from npkern (full credit to you). CAN comms are much more limited - I have cut down the scope of the comms to stumble across the line for providing minimal dump/flash capability. Almost all the changes are in cmd_parser.c

A couple of things I wonder:

Do Nissan ECUs have similar CAN bootloaders? I have seen this comment a couple of times on the forums, but I couldn't confirm it by looking in a Nissan ROM dump. This particular resource would suggest there is a common Denso CAN bootloader: https://tuning.energy/pcmflash-42-denso-sh705x-bootloader/
Happy to hear any feedback on the code before I 'release' it on RR. I have managed around timing conflicts via rather long delays in the kernel and SubProg. Not the best solution perhaps, but speed was not my priority. I'm a complete novice with Java so SubProg is probably rather pedestrian. But it may all get rewritten if it gets absorbed into RR.

fenugrec commented 1 year ago

Good work.

I've only skimmed the code so far, I'll have more questions later, but for a start

you're only checking mailbox 0 - have you disabled the other ones to ensure everything goes into 0 ? Or something I may have missed; not super familiar with the HCAN periph
have you tested this in a car which may have a much busier CAN bus
how does it behave if you disconnect the ECU from the CAN bus and reconnect ? A lot could happen, the comms could stall, HCAN could get stuck in Bus Off mode, etc...

rimwall commented 1 year ago

you're only checking mailbox 0 - have you disabled the other ones to ensure everything goes into 0 ?

The way the HCAN periph works is that you assign a CAN ID to a mailbox. Having done that, all messages with that ID are sent to that mailbox. During initialization, the ROM sets up only two IDs - CAN ID 0x21 on MB1_0 for TX/RX and CAN ID 0xFFFFE on MB0_0 for RX only. The ROM disables all others, although the other MBs wouldn't get involved unless they somehow got assigned the same CAN IDs.

have you tested this in a car which may have a much busier CAN bus

Not yet, that's next on the list, I'm just waiting for another OBD plug and pigtail to arrive. Although I don't think it will be a problem. The entry command uses CAN ID 0xFFFFE which will override every other CAN ID. And once in the bootloader, all normal CAN traffic stops. I can see this on my bench ECU because it pumps out many CAN messages prior to bootloader entry, but nil thereafter.

how does it behave if you disconnect the ECU from the CAN bus and reconnect ?

Hmmm, good question, I will try it and see. The Java error handling is currently rudimentary, this whole 'Exception' thing is new to me. One problem is that the 'catch' block can't seem to access the instance of the USBtin struct to shut down the connections. And if I try to make the instance of the USBtin struct global (like the other globals), the Java compiler complains. So I have to physically disconnect / reconnect at the USB port whenever there is an Exception. My hope is this will all get improved when integrated into RR.

rimwall commented 1 year ago

Tested it in a 2016 Subaru by simply poking some wires into the OBD connector. Suffered from losing some packets, perhaps because wiring to the ECU is now longer or perhaps the wires provide a poor connection. Could still load the kernel by slightly increasing the wait per packet, but kept losing a few packets in a 1MB ROM dump, even with the kernel idling longer between packets. I am going to try doing the data transfer the same way the flash loading is done (using pure data packets), and may also need to break it into smaller chunks and test checksum after each chunk.

The good news is the kernel load / run process worked just fine.

rimwall commented 1 year ago

1MB dump successful in a car. Had to break it down into 256 byte chunks and repeat chunks with missing packets. There was about 12 missed packets in 0.5MB. Not sure what causes the missed packets. I have no missed packets when dumping on the bench, so it must be something to do with an 'in car' situation. I guess the ECU is quiet (aside from the kernel) so it may be due to all the other components on the CAN bus (eg: ABS, TCU, displays etc). The kernel has a very low CAN priority (inherited from the ECU ROM initialization), so perhaps if I make the kernel's priority very high the problem will go away.

fenugrec commented 1 year ago

uses CAN ID 0xFFFFE which will override every other CAN ID

well that CANID is actually the second most low-priority ID, but if it has its own mailbox that should be fine.

Suffered from losing some packets

Which direction are you seeing packet loss ?

NPK_CAN.RXPR0.BIT.MB0 = 1;
memcpy(msg, (void *) &NPK_CAN.MB[0].MSG_DATA[0], 8);

I think you should be doing the memcpy before touching the RXPR bit (which essentially marks the mailbox as "emptied" if I understand right)

Does UMSR0 sometimes read as 1, indicating an RX overrun ?

if(!can_rx8bytes(currentmsg)) continue;

I think this test won't do what you want, since can_rx8bytes returns -1 "if no unread message available", so the rest of the while() block would run...

fenugrec commented 1 year ago

Also,

 static void can_cmd_dump(u8 *msg) {
   .......
    while (len) {
        ....
        can_tx8bytes(txbuf);
        len -= pktlen;
        addr += pktlen;
        can_idle(750);
    }

I think you should get rid of that call to can_idle . You already have can_tx8bytes waiting for the MB to be empty before putting in a new frame, and the usbtin driver should definitely be already quite capable of dealing with high traffic. If it isn't, then just hardcoding a delay like this is hardly a robust solution...

rimwall commented 1 year ago

Thanks again for the feedback / help!

that CANID is actually the second most low-priority ID

Ah, yes. High ID = Low priority. I had it the wrong way round. I guess it's still possible that ID 0x21 is occasionally getting overruled by a lower ID. That theory is supported by the lost packets only occurring in the car.

Which direction are you seeing packet loss ? I think you should get rid of that call to can_idle

RX from ECU to USBtinSubProg When I had no can_idle, the java program would successfully RX the first 20-30 packets and then would miss a whole bunch. This seemed like there was a buffer somewhere in the comms system that was getting overloaded or something in the system couldn't keep up the pace. But I don't know where that problem is, whether in the built java application, the USB interface, the USBtin or the ECU itself. And, yes, hard-coding the delay was a 'quick and dirty' solution to move past it and get success at 'proof of concept' stage. It does mean the whole process is slower than 500,000 bps would imply. But still fairly fast. I have an alternate approach now of dumping in 256 byte chunks, checking each chunk and repeating if bad or moving on (see repo updates). I haven't yet trialled with lower / zero can_idle values, but I think some sort of delay will still be required given what happened initially. Happy to hear ideas on different / better approaches?

TX from USBtinSubProg to ECU Initially I had similar issues with lost packets, but 'solved' it with the various wait() calls in USBtinSubProg. Another 'quick and dirty' solution, happy to improve it?

I think you should be doing the memcpy before touching the RXPR bit

I based this off the bootloader. And, yes, it also puzzled me it's done in this order. I didn't want to try a different order until other bugs were ironed out.

I think this test won't do what you want, since can_rx8bytes returns -1

Ah, good pick-up. I will change this so the continue is triggered for values of 0 or -1. I don't think this bug was the source of the above problems.

Does UMSR0 sometimes read as 1, indicating an RX overrun ?

I'm not sure, this isn't communicated in any way to the java application. I can do so if it's important?

fenugrec commented 1 year ago

Ah, yes. High ID = Low priority. I had it the wrong way round. I guess it's still possible that ID 0x21 is occasionally getting overruled by a lower ID.

Having a lower priority doesn't mean a packet will get lost - just means whoever is sending it will have to retry after losing arbitration. CAN hardware usually does this automatically unless specifically configured not to - there is a flag in HCAN for this, maybe check if it's set properly ?

RX from ECU to USBtinSubProg When I had no can_idle, the java program would successfully RX the first 20-30 packets and then would miss a whole bunch.

Hm. Do you have another can device you could hook up to sniff the bus to make sure those dropped packets are really being sent on the bus ?

Happy to hear ideas on different / better approaches?

You could do what higher-level protocols like iso15765 or canopen do, and put a sequence number in each frame (or a few bits of the can ID), and every N frame require an acknowledge from the receiver before continuing.

Btw, what is that protocol you implemented ?

TX from USBtinSubProg to ECU Initially I had similar issues with lost packets, but 'solved' it with the various wait() calls in USBtinSubProg. Another 'quick and dirty' solution, happy to improve it?

A more robust structure would be to IRQ on CAN mailbox RX, copy the frame to a FIFO from there, and drain that FIFO from cmd_loop .

RXPR before memcpy I based this off the bootloader.

What bootloader ? This really looks wrong.

Does UMSR0 sometimes read as 1, indicating an RX overrun ? I'm not sure, this isn't communicated in any way to the java application. I can do so if it's important?

Well if there's an RX overrun, you can be certain that at least one frame has been lost . Could be a problem in some cases, like receiving a block of data ?

rimwall commented 1 year ago

there is a flag in HCAN for this, maybe check if it's set properly ?

ok, checked the flag. DART is not set. So it should retransmit.

Do you have another can device you could hook up to sniff the bus

I only have the USBtin. Before I put in can_idle() I did get the kernel running in the ECU, terminate SubProg, start USBtinViewer and issued a dump command using USBtinViewer. I got the same result (ie) first 20-30 packets received ok and then some lost packets after that. The can_tx8bytes() function is so simple it must be TXing the data. So, it's not getting RX'd for some unknown reason.

put a sequence number in each frame (or a few bits of the can ID), and every N frame require an acknowledge from the receiver before continuing.

The new approach of transferring in 256 byte chunks is a little like this (see repo). I haven't added a frame number because that would reduce the number of data bytes I can send per frame. The new approach can handle these lost frames.

Btw, what is that protocol you implemented ?

The first two bytes duplicate how the bootloader works. After that, it's all made up.

A more robust structure would be to IRQ on CAN mailbox RX, copy the frame to a FIFO from there, and drain that FIFO from cmd_loop .

Mmm, ok. There was a FIFO buffer in the bootloader, but the RX didn't work on interrupts (RX was checked each loop cycle). So, I couldn't see how a FIFO buffer was of any use when 1 packet is being RXd and processed at a time.

What bootloader ? This really looks wrong.

The bootloader in the ECU ROM. I agree it looks weird, although it's been like that in Subaru ROMs since 2005. Here's the assembly. r14 has the value of 1, and 0xFFFFD042 (RXPR0) is definitely set before the data is copied from the MB.

        000003bc 24  e1           mov.w      r14 ,@r4=>DAT_ffffd042                           = ??
        000003be e5  00           mov        #0x0 ,r5
        000003c0 d3  31           mov.l      @(->Func_Copy_values_at_B+D_to_C_from_B_to_A+B   = 00000cf0
        000003c2 97  56           mov.w      @(Constant_0xD108 ,pc),r7=>DAT_ffffd108          = D108h
                                                                                             = ??
        000003c4 66  f2           mov.l      @r15 =>local_c ,r6
        000003c6 43  0b           jsr        @r3=>Func_Copy_values_at_B+D_to_C_from_B_to_A+B  undefined Func_Copy_values_at_B+
        000003c8 e4  08           _mov       #0x8 ,r4

So, a few things to try:

Now I'm using the 256 byte chunk approach, I'll try reducing the can_idle() duration and see what happens to overall speed -vs- lost packets.
try doing the TX from SubProg to ECU using CAN ID 0x21 (instead of 0xFFFFE) to increase the priority and see what happens
I will try re-ordering the setting of RXPR0 to see if that makes any difference.
maybe implement a FIFO and interrupt based RX

fenugrec commented 1 year ago

The bootloader in the ECU ROM.

Ok, I don't follow. You're saying there's a CAN bootloader in the subaru ROMs ? what is its purpose if not for reflashing ?

Mmm, ok. There was a FIFO buffer in the bootloader, but the RX didn't work on interrupts (RX was checked each loop cycle).

Interesting. FIFO helps when you expect to receive multiple frames in a burst and you know you won't have time to process all of them - so you try to empty the mailbox as fast as possible. If the host is capable of sending data way faster than your code can process, then you need some higher-level flow control (like your 256-byte chunks maybe) to force the sender to wait before sending more.

RXPR

Found this : https://en-support.renesas.com/knowledgeBase/18030795

Unlikely to be causing all your problems, but should be addressed.

Also, in the 7058 DS, figure 17.11, has a flowchart that confirms you should read the mailbox before writing 1 to RXPRn.

* maybe implement a FIFO and interrupt based RX

If you decide to go that route, I can recommend https://github.com/EliasOenal/etools . Here's a trimmed-down example, from another project I'm working on :

// allocate space for ecbuff struct as well as the raw data
static _Alignas(ecbuff) uint8_t fifo_sdo_rawbuf[sizeof(ecbuff) + (FIFO_SDO_LEN * CANFRAME_SIZE)];
static ecbuff *fifo_sdo = (ecbuff *) &fifo_sdo_rawbuf;

init_stuff() {
    ecbuff_init(fifo_sdo, (FIFO_SDO_LEN * CANFRAME_SIZE), CANFRAME_SIZE);
    ...
}

CAN_ISR() {
    struct canframe_ts cf;
    mbx_read(&cf);  //copy CAN frame + stuff into struct

    if (!ecbuff_write(fifo_nmt, &cf)) {
        ... handle overflow etc
    }
    ...
}

main_loop(void) {
    struct canframe_ts cf;

    while (ecbuff_read(fifo_sdo, &cf)) {
        process_sdo(&cf);
    }
}

fenugrec commented 1 year ago

Also found a snippet of code in a SH7047 appnote :

...
while((HCAN_RXPR0 & 0x0001) != 0x0001); /* Wait for reception end */

MBbuff = HCAN_MB0.MD7;  /* Store received data in on-chip RAM */
HCAN_RXPR0 = 0x0001;    /* Clear reception end flag (to clear, write 1) */
while(1);
...

rimwall commented 1 year ago

You're saying there's a CAN bootloader in the subaru ROMs ? what is its purpose if not for reflashing ?

Yes, there is a very minimal CAN based bootloader in many (all?) Subaru ECU ROMs (in the space up to 0x1000). It is for loading, checking and jumping to a RAM located kernel. No flashing, that must be done by the kernel.

Thanks for the other info. I’ll work on updates and trial them over the next few days.

rimwall commented 1 year ago

ok, I updated the TX and RX functions...

 static int can_rx8bytes(u8 *msg) {
    if(!NPK_CAN.RXPR0.BIT.MB0) return 0;
    if(!NPK_CAN.UMSR0.BIT.MB0) {
        memcpy(msg, (void *) &NPK_CAN.MB[0].MSG_DATA[0], 8);
        NPK_CAN.RXPR0.WORD = 1;
        return 1;
    }
    NPK_CAN.UMSR0.WORD = 1;
    return -1;
 }

 static void can_tx8bytes(const u8 *buf) {
    while (NPK_CAN.TXPR0.BIT.MB1) { };
    NPK_CAN.TXACK0.WORD = 2;
    memcpy((void *) &NPK_CAN.MB[1].MSG_DATA[0], buf, 8);
    NPK_CAN.TXPR0.WORD = 2;
    return;
 }

...and I still needed can_idle() for it to work. And then, I tried some other things, reversed them, reconnected my alligator clips, and it worked without can_idle(). So now I'm wondering if part of the issue is connection quality from alligator clips. The OBD plug arrives soon, so then I'll be able to use solder connections. This is all bench testing. Haven't tested in a car yet.

I can't figure out why the ROM would have a FIFO queue in conjunction with a simple message processing loop. If the loop is activated every time a message is received, and that message is immediately processed in the loop, then I can't see how any received messages would ever be queued. Unless I'm missing something...

If I were to implement interrupt based RX, given kernel uploading / dumping / flash uploading are usually many hundreds of packets, I'm not sure how a queue would help, unless it was enormous...?

fenugrec commented 1 year ago

} NPK_CAN.UMSR0.WORD = 1; return -1; }

Probably should clear RXPR too before returning -1 ?

I can't figure out why the ROM would have a FIFO queue in conjunction with a simple message processing loop. [...] then I can't see how any received messages would ever be queued. Unless I'm missing something...

I haven't seen their bootloader, but on a 500kbps link it takes about 220us to transmit an 8-byte frame (45 bits overhead + DLC 8, ignoring bit-stuffing), so that if any part of the processing loop takes more than 2220us, then one frame will occupy the MB but the next will be dropped. Depending on how the protocol is designed (i.e. is there flow control for the slow operations), this may or may not be necessary.

For example, if host triggers platf_flash_init , that's a fairly long operation. But that shouldn't be a problem because (as I understand) the host waits for the response before sending anything else. However, I'm less sure about can_cmd_flash_128bytes() - can the host send frames faster than you can process + memcpy them into the 128-byte buffer ? I don't know.

Again, that's where checking the UMSR bit is useful - the hardware will tell you if it dropped a frame. Why not have it set a flag that you can check with some other command, or a RAM read, or as a special error return code etc ?

For the K-line comms, I get away without a fifo because

I have a buffer that can hold a full iso14230 frame (up to 256 bytes payload)
host always waits for a response after every request
1 byte @ 62.5kbps = 160us , that's how long I have to parse each byte as they come in. Processing is fairly simple in iso_parserx() , although TBH I haven't measured the worst-case execution time for that.

True, maybe (hopefully) you don't need a FIFO. Only way to be sure is check UMSR - if it's sometimes set, then you definitely have problems. If not, it's no guarantee but maybe you'll be ok.

rimwall commented 1 year ago

Probably should clear RXPR too before returning -1

Yes, agreed, thanks, done.

I haven't seen their bootloader,

If you want to have a look, see attached. This is the first time I have tried to extract a portion of a Ghidra file, so hopefully this works. There are 3 files:

a 'Ghidra zip file' generated via the "Export" menu option, hopefully you can open this and see all the function naming etc. I've not opened one of these files before.
a *.c file generated by the decompiler
a *.hex file as the raw binary Hopefully my function naming / variable naming / comments will make it easy to follow. Note: I used Ghidra's 'Version Tracking' which is a sort of bin-diffing capability that is built in to Ghidra. It did a good job of transferring the function names, but didn't transfer any data-item data types or names. Happy to do this again if you know a better way of extracting a portion of a disassembly file along with all function names / data types / data labels etc.

Only way to be sure is check UMSR

I have added another error code for UMSR (see below), but it is never triggered. Even when I remove wait() from the host program so it pushes data too quickly for the kernel to cope. Instead I get other errors (eg) checksum errors. So, for whatever reason, it seems UMSR is never triggered.

    while (1) {

        int rv = can_rx8bytes(currentmsg);
        if(rv == 0) continue;

        if(rv == -1) {
            txbuf[0] = 0x7F;
            txbuf[1] = (currentmsg[1] & 0xF8) | 0x01;
            txbuf[2] = 0x36;   // unread message error
            can_tx8bytes(txbuf);
            continue;
        }

        if (loadingblocks) {

GhidraBootldr.zip