fenugrec / npkern

Nissan / Infiniti ECU reflashing kernel, for use with nisprog
GNU General Public License v3.0

Could this work for a different ECU using SH7055? #15

Open farmdve opened 1 year ago

farmdve commented 1 year ago

From my research so far, it seems you need a bootloader to change ECU code or maps. My ECU is from a 2005 Volvo S60; it communicates over CAN (ISO 15765), typically through a J2534 device.

Anyway, in order to flash ECU changes or maps, I definitely need a bootloader(I am guessing this kernel is exactly this). Could this kernel work in my specific case?

fenugrec commented 1 year ago

npkern has been ported to some Subaru and Mazda ECUs (work in progress, I believe). I'm waiting for the people working on those ports to make some progress and bring their changes into npkern, to make it more generic and customizable.

So yes, possible, but you need a good (better) understanding of what your ECU needs. I.e. some ECUs like the more recent Nissan ECUs don't need npkern at all.

farmdve commented 1 year ago

As far as the work so far has shown, there are CAN commands to send a bootloader, transfer control to it, and send the bootloader data to write to memory. However, that bootloader is for Volvo's Bosch ME7/ME9 ECUs. The communication structure is the same for the Denso SH7055 and the commands are largely the same, but the ME7/ME9 bootloader is written for a different MCU, so it doesn't apply to my use case; otherwise I would have used it.

fenugrec commented 1 year ago

there are CAN commands to send a bootloader, transfer control to it and send the bootloader data to write

Ok good, I wanted to clarify your terminology. "Bootloader" sometimes refers to the part of the stock ROM that accomplishes the same thing, or to the built-in low-level MCU functions that you would use to unbrick or do an initial flash.

Another thing you need to figure out is what kind of housekeeping npkern needs to do to keep the rest of the ECU hardware happy. For Nissan / Subaru, that is toggling a pin at ~ 250Hz.

farmdve commented 1 year ago

I have a question. I know these EEPROMs and Flash chips have limited write-erase cycles. Does changing only a few bytes use up a cycle just for those bytes, or for the whole chip every time?

fenugrec commented 1 year ago

From the README here

- These SH705* ICs are typically rated for about hundred re-write cycles, beyond which Flash retention may degrade.
Experience seems to indicate the actual endurance can be significantly higher, but as with any Flash memory, excessive re-writes should be avoided (for example, live-tuning applications)

EEPROMs are typically way more durable than that. For 705x it would be a per-block count, e.g. whatever you need to erase before reflashing.
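To make the per-block point concrete, here is a minimal host-side C sketch of wear accounting. It is not taken from npkern: `BLOCK_SIZE`, `NUM_BLOCKS`, `erase_count` and `note_rewrite` are hypothetical names, and the uniform block size is an assumption for illustration only (the real 705x erase-block map should be taken from the hardware manual).

```c
#include <stdint.h>

/* HYPOTHETICAL uniform erase-block layout, for illustration only.
 * The real 705x parts have their own block map. */
#define BLOCK_SIZE 0x1000u
#define NUM_BLOCKS 16u

/* One wear counter per erase block. */
static unsigned erase_count[NUM_BLOCKS];

/* Record one erase cycle on every block overlapped by [addr, addr + len).
 * Returns the number of blocks that had to be erased. */
static unsigned note_rewrite(uint32_t addr, uint32_t len)
{
    if (len == 0)
        return 0;

    uint32_t first = addr / BLOCK_SIZE;
    uint32_t last = (addr + len - 1) / BLOCK_SIZE;
    unsigned erased = 0;

    for (uint32_t b = first; b <= last && b < NUM_BLOCKS; b++) {
        erase_count[b]++;   /* only the touched blocks wear */
        erased++;
    }
    return erased;
}
```

Under these assumptions, changing 4 bytes that straddle a block boundary costs one cycle on each of the two blocks involved and none on the rest of the chip.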

farmdve commented 1 year ago

I researched the code a bit more. I don't see references to external flash memory selected by the CS0-3 signals. Does this mean that Subaru does not store the maps in external flash memory(such as 29lv200bc) but directly on the built-in ROM of the 7055 MCU? If that is the case - I will need to write a kernel to access the flash chip for my use case, as this kernel only supports changing the ROM itself on the 7055 built-in memory, if I understood correctly.

fenugrec commented 1 year ago

There is no external flash on any of the ECUs I'm familiar with (Nissan gasoline Denso ECUs ~ 2001-2015 and wherever similar Denso units are used, e.g. Subaru, some Mazda) so yes, it's all in the built-in ROM of the 705x

farmdve commented 1 year ago

I understand. Sadly that is not the case for me: all Volvo Denso ECUs (and they were used in production only for a short while) use a flash memory chip to hold the maps, separate from the control logic of the ECU. Of course I am more than willing to try to write some code for that.

One last question I have is, can you tell me how you conducted your testing/debugging of the kernel code? I am doing my work all on the bench with a test ECU so even if I brick my ECU I do not lose much, but I would still prefer to not brick my test unit just so I can work on it. I know of HEW, would that be useful to have?

Tools I have: a 2-channel 100 MHz, 1 GSa/s oscilloscope with 40k memory (it was a cheap scope), a soldering station with hot air (if need be), and things like Arduino and Raspberry Pi. Sadly I have no logic analyzer, should one be needed.

Thank you.

fenugrec commented 1 year ago

all Volvo Denso ECUs [...] use a flash memory chip to hold the maps

Ah, interesting. On the plus side that means you should be safer from bricking the ECU.

One last question I have is, can you tell me how you conducted your testing/debugging of the kernel code?

Bench ECU, and a few times I used an AUD interface to monitor RAM locations. Also used HEW simulator to develop the initial startup assembly code that does a bit of tricky work to move the payload in RAM. Never had to unbrick but that only needs minimal hardware (5V uart to PC, 240Hz square wave generator and a handful of parts - https://nissanecu.miraheze.org/wiki/Bootmode )

You will probably not need much else; if you are able to get npkern to run in RAM and respond to commands, that is IMO the hard part. You can also compile out the reflash functions to obtain a sterile kernel that can't modify the 7055 firmware.

farmdve commented 1 year ago

So, I also have something that causes the ECU to reset. I actually ran my own little main stub (a while loop writing to a single RAM location of my choosing) to ensure my code upload works. Unfortunately, as soon as I execute it, the ECU resets, and furthermore the RAM locations where I wrote my little 0xDEADBEEF constant get erased. That, or my code never ran in the first place; I can't be certain what happens to the state.

By setting up my own vbr and populating it with dummy ISRs, could I at least ensure my code actually works or would the reset wipe everything clean and read the VBR at addresses specified in the READONLY on-chip ROM?

fenugrec commented 1 year ago

So you're at the "hello world" step...

need to find out what is resetting the mcu.

you can either

And you will need a way to get early life signs from your code, either a random IO pin that you toggle and look at with a scope / logic analyzer, sending one special CAN frame, whatever.

farmdve commented 1 year ago

I might have to bust out the scope.

Anyway, my compiled code roughly looks like this(just an excerpt) where I tried to disable the internal WDT by setting the TME bit to 0 as per the docs.

RAM:FFFF810C                 mov.w   #WDT_TCSR[RW]_B, r1
RAM:FFFF810E                 mov.w   #unk_FFFFA5EF, r2
RAM:FFFF8110                 mov.w   r2, @r1

I was only writing to this register and didn't care about the original value at all so I directly wrote this value.

Anyway, the above code also causes a reset, provided it did anything at all and I wasn't raising some exception for some reason.

If my code did indeed disable the WDT, then an external signal is resetting the chip. Quite possible because on my Denso ECU there is one mysterious chip whose function is unknown, called SE412.

As for searching for said code, I found that gcc creates some really nasty code where only one address is loaded in a register and incremented by some amount dynamically to create the next address. But so far my searches have discovered these three functions that directly write to the TCNT register https://i.imgur.com/Gy6zgVi.png Second I

I have to say, kudos to you for figuring out the ATU thing from the docs. It's by far I think the most difficult and densest part of the docs and chip. With overlapping register names to other modules on the chip and so forth.

fenugrec commented 1 year ago

RAM:FFFF810C                 mov.w   #WDT_TCSR[RW]_B, r1
RAM:FFFF810E                 mov.w   #unk_FFFFA5EF, r2
RAM:FFFF8110                 mov.w   r2, @r1

Umm I recommend double-checking that; and you should probably be disabling it via RSTCSR instead.

As for searching for said code, I found that gcc creates some really nasty code where only one address is loaded in a register and incremented by some amount dynamically to create the next address

Ghidra is much better at that than the version of IDA I have (7.4). Not sure what you're using there. Consider trying "findrefs" from my https://github.com/fenugrec/nissutils repo.

I have to say, kudos to you for figuring out the ATU thing from the docs. It's by far I think the most difficult and densest part of the docs and chip

That peripheral is indeed a monster. You'll notice I use it in the simplest fashion possible. But "most difficult" - I disagree, I would give that award to the 350nm reflash procedure in the ROM chapter !

farmdve commented 1 year ago

It seems like you also feel the same way about how GCC generated the 'devious' instructions. And yes I mostly use IDA 7.3. I have indeed seen IDA disassemble some instructions incorrectly which Ghidra disassembled correctly, but so far IDA was faster in loading the files and a little bit more legible.

Anyway, why do you feel the above instruction looks weird? My reasoning was if the watchdog timer never increments, it can never overflow.

I also hooked my oscilloscope to the SH7055, and saw that the RES pin(number 58) was being driven low when I try to execute my code. Unfortunately I could not verify from the docs if an external circuit is driving the RES pin low or if the internal circuitry is temporarily driving the pin low between states. The duration of the RESET is 20ms.

Another thing I could potentially do is hookup my scope to the WDTOVF signal pin and see if my attempts to disable the pin(via the WDT configuration addresses) work by measuring the pin state. If I disable the pin I should never see a signal outputted here.

fenugrec commented 1 year ago

Anyway, why do you feel the above instruction looks weird? My reasoning was if the watchdog timer never increments, it can never overflow.

I mean : you're writing A5EF which doesn't look right. And I think you should be disabling the internal WDT via the RSTCSR register anyway.

farmdve commented 1 year ago

I can explain the value. 0xA5 in the upper byte is required to access the TCSR register, as the same address overlaps with TCNT. And EF is basically writing 0 to the TME bit. I was originally going to AND the original value and clear just the TME bit, but figured it would be pointless and just overwrote everything. It also lets me verify the correctness of the code at a glance in IDA.

fenugrec commented 1 year ago

0xA5 in the upper byte is required to access the register TCSR

Yes I recognize that, I used similar (https://github.com/fenugrec/npkern/blob/master/pl_flash_7055_350nm.c#L104 ). Look at the datasheet again - 0xEF writes a 1 to OVF, WT/IT, and TME bits.
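As a quick host-side check of that point, here is a minimal C sketch that decodes the low byte of a TCSR write word. The bit masks are an assumption based on what is said in this thread (OVF, WT/IT and TME being the top three bits of TCSR) and should be verified against the SH7055 hardware manual; the macro and function names are made up for illustration and are not from npkern.

```c
#include <stdint.h>

/* ASSUMED TCSR bit positions; verify against the SH7055 manual. */
#define TCSR_OVF   0x80u   /* overflow flag */
#define TCSR_WTIT  0x40u   /* watchdog / interval timer select */
#define TCSR_TME   0x20u   /* timer enable */
#define TCSR_KEY   0xA500u /* upper byte required for a TCSR write */

/* Nonzero if the write word would leave the timer enabled. */
static int tcsr_word_sets_tme(uint16_t word)
{
    return (word & TCSR_TME) != 0;
}

/* Build a keyed write word from a current TCSR value with only TME cleared. */
static uint16_t tcsr_word_tme_off(uint8_t current)
{
    return (uint16_t)(TCSR_KEY | (uint8_t)(current & ~TCSR_TME));
}
```

With these masks, `tcsr_word_sets_tme(0xA5EF)` is nonzero, so 0xA5EF keeps the timer running, while `tcsr_word_tme_off(0xEF)` yields 0xA5CF. Whether TCSR is the right register to use at all is a separate question (see the RSTCSR remark above).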

farmdve commented 1 year ago

Ah, I saw my mistake now. I got the bit positions wrong, I'll fix that.

Once I have a working version for Volvo, I plan to make a pull request to npkern if that is alright with you?

fenugrec commented 1 year ago

Once I have a working version for Volvo, I plan to make a pull request to npkern if that is alright with you?

Yes absolutely. If you need to change common code, please keep those changes in separate commits to make review easier.

farmdve commented 1 year ago

Someone was kind enough to provide me with an original kernel (in binary format), and it took me a lot of brainstorming and reading of the code.

I did some experiments. An empty while loop actually works: nothing triggers a reset of the ECU. My previous code, where I set the pointer to execute the uploaded code, was wrong, so I fixed the uploading code. But now, as soon as I attempt to write to a memory location, the ECU resets. And I think I know why.

Here is my code:

(*(volatile unsigned int *)0xFFFF9200) = 0xdeadbeef;

However, notice anything strange in the disassembly?

RAM:FFFFA000 ; Segment type: Regular
RAM:FFFFA000                 .section RAM, UNK
RAM:FFFFA000                 mov.l   #h'DEADBEEF, r2
RAM:FFFFA002                 mov.w   #h'FFFF9200, r1
RAM:FFFFA004                 mov.l   r2, @r1
RAM:FFFFA006                 rts
RAM:FFFFA008                 nop
RAM:FFFFA008 ; ---------------------------------------------------------------------------
RAM:FFFFA00A word_FFFFA00A:  .data.w h'9200          ; DATA XREF: RAM:FFFFA002↑r
RAM:FFFFA00C dword_FFFFA00C: .data.l h'DEADBEEF      ; DATA XREF: RAM:FFFFA000↑r

No matter what I do, gcc seems to treat my address as 0x9200, that's definitely wrong. This generated code is attempting to write directly to the flash chip, it's a given it would generate an error and reset the chip.

I compared my code to the original Volvo kernel: it definitely loads RAM addresses with mov.l, as 4-byte longwords rather than 2-byte words.

fenugrec commented 1 year ago

RAM:FFFFA002                 mov.w   #h'FFFF9200, r1
RAM:FFFFA004                 mov.l   r2, @r1
RAM:FFFFA006                 rts
RAM:FFFFA008                 nop
RAM:FFFFA008 ; ---------------------------------------------------------------------------
RAM:FFFFA00A word_FFFFA00A:  .data.w h'9200          ; DATA XREF: RAM:FFFFA002↑r

No matter what I do, gcc seems to treat my address as 0x9200, that's definitely wrong.

no it's fine, it's mov.w sign-extension
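To illustrate the sign-extension point, here is a minimal host-side C sketch of what a 16-bit literal load does to the stored word h'9200. The function name `sign_extend16` is made up for illustration; it only models the arithmetic, not the actual instruction.

```c
#include <stdint.h>

/* Model of a sign-extending 16-bit load: if bit 15 of the literal is set,
 * the upper 16 bits of the 32-bit result are filled with ones. */
static uint32_t sign_extend16(uint16_t literal)
{
    if (literal & 0x8000u)
        return 0xFFFF0000u | literal;
    return literal;
}
```

So `sign_extend16(0x9200)` gives 0xFFFF9200, the intended RAM address, even though only two bytes are stored in the literal pool. A literal with bit 15 clear, such as 0x1234, stays 0x00001234.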