Revised lkr and start files for Subaru SH7058

rimwall commented 2 years ago

Hi, I've checked a few ROMs from a similar timeframe (~2005) and they all seem to use the same RAM Jump address of 0xFFFF3004. The normal SP (0xFFFFBFA0) and SP+4 are used to store the start and end address for the kernel upload. Both are initialised to 0xFFFF3000 and then changed to the SID 0x34 addresses. Nothing seems to be preloaded into 0xFFFF3000 and it seems likely that this address is intended to hold the balancing amount to get to a checksum of 0x5AA5

So, I'm guessing the mods required to these two files are:

remove the microkernel
remove the logic that shuffles the payload around in RAM
change the kernel base to 0xFFFF3000 (assuming a checksum balance prefix) and RAM jump destination of 0xFFFF3004
the stack can probably stay at 0xFFFF6000 (if I've correctly guessed this bit)

I've had a look at lkr_7055_7058.ld and start_705x.s. I'm not familiar with these types of files. I think I understand what is going on, but editing them is another matter. Hopefully it is a quick exercise to cut out the extra complexity required by Nissan ROMs. Happy to help however I can. Thanks!

fenugrec commented 2 years ago

See the most recent commits. I added a skeleton target for subaru, and did most of the required tweaking. You should be able to pick up from there. I wasn't sure if your very first SID 36 transfer ends up at '3000 or '3004 ? if it's 3000 then you'd need to change the .ld file to get the section to start at '3000, and add 4 padding bytes with e.g. LONG(0) , see https://sourceware.org/binutils/docs/ld/Output-Section-Data.html (in that case, you would need nisprog to overwrite those 4 bytes before sending the payload, of course. It's simpler to do that in nisprog rather than add post-compilation steps to go set those bytes manually)

rimwall commented 2 years ago

Thanks again for the help. I tried compiling but it seems to be complaining about the same label having been already defined in the Nissan kernels (see right down the bottom)...

About the SID calls for the RAM Jump... one approach would be:

SID 0x34 starting at 0xFFFF3004 and length of kernel_length SID 0x36 to transfer the kernel to the above addresses SID 0x34 again with a start at 0xFFFF3000 and length of 4 (ie) 4 bytes and a payload of 0x00 0x00 0x5A 0xA5 SID 0x31 to RamJump. It will test the checksum against the most recent SID 0x34 address range and find a good checksum, and then it is hard-wired in the ROM to jump to 0xFFFF3004 All of the above can be done in nisprog... not sure if this helps simplify the .ld and .s files?

After briefly looking at a few more ROMs, 0xFFFF3000 shows up a lot. 0xFFFF3004 seems to occur around 2005, but disappears sometime before 2009. So, this approach is unlikely to be universal. But any modifications for later vehicles should hopefully be relatively straightforward once there is a working solution.

Scanning dependencies of target ssmk_SH7058 [ 80%] Building C object CMakeFiles/ssmk_SH7058.dir/cmd_parser.c.obj [ 82%] Building C object CMakeFiles/ssmk_SH7058.dir/eep_funcs.c.obj [ 85%] Building C object CMakeFiles/ssmk_SH7058.dir/main.c.obj [ 87%] Building C object CMakeFiles/ssmk_SH7058.dir/crc.c.obj [ 90%] Building ASM object CMakeFiles/ssmk_SH7058.dir/start_705x.s.obj [ 92%] Building ASM object CMakeFiles/ssmk_SH7058.dir/start_ssm.s.obj [ 95%] Building C object CMakeFiles/ssmk_SH7058.dir/platf_7055.c.obj [ 97%] Building C object CMakeFiles/ssmk_SH7058.dir/pl_flash_705x_180nm.c.obj [100%] Linking C executable ssmk_SH7058 CMakeFiles/ssmk_SH7058.dir/start_ssm.s.obj: In function zero_bss': (.rja+0x0): multiple definition ofRAMjump_entry' CMakeFiles/ssmk_SH7058.dir/start_705x.s.obj:(.rja+0x0): first defined here CMakeFiles/ssmk_SH7058.dir/start_705x.s.obj: In function rja_start_adj': (.rja+0x24): undefined reference to_rja_start' collect2.exe: error: ld returned 1 exit status CMakeFiles\ssmk_SH7058.dir\build.make:205: recipe for target 'ssmk_SH7058' failed mingw32-make.exe[2]: [ssmk_SH7058] Error 1 CMakeFiles\Makefile2:220: recipe for target 'CMakeFiles/ssmk_SH7058.dir/all' failed mingw32-make.exe[1]: [CMakeFiles/ssmk_SH7058.dir/all] Error 2 Makefile:89: recipe for target 'all' failed mingw32-make.exe: *** [all] Error 2

fenugrec commented 2 years ago

Thanks again for the help. I tried compiling but it seems to be complaining about the same label having been already defined

Did you modifiy CMakeLists ? It shouldn't be compiling start_705x.s for the subaru target. Also try "make clean" before recompiling.

. It will test the checksum against the most recent SID 0x34 address range

Ah, I had no idea. If that indeed works and you're certain that your last SID34 with 4 bytes doesn't clear the previous areas, then what you outlined could work.

Re 3000 vs 3004 : I assume you don't have control over the RAMjump address with SID31 . So, do I understand correctly that the RAMjump destination varies, but is "always" 3000 or 3004 ? If so, simple solution : modify start_ssm.s to have 2 NOP instructions (total 4 bytes) before actual code, and modify the .ld file to link at 3000. Then, always SID34 to 3000, put your 4 dummy checksum bytes after the kernel; no matter if RAMjump lands on 3000 or 3004, it will work as intended.

rimwall commented 2 years ago

Did you modifiy CMakeLists ?

I did but... silly me... I missed the change to Line 47. Apologies. I now realise what those bits of CmakeLists.txt do! Compiles fine now.

I assume you don't have control over the RAMjump address with SID31 . So, do I understand correctly that the RAMjump destination varies, but is "always" 3000 or 3004 ?

Not sure. In my ROM, the jump address for SID 0x31 is hard-wired in the code at 0xFFFF3004 and searching the whole ROM provides only one location where the value 0xFFFF3004 is stored, and a few locations where 0xFFFF3000 is stored (for checking the landing address of SID 0x34). For other ROMs from around the same time (2005) that was also the case (ie) one instance high up in the ROM of 0xFFFF3004 and a couple of 0xFFFF3000. However, ROMs further away in time had no reference to 0xFFFF3004, but did have references to 0xFFFF3000. Without digging in the detail of those ROMs, it could be that the jump address has changed to 0xFFFF3000, or it could be the assembly has changed to an offset of 4 from 0xFFFF3000. I'll see if I can clarify this point today.

I do know that the "flashing method" has changed over time because other flashing tools have a number of "flashing methods" to chose from that change with the age of the vehicle. I suspect it's not the "flashing method" that is changing, but the SID 0x34, 0x36, 0x31 process. So, eventually, a few different code branches may be required depending on the "flashing method".

I'll see if I can successfully get the .s and .ld files to do what I want...

rimwall commented 2 years ago

I'll see if I can clarify this point today.

Looked at a 2009 ROM with no RE work done on it. The SID 0x31 command jumps to 0xFFFF3000. Other differences too... 32bit checksum still simple addition but required value is 0x5AA5A55A (not backwards compatible with 0x????5AA5) and the data content of a valid SID 0x31 command has changed. Looks like different SID functions in nisprog will be required for different eras of cars. Something to worry about after its working for 2005-era cars...

fenugrec commented 2 years ago

Looks like different SID functions in nisprog will be required for different eras of cars. Something to worry about after its working for 2005-era cars...

Heh, interesting. But 100% agreed; start with getting it to work on one ECU. And, to state the obvious, write down all of these details and exceptions - what works on what, etc. Even if you don't end up implementing everything yourself, at least you leave a good starting point for others to pick up where you left off.

rimwall commented 2 years ago

ok, I think this is correct... .ld file changed so that ORIGIN = 0xFFFF3000 and .s file changed so that there are 2 x nop under RAMjump_entry. Checked kernel in Ghidra and it has 2 x nop at the top, stack at 0xFFFF8FFC, main at 0xFFFF391C, bss at 0xFFFF3EC8 and ebss at 0xFFFF46F8. Seems like its ready to try...

rimwall commented 2 years ago

One problem I think I've picked up... the ECU puts 0xFFFF3004 into r4, and then does a JMP to a two instruction function that firstly puts the standard SP address in r15 and secondly does a JMP @r4. So it looks like 0xFFFF3004 holds the address of the kernel start point (ie) 0xFFFF3004 is not the kernel start point,

So I set ORIGIN to 0xFFFF3008 in lkr_subaru_7058.ld, and loaded ssmk_SH7058.bin into RAM starting at 0xFFFF3008 and put the value of 0xFFFF3008 at 0xFFFF3004. But no joy. I can load the whole kernel, pass the checksum check and get a positive response to the RamJump command (SID 0x31). After that no response.

SID 31 done. ECU now running from RAM ! Disabling periodic keepalive; diag_l1.c:155: _send: len=3 P4=0 l0flags=0x1000; 0x01 0x81 0x82 diag_l1.c:252: _recv request len=1024, timeout=60;diag_l2.c:542: Read/Write timeout. npk_init: startcomm failed : -8 Problem starting kernel; try to disconnect + set speed + connect again.

I wondered whether the following might be the problem:

is SCI1 hard-wired anywhere else in npkern code?
if the kernel address is at 0xFFFF3004, then maybe the stack pointer is supposed to go at 0xFFFF3000, not sure why that would matter
other possibilities...?

fenugrec commented 2 years ago

secondly does a JMP @r4.

From the SH-2E software manual:

JSR @Rm The jump destination is an address specified by the 32-bit data in general register Rm.

Unless I misunderstood you, I think you're on the wrong track. mov r4, ffff3000 ; then jmp @r4 will jump to 3000.

I can load the whole kernel, pass the checksum check and get a positive response to the RamJump command (SID 0x31).

That's excellent. Consider what I suggested earlier (there's even commented-out code for this in wdt_tog() ) about generating a unique pattern on PB15, and connecting a scope or logic analyzer to confirm the kernel is running.

* is SCI1 hard-wired anywhere else in npkern code?

I don't think so off the top of my head; a text search will tell. You can look at the 7050 parts since on that target also they use SCI2.

* if the kernel address is at 0xFFFF3004, then maybe the stack pointer is supposed to go at 0xFFFF3000

Without seeing the pre-ramjump code, or an "official" kernel", hard to say.

rimwall commented 2 years ago

mov r4, ffff3000 ; then jmp @r4 will jump to 3000.

Ah, yes, I was looking for reasons why it wasn't working and got confused by the example in the software manual on p96 of 336. I previously had it with the kernel at 0xFFFF3004 and that wasn't working either.

connecting a scope or logic analyzer to confirm the kernel is running.

Not sure how I'd get hold of one of those. Anyway, I'll call it a day/night and have another crack tomorrow!

fenugrec commented 2 years ago

Not sure how I'd get hold of one of those.

You can get creative... generate an audible-frequency pulse train, hook up a resistor + small speaker (or audio amp / soundcard), and listen for your tone...

Or, modify some of the code to just send out a stream of 0x55 bytes on the K line (without changing speed), and see if any of those make it through.

rimwall commented 2 years ago

I fixed up the jump address, and put the kernel at 0xFFFF3004 with a checksum at the end. That gets a positive response to SID 0x31 (RAM Jump), but then no response. I tried reducing the kernel speed in nisprog and npkern down to 15,625 bps but still no response. After getting no response I tried disconnecting and reconnecting (leaving the ECU on the whole time). I was surprised to find the ECU was back to normal operation (ie) it responded to a connection request and normal SSM commands.

So it seems like program execution is either a) not getting into the kernel (seems unlikely because I can't see how the ROM code in the SID 0x34/0x36/0x31 has a path back) or b) falling out of the kernel somehow...

Maybe this means the watchdog isn't satisfied and forcing a reset...?

fenugrec commented 2 years ago

Maybe this means the watchdog isn't satisfied and forcing a reset...?

That is exactly what I suspect.

rimwall commented 2 years ago

I've tried different MAX_WDTCNT values, but it still appears to be resetting the ECU. Looks like npkern uses one of the ATU timers which presumably starts at zero when the kernel starts. The WDT is already operating when in the SID 0x36/0x34/0x31 part of the ROM, so perhaps I need to edit the kernel to use the TNCT timer attached to the WDT, instead of the ATU timer.

Next thing to try...

fenugrec commented 2 years ago

I've tried different MAX_WDTCNT values, but it still appears to be resetting the ECU.

Are you sure you got the right pin ?

init_wdt() already initializes (and resets) ATU1.TCNTB so there should be no need to change timer peripherals. If you mean the stock ROM uses the mcu's internal WDT, then you may need to either deactivate it, or clear WDT.TCNT regularly in wdt_tog()

And, as I suggested :

You can get creative... generate an audible-frequency pulse train, hook up a resistor + small speaker (or audio amp / soundcard), and listen for your tone... Or, modify some of the code to just send out a stream of 0x55 bytes on the K line (without changing speed), and see if any of those make it through.

rimwall commented 2 years ago

Not making much progress. I haven't changed the kernel to send out a stream of 0x55 bytes because execution doesn't seem to be staying in the kernel. Also I'm not sure how to receive such a stream of 0x55 bytes either...

It's definitely PB15. This is the function that is called each loop of the SID 0x36/0x34/0x31 area of the ROM. Firstly in assembly:

...and the decompiler output plus comments...

It doesn't use interrupts like the npkern code. Rather, it just makes sure it checks the OVF flag often enough that if an overflow has occurred, it resets and starts counting again. In the prior comment, I was wondering whether the switch from WDT.TCNT to ATU1.TCNTB causes a problem. Say, for example TCNT has almost overflowed when the RAMJump happens. ATU1.TCNTB then starts at zero and counts up to WDT_MAXCNT. So the time elapsed between those two PB15 pulses could be up to 2 x WDT_MAXCNT. I tried to see if this is the problem by setting WDT_MAXCNT to 1 ms, and then changing wdt_tog() to be:

void wdt_tog(void) { if ( (WDT.READ.TCSR.BYTE & 0x80) != 0) { WDT.WRITE.TCSR = 0xA53D; *wdt_dr ^= wdt_pin; } return; }

(don't know why github shows that code as one line) ...but it still drops the ECU back to normal operation.

I also tried writing an asm() version of the function by copying the SH commands, but I obviously wasn't getting the syntax right and couldn't get it to compile.

Next step might be to remove all the npkern interrupt handling and simply call the above wdt_tog() every loop of cmd_loop(). Only reason for trying that is if the npkern interrupt handling is somehow picking up the WDT ITI interrupt.

I also discovered that the ECU sets pin F7 right after it sets up the WDT, so I tried turning that pin off to see if it turns off the external watchdog circuit. That didn't work either. Which makes me wonder if the problem is unrelated to the external watchdog... because the ECU doesn't use the WDT function in normal operation, so there must be a way to turn it off.

Running out of ideas...

fenugrec commented 2 years ago

Also I'm not sure how to receive such a stream of 0x55 bytes either...

Well you're compiling nisprog anyway... after RAMjump, just diag_l1_recv() with a big timeout ?

It's definitely PB15

Ok, and you're absolutely sure the proper register address and mask are getting loaded ?

So the time elapsed between those two PB15 pulses could be up to 2 x WDT_MAXCNT.

Ah I see what you mean now. Nissan is fairly lenient about the timing so I never had to give this thought.

What if you call wdt_tog() early in kernel initialization (after init_wdt() ofc); surely the external WDT won't mind two consecutive pulses ?

Still, without actually probing PB15, you're working in the dark. When is it resetting ? does the kernel have time to generate 0, 1, or a few pulses before reset ? etc.

don't know why github shows that code as one line

needs either triple "`" backticks, or paste your code + select block + click "code"

if the npkern interrupt handling is somehow picking up the WDT ITI interrupt.

Unlikely ... in platf*.c, build_ivt() initializes all IVT entries to the same value except 2 stack-pointer locations, and the INT_ATU11_IMI1A interrupt.

Do you know for sure if there's no additional external IC that needs to be kept happy, say on the other SCI pins, by sending a stream of serial data ? I recall reading that somewhere on the RR forums but I forget which ECU it referred to.

Running out of ideas...

Two directions will lead to success in reasonable time. One is getting your hands on an "official" kernel, or a full log / capture of a working reflash.

Otherwise, you're going to have to decide on a way to probe PB15 and other signals. It doesn't take much. A logic analyzer / arduino / RPi / small speaker / soundcard with those janky "soundcard oscilloscope" programs... Any reseller of chinese garbage will have those small cheap logic analyzers for under 10$. An oscilloscope would be more versatile in the long term but maybe a bit more money. You can still get something usable for less than than filling up at the gas station !

rimwall commented 2 years ago

Thanks for the suggestions...

you're absolutely sure the proper register address and mask are getting loaded ?

Yes:

    rjp.wdt_portH = PB.DR.BYTE.H;
    rjp.wdt_portL = PB.DR.BYTE.L;
    rjp.wdt_pin = 0x8000;

What if you call wdt_tog() early in kernel initialization

I had tried that without success

Do you know for sure if there's no additional external IC that needs to be kept happy

I don't think so. I've not seen that mentioned anywhere.

I tried a 'baby step' and jumped into a very simple kernel that checks WDT OVF status and toggles PB15 (if required) exactly like the ROM code, plus an outer infinite loop. It seemed to work (ie) the ECU remains unresponsive until after a power cycle.

So, it appears something to do with the use of the ATU/interrupts in npkern is not keeping the Subaru external dog happy. I checked clock frequency... Subaru uses an oscillator at 10 MHz to give a peripheral clock of 20 MHz and a 4x multiplier to provide an internal clock of 40 MHz. The 20 MHz peripheral clock speed seems to be the same as Nissan based on the comments in the npkern code. So that's not it. Perhaps the interrupt vector gets placed in RAM somewhere Subaru doesn't like... but I couldn't figure out where the kernel is putting the interrupt vectors? Or perhaps it's just that the Subaru external dog is very specific in terms of timing.

I'm thinking my next step is to comment out all the ATU/interrupts in npkern. It looks like this won't affect npkern provided I keep ATU0.TCNT going for the ISO14230 comms? Then I will need to insert regular calls to the duplicated Subaru WDT OVF check / toggle PB15 code.

fenugrec commented 2 years ago

  rjp.wdt_portH = PB.DR.BYTE.H;
  rjp.wdt_portL = PB.DR.BYTE.L;
  rjp.wdt_pin = 0x8000;

Aha ! I really meant "look at the npk disasm and make sure the right FFFFxxxx address is loaded". If you check that code closely I think you have a problem : you're reading the portB DR contents and copying them to the rjp struct. That struct wants the address of PBDR, i.e. &(PB.DR) >> 16 for portH, and &PBDR & 0xFFFF for the portL (give or take a few typos or casts). This roundabout method is needed because that's how Nissan provides that address indirectly.

I tried a 'baby step' and jumped into a very simple kernel

Small steps is always good.

loop. It seemed to work (ie) the ECU remains unresponsive until after a power cycle.

So do I understand you didn't use wdt_tog() as-is for this test ? ( just trying to confirm my theory about PBDR.BYTE.H : )

Continuing, in case I'm wrong :

Perhaps the interrupt vector gets placed in RAM somewhere Subaru doesn't like.

Unlikely - it only needs to be aligned on a 4-byte boundary. npkern is generating the whole IVT in platf_*.c into a plain u32 array : u32 ivt[IVT_ENTRIES];

It looks like this won't affect npkern provided I keep ATU0.TCNT going for the ISO14230 comms?

Correct

Then I will need to insert regular calls to the duplicated Subaru WDT OVF check / toggle PB15 code.

I really really hope that doesn't become necessary. It will complicate some parts of the code.

rimwall commented 2 years ago

Oh wow, what a clown I am! Amazing how sometimes the simplest mistakes are right in front of your eyes. Thanks very much for picking that up! I have changed it to...

    wdt_dr = (u16 *) &PB.DR;
    wdt_pin = 0x8000;

And now.... it works! Successfully connects to npkern and did a full ROM dump at 62,500 bps, which was an exact match with the ROM.

Weirdly I originally tried at 15,625 bps and dm failed after a few seconds. But it works fine at 62,500 bps.

Doing a trial flash is next. I'll have a think about that and review the code before I give it a go. Given it's a SH7058 the current code should work fine, but I want to avoid bricking my bench ECU (I'm not set up for a SH Boot mode recovery). Can you think of any code changes that might be required? I can't.

Seeing as the journey is (hopefully) near completion, it's probably worth some thought as to whether a separate 'subprog' and kernels are created, or the subaru commands are housed within nisprog (which is how I've currently done it)... The only kernel changes so far are WDT_MAXCNT, NPK_SCI, not using the rj_preload struct and the linker and *.s file. I need to tidy up the code here and there so I might as well do it with the final structure in mind. I also need to make sure the various 'stop' and 'exit' commands work.

Thanks for your help... it looks like this might actually lead to something!

fenugrec commented 2 years ago

And now.... it works!

Awesome.

Can you think of any code changes that might be required?

hmm I dont think so. To state the obvious, maybe start with a ROM flash that changes only unimportant bytes in one empty block probably near the end...

separate 'subprog' and kernels are created

Yea I need to think about that more. Definitely a separate "ssmk_SH7058" target like I did in CMakeLists. For nisprog I'm not so sure... I mean I named everything in there with never a thought for other brands. But I guess for a start if it's just a few special commands for Sub it can stay in there. I'd have to look at your changeset, I don't think you've published any code yet ?

kernel changes so far are WDT_MAXCNT, NPK_SCI, not using the rj_preload struct and the linker and *.s file.

Right - I'll most likely want to shuffle some things around once I see everything, but I'm thinking probably add a "mfg_nissan.c" and "mfg_subaru.c" file, for the stuff like rjp struct etc. If your changeset is tidy and clear I don't mind doing that shuffling.

[EDIT] For NPK_SCI, I just pushed a commit that cleaned the remaining hardcoded SCI* instances to instead use NPK_SCI. Now it should only be necessary to define NPK_SCI once in platf.h

rimwall commented 2 years ago

ok, I've cleaned up the code. I'm new to github, so not sure what is the correct thing to do... Do I create a 'fork'? Do I fork the whole of nisprog/npkern? Or only fork the files I changed?

I tried a practice flblock but got an error...

ECU now running from RAM ! Disabling periodic keepalive;
Connected to kernel: SH7058-x-N/A
You may now use kernel-specific commands.
nisprog> setdev 7058
now using 7058.
nisprog> flblock test.bin 15
*** Running in practice mode, flash will not be modified ***
got bad RequestDownload response : 180nm: bad FCCS

Looks like FCCS should have a value of 0x80 (FWE set, FLER and SCO off), but apparently it didn't. Is this because of practice mode? Or is it highlighting some problem?

fenugrec commented 2 years ago

Do I create a 'fork'?

Usually people for the entire repo first, then create a branch, add commits to your branch, and then I can work with that branch (rework as necessary) then merge into the upstream (my repo here).

got bad RequestDownload response : 180nm: bad FCCS

Hm, not sure. You'd have to find a way to report the contents of FCCS - what I've done in the past is copy that reg to a hardcoded RAM location (say,*(volatile u32 *) 0xffff0000 = FCCS or something similar, then read that value with dumpmem.

I think you mentioned earlier your ROM toggles another pin just before entering the kernel ? Is it possible that pin is externally connected to FWE, as a kind of manual flash-disable technique ?

Looks like FCCS should have a value of 0x80 (FWE set, FLER and SCO off), but apparently it didn't. Is this because of practice mode? Or is it highlighting some problem?

rimwall commented 2 years ago

ok, some progress. Figured out that PD8 is the pin linked to FWE. Set PD8 and got past the FCCS error in the practice run.

Then tried a real write, and it seems like the watchdog barked because comms with the kernel stopped and the ECU had reset. Checked the ROM and nothing changed in the ROM.

I'm wondering if there is a large duration event somewhere in the erase/write procedure when interrupts or the vbr aren't operating?

nisprog> flblock test.bin 15 Y
*** FLASH MAY BE MODIFIED ***
*** Last chance : operation will be safely aborted in 3 seconds. ***
*** Press ENTER to MODIFY FLASH ***
Proceeding with flash process.
Entered flashing_enabled (unprotected) mode
Erasing block 15 (0x0E0000-0x0FFFFF)...
diag_l2.c:542: Read/Write timeout.
no ERASE_BLOCK response?

fenugrec commented 2 years ago

ok, some progress. Figured out that PD8 is the pin linked to FWE. Set PD8 and got past the FCCS error in the practice run.

Ah, excellent.

Then tried a real write, and it seems like the watchdog barked because comms with the kernel stopped and the ECU had reset. Checked the ROM and nothing changed in the ROM.

Strange.

Did you change anything to these ?

#define FTDAR_ERASE 0x02
#define FTDAR_WRITE 0x04
#define FL_ERASE_BASE   0xFFFF1000
#define FL_WRITE_BASE   0xFFFF2000

I'm wondering if there is a large duration event somewhere in the erase/write procedure

Erasing a block does take some time but I don't know how that could be causing problems. We've been reflashing 7058's for years with pretty much this same code. Nothing comes to mind right now... maybe try the regular "flrom" (with just the last block modified, as you were doing) ? I forget what codepaths differ between flblock and flrom.

rimwall commented 2 years ago

Did you change anything to these ?

These are unchanged.

Initialization works. Practice works, real does not work. The difference appears to occur after the test of reflash_enabled:

uint32_t platf_flash_eb(unsigned blockno) {
    uint32_t FPFR;

    if (blockno > FL_ERASEBLOCKS) return PFEB_BADBLOCK;
    if (!reflash_enabled) return 0;

    FLASH.FKEY = 0x5A;
    FPFR = fl_erase(blockno);
    if (FPFR) {
        FLASH.FKEY = 0;
        return ((FPFR & 0x06) | PF_FPFR_BASE);
    }
    FLASH.FKEY = 0;
    return 0;
}

fl_erase(blockno) jumps into the on-board erase program per:

static uint32_t (*const fl_erase)(uint32_t FEBS) = (void *) ((u32) FL_ERASE_BASE + 16);

Tried a few things, but same error every time. I haven't tried 'flrom' because that will start by erasing EB0 which will brick the ECU if it fails. And flrom seems to call the same fl_erase function. I did a dump from RAM from 0xFFFF1000 to 0xFFFF4FFF and the erase and write functions are where they are supposed to be, and so is the kernel.

I can't think of why it would work for a Nissan ECU but not a Subaru ECU. The in-built functions must be the same and the kernel is essentially the same aside from the watchdog pin and setting the FWE pin. I am wondering whether the bench power supply is inadequate for flashing. It is a repurposed 110V/240V AC/12V DC power supply which (according to the label) is rated to 2A at 12V. But perhaps it's not up to the task... is there a significant increase in power / current when erasing starts? Maybe I should try hooking the bench ECU up to a car battery...

fenugrec commented 2 years ago

uint32_t platf_flash_eb(unsigned blockno) {
  uint32_t FPFR;

  if (blockno > FL_ERASEBLOCKS) return PFEB_BADBLOCK;
  if (!reflash_enabled) return 0;

  FLASH.FKEY = 0x5A;
  FPFR = fl_erase(blockno);

Can you try modifying the code to exit right after fl_erase(), to see if that one at least returns ?

I haven't tried 'flrom' because that will start by erasing EB0

No, flrom will compare blocks one by one and offer to reflash only changed blocks.

0xFFFF1000 to 0xFFFF4FFF and the erase and write functions are where they are supposed to be, and so is the kernel.

Thanks for checking, that was my next question.

I can't think of why it would work for a Nissan ECU but not a Subaru ECU.

Different thought... have you tried letting the kernel run a bit longer, to see if it stays responsive and stable ?

repurposed 110V/240V AC/12V DC power supply

Hm, easy enough to check... if it sags below 11.5 I would be suspicious maybe.

rimwall commented 2 years ago

Tried a bunch of stuff, but no progress.

Can you try modifying the code to exit right after fl_erase(), to see if that one at least returns ?

It doesn't progress past fl_erase() so I'm pretty sure that's where the problem is. I could try fl_write() without a preceding fl_erase() but the DS says erase first.

flrom will compare blocks one by one and offer to reflash only changed blocks.

Ah, ok. Tried flrom but same result.

Different thought... have you tried letting the kernel run a bit longer, to see if it stays responsive and stable ?

I left it running in the ECU for a few hours and it was still alive when I returned

if it sags below 11.5 I would be suspicious maybe.

The DMM measured 12.2V the whole time. I could try the ECU in the car to see if the issue is somehow related to the other connections, but reading how other folks flash on the bench, I don't think so. I tried with ignition switch on or off and I tried with 'test mode switch' on or off but no difference. Test mode switch is something that Subarus of this age needed connected to get past SID 0x34 to get the kernel loaded.

I also tried turning on/off some other port pins that were set up as outputs, but didn't seem to make any difference.

I'm currently using Ghidra to dig through the inbuilt erase/write functions to see if that can find the problem. Comparing them to ones from a Nissan ECU might be worthwhile to check they are identical. Do you have some Nissan ones you can provide?

One thing I noticed... the Nissan kernel stack is at 0xFFFFBFFC and the Subaru one is currently at 0xFFFF8FFC, so maybe the stack is getting over-written somehow? I can't think why, but I will try with the Subaru stack at 0xFFFFBFFC to see if that makes a difference.

So close, yet so far...

fenugrec commented 2 years ago

Tried a bunch of stuff, but no progress.

On the contrary, you're ruling out possibilities.

Have you checked if the clock freq is weird or FL_FPEFEQ has the right value ?

Comparing them to ones from a Nissan ECU

It should be identical, since that microcode is factory-programmed into the SH7058 by Hitachi/Renesas and cannot be modified. Still, I have a dump here (erase program at FFFF1000 and writing prog at FFFF1800 ) if you still want to compare.

SH7058 microcodes ffff1000.zip

rimwall commented 2 years ago

Been trying all sorts of things and finally worked it out with a combination of Ghidra digging and guesswork. Somehow the ECU pulls the NMI pin low just before starting erasing / flashing, which triggers a 'non-maskable interrupt'. It's not done by the SH7058, so it must occur somewhere else on the PCB. This interrupt was being directed to die_trace() resulting in the kernel dying. I redirected the NMI interrupt to the ATU interrupt for the wdt pin, and now...

nisprog> flblock test.bin 13 Y
*** FLASH MAY BE MODIFIED ***
*** Last chance : operation will be safely aborted in 3 seconds. ***
*** Press ENTER to MODIFY FLASH ***
Proceeding with flash process.
Entered flashing_enabled (unprotected) mode
Erasing block 13 (0x0A0000-0x0BFFFF)...
writing chunk @ 0x0BFF80 ( 99 %,  4275 B/s, ~    0 s remaining)
Write complete.
Reflash complete.
nisprog> dm flashtest.bin 0 1048576
npk dump @ 0x000FFF00,  4920 B/s,    0 s remaining

...it works!

I used the kernel to turn off all the port pins one by one, but none appear to be connected to the NMI pin. But, it seems like there is some way of influencing the voltage on the NMI pin because, in another part of the ROM, a brief period of the NMI being low results in a DMA being set up to access RAM. Not sure if you have any ideas from seeing this sort of thing before?

One other thing I noticed when I was digging through the kernel in Ghidra... set_imask() isn't quite right when passed the previously saved imask. The compiler turns set_imask(imask_save) into...

The bug occurs at 0xffff3d1c when r0 is overwritten. There is another problem at 0xffff3d22 when r0 is (effectively) or-d with itself. The net result is that all the sr bits (except the interrupt bits) are set to 0. So things like the T bit are lost.

I have bypassed this problem by using set_imask(0x07). Not sure how to fix it. I checked the github versions of npkern and they don't have the above problem. So perhaps your compiler is a different version? I think I'm using v13.01 sh-elf-gcc/g++ that came with the Renesas HEW package (although I'm never entirely sure what Cmake does in the background).

Wow, step 1 of the journey over. Any suggestions before I let the folks on RR know?

fenugrec commented 2 years ago

Nice work. That NMI thing was unexpected... Does the pin keep toggling or is it just pulsed once ? You could also make a short empty interrupt handler for NMI, instead of having it trigger a wdt_pin toggle - but maybe it doesn't matter.

brief period of the NMI being low results in a DMA being set up to access RAM. Not sure if you have any ideas from seeing this sort of thing before?

Hm, can't think of anything... I don't think Nissan ROMs ever use DMA at all.

One other thing I noticed when I was digging through the kernel in Ghidra... set_imask() isn't quite right

Hmm, deja vu... I fixed a similar problem in 7d68624d866dc94e100b857da1feab37d5adf7fb , but different versions of gcc behave differently - for example when I compile the kernels (gcc 10.2), the set_imask() function is inlined and looks completely different from what you posted.

What seems to be happening in yours, is at '3d1c:

mov r1,r0
and 0xf0, r0
mov r0, r3

is actually the mask &= 0xf0; statement that got moved right in the middle of the inline asm. I think gcc is "allowed" to do this but apparently changed behaviour over the years...

I think I'm using v13.01 sh-elf-gcc/g++ that came with the Renesas HEW package

HEW should come with "shc", the original hitachi / renesas compiler, and is completely different from gcc. As I recall the KPIT gcc build is very old (probably based on gcc-4.9 or something)

Wow, step 1 of the journey over. Any suggestions before I let the folks on RR know?

Let me check that inline asm thing a bit. Also, can you check how your compiler assembled the set_vbr() calls ?

fenugrec commented 2 years ago

Well it turns out most of the stuff in functions.h and extra_functions.h was still not written properly - it looks fine but under certain conditions will compile to broken code like what you saw (and here, with gcc-10, was also broken but in a different way).

I pushed a tentative fix in https://github.com/fenugrec/npkern/compare/asmfuncs_inline . Everything looks sane when compiled here, but I don't have time to test on real metal for the moment. Let me know how it goes.

rimwall commented 2 years ago

That NMI thing was unexpected... Does the pin keep toggling or is it just pulsed once

I've set up a new ISR handler and a simple RAM counter to find out, I can try later. My guess is that NMI briefly drops low every time extra power is applied to the SH7058 to support erasing / writing a block.

check how your compiler assembled the set_vbr() calls

set_vbr() looked ok to me... I wonder whether functions.h works fine with the original Renesas compiler, but doesn't work with more modern compilers...?

I pushed a tentative fix. Let me know how it goes.

Thanks for that. Unfortunately get_vbr() won't compile using the Renesas HEW compiler...

C:\NisProg\Editing\NPK_Master\functions.h: In function 'get_vbr':
C:\NisProg\Editing\NPK_Master\functions.h:169:6: error: expected '(' before 'inline'
C:\NisProg\Editing\NPK_Master\functions.h:170:3: error: expected identifier or '(' before string constant

Next thing for me to do is adding some comments in USING.txt for Subaru SH7058.

fenugrec commented 2 years ago

Unfortunately get_vbr() won't compile using the Renesas HEW compiler...

No no, like I said HEW's 'shc' compiler is totally different, and incompatible, with gcc. There is zero chance of npkern compiling properly with shc.

The stuff in functions.h is very compiler-specific, but should definitely work with the KPIT gcc package you were using ?

rimwall commented 2 years ago

Hmmm, maybe I was using the wrong terminology. I'm not very familiar with running multiple different compilers for different platforms. This is what I have been using in my npkern\cmake\toolchain-sh.cmake file:

set(CMAKE_C_COMPILER "/Progra~2/Renesas/Hew/Tools/KPIT/GNUSH-ELF/v13.01/sh-elf/bin/sh-elf-gcc.exe" CACHE INTERNAL "")
set(CMAKE_CXX_COMPILER "/Progra~2/Renesas/Hew/Tools/KPIT/GNUSH-ELF/v13.01/sh-elf/bin/sh-elf-g++.exe" CACHE INTERNAL "")

This has always worked up until the get_vbr() issue above. And this is what what I have been using in my nisprog\CmakeLists.txt file...

set( CMAKE_CXX_COMPILER "/MinGW/bin/g++.exe" )
set( CMAKE_C_COMPILER "/MinGW/bin/gcc.exe" )

I had to play around with these files to get cmake to work for me, so I could be doing this incorrectly? Should I be using MinGW/bin/[gcc/++].exe for both?

fenugrec commented 2 years ago

Thanks for clarifying. I thought you had Renesas shc installed, but it looks like you're using the correct one.

I'm surprised you need to manually set CMAKE_C_COMPILER etc... if you just set CMAKE_FIND_ROOT_PATH and possibly tweak the find_path ... HINTS part (in cmake/tooltoolchain-sh.cmake), does it not find the compiler automatically ?

In any case, I did make a typo in my patches, that is now fixed in that same branch asmfuncs_inline .

Could you run "sh-elf-gcc -v" - for reference, I'd like to know what version they used...

rimwall commented 2 years ago

that is now fixed in that same branch

Yep, compiles now, thanks. I'll have to check it in Ghidra later.

Could you run "sh-elf-gcc -v"

Provides: gcc version 4.7-GNUSH_v13.01 (GCC)

I've been trying to use *(volatile u32 *) 0xffff8000 = 1 in main() and *(volatile u32 *) 0xffff8000 += 1 in cmd_dump() (as a test of something I can use to count the number of NMI triggers). It compiles ok, but the value in RAM (extracted using dm) is always reported as 0x00000000. Seems like such a simple thing, but clearly I'm doing something wrong?

I started looking through a SH7055 Subaru ROM today. The good news is that the access technique is almost identical to the SH7058. I have yet to figure out how to tell a 180nm version from a 350nm version just by looking at the ROM.

fenugrec commented 2 years ago

Yep, compiles now, thanks.

Cool. You can also look at disasm with sh-elf-objdump -d ssmk_SH7058 , although it's not as nice.

Provides: gcc version 4.7-GNUSH_v13.01 (GCC)

Thanks. So, a pretty old build.

I've been trying to use *(volatile u32 *) 0xffff8000 = 1 in main() and *(volatile u32 *) 0xffff8000 += 1 in cmd_dump()

in cmd_dump ? it'll only increment when receiving that particular SID request, but that doesn't explain why it's always 0. You'd have to check the disasm...

I have yet to figure out how to tell a 180nm version from a 350nm version just by looking at the ROM.

Similar problem with Nissans. We have a pretty decent db of known ECUIDs that could provide that info, but I've also been considering adding a "size detect" feature that basically checks if there's more than 512k of flash ROM (I believe reading past that boundary on the 7055 just wraps around, i.e. you read the same data as at address 0 ). That will be a separate discussion though.

rimwall commented 2 years ago

Here's the new version of set_imask()...

Looks like it works now.

I'm surprised you need to manually set CMAKE_C_COMPILER

When I installed Cmake, Visual Studio was also installed and I struggled to redirect Cmake to another compiler even after I uninstalled Visual Studio. I never figured out why. I eventually got Cmake to work by setting CMAKE_C_COMPILER. I really know nothing about Cmake so once it was finally working, I haven't wanted to try anything different!

in cmd_dump ?

Yes, I was trying it here first so that I didn't have to do a flash every time to see if my counter was working.

You'd have to check the disasm...

That's what is confusing. The disasm looks fine (to me?), so I'm wondering whether there may be a minor bug in how dumpmem works when it is a RAM location. Perhaps associated with dumping < 32 bytes. I've been looking at the code, but I haven't found anything so far. The command I was using was dm outfile.bin 0xFFFF8000 4

(EDIT: to show correct portion of the assembly). Note that r2 is set to 1 higher up.

rimwall commented 2 years ago

I haven't checked every single thing in the ROM, but for SH7055S Subaru (180nm at a guess), the only differences appear to be:

Allowed RAM area for RAMJump is 0xFFFF6000 to 0xFFFFDFFF
RAMJump destination is 0xFFFF6004
RAM start/end/length are clearly different

At the moment I have 0xFFFF3000 hard-coded in nisprog so I'm wondering what is the best way to turn the above details into variables? Could flashdev_t be expanded to include extra Subaru-only values like RAMJump destination? That would require the setdev command before loading the kernel. Or some other approach?

Aside from residing at 0xFFFF6000, the kernel for 7055S could probably be identical to the 7058 one, as long as it doesn't put the stack too far from the kernel. Is it best to create a whole new kernel file and compilation pathway? Or somehow make the 7058 kernel flexible?

Once that is done, I'll need a volunteer from RR to test it on a 7055. And also find someone to supply a confirmed ROM for a 350nm 7055.

rimwall commented 2 years ago

One step back, two steps forward. Turns out, the mystery interrupt isn't a NMI. I had been fooled into thinking an NMI was the source of the interrupt by a non-clean build of the kernel. It is actually a UBC interrupt. In a Subaru ROM, the UBC is set up to trigger an interrupt (level 15) upon a write cycle to any non-RAM address. Because nisprog sets the interrupt mask at 0x07, it allows the UBC interrupt to occur. The fix is to disable the UBC interrupt by turning on bit0 in UBCR. Writing a block is significantly faster now, because its not handling an interrupt every time it writes to the ROM!

I'm wondering whether there may be a minor bug in how dumpmem works when it is a RAM location

I got this wrong because dumpmem actually diverts to cmd_rmba() for RAM addresses. When I put the counter in cmd_rmba(), it worked fine.

So, all mysteries solved. I think 7058 is basically ready for release, and 7055 (180nm) can be done with a few small tweaks (see earlier post).

fenugrec commented 2 years ago

It is actually a UBC interrupt.

Ah, nice work figuring that out. Indeed UBC is one of the few interrupt sources that isn't be controlled via the IPRx priority regs !

I got this wrong because dumpmem actually diverts to cmd_rmba() for RAM addresses.

Right - I think I implemented it that way mainly to make ROM dumps as fast as possible, by making the dumpmem requests as short as possible, which meant restricting the address space.

So, all mysteries solved. I think 7058 is basically ready for release

Excellent. Let me know when your code is "clean enough", I'll review and merge, then we can open a separate ticket to look at the 7055 situation vs ramjump addresses.

rimwall commented 2 years ago

Excellent. Let me know when your code is "clean enough", I'll review and merge

ok, I've updated the nisprog and npkern files on github. Whilst it has absorbed a lot of time and guesswork, my npkern coding changes are miserably small! Setting the FWE pin and disabling the UBC resides in main.c, but it could equally reside in platf_unprotect() in plat_flash_705x_180nm.c. And there is probably a better way to set up the external WDT pin toggling. It really depends on the best way to set up these files for multiple car brands.

I've also expanded the USING.txt file (for Subaru commands) and added a couple of files documenting how the Subaru ROM works.

np_cli.c still has the 0xFFFF3000 RAMJump address hard-coded in cmd_sprunkernel(). Given that 7055 uses a different address, there is probably a better way to set this up using a variable.

Once you have reviewed and merged, I'll do a fresh download of the code and do a test recompile / execute, and then it will be ready for broader use.

rimwall commented 2 years ago

Let me know if I can help with tidying up the code.

I've had a closer look at other ROMs. The SH7055 ROM (180nm) should be straightforward. I've dug into a 2009 ROM in detail and it looks like sometime between 2005 and 2009, Subaru changed the comms technique. The 2009 ROM uses the SCI interface for the normal SSM commands, and uses the CAN mailboxes for UDS commands like loading the kernel and doing the RAM Jump. Not sure if freediag can do the CAN communication...?

fenugrec commented 2 years ago

Let me know if I can help with tidying up the code.

For the moment, not really. Next time though please work on a separate branch instead of directly on master. You'll see why shortly.

Not sure if freediag can do the CAN communication...?

Unfortunately, no.

fenugrec commented 2 years ago

[x] It looks like you didn't change the external WDT period from the default 2ms. Is this ok ?
[x] is that extra wdt_tog() after platf_init() in main still necessary ? Your comment says "may not be necessary"
[x] Please test the code in this temp branch, https://github.com/fenugrec/npkern/tree/ssmk_test . You should try both the precompiled kernel and the one you will compile yourself, to make sure the build sys is robust

rimwall commented 2 years ago

Excellent, thanks, really nice approach to have a separate mfg.c.

Next time though please work on a separate branch

Sure, no problem

CAN communication... Unfortunately, no.

Had a feeling that would be the case. Is this because different code would be required for the CAN messages (I see the format is very different), or is this because different hardware would be required (I see USB to CAN devices often use various ICs, and CAN uses two lines with different voltages H & L)?

the default 2ms. Is this ok ?

Sorry, my mistake. All my kernels had been using 6.6ms (MAX_WDTCNT = 4125). However, I just tried at 2ms and it still works. Based on the Subaru ROM, it should be 6.6ms but perhaps the external circuit only has a maximum and not a minimum.

extra wdt_tog() after platf_init() in main still necessary ?

All my kernels included it, but I just tried without and it still works. So, not necessary.

Please test the code in this temp branch

Superb! The precompiled kernel works and a fresh compile also works. Because of my set-up, I had to manually set the C_ and CXX_COMPILER variables in toolchain-sh.cmake. One thing I noticed... when loading the kernel, the pre-compiled kernel reported a version number, whilst the one I compiled locally still had 'x' as the version number. I presume that's somehow related to my build environment.

I imagine you already know this, but a nisprog update will be required for the spconn and sprunkernel commands and the various Subaru SID functions that sit behind those commands. I had put this in the master branch in my fork, shall I copy these across to a new branch? In my fork?

fenugrec commented 2 years ago

Had a feeling that would be the case. Is this because different code would be required for the CAN messages

The problem is at multiple levels :

freediag has no L0 support for any hardware that supports CAN (well, except for elm327, but there is no code for that)
there are none of the higher-level layers for iso15765 / iso-TP etc

Sorry, my mistake. All my kernels had been using 6.6ms (MAX_WDTCNT = 4125). However, I just tried at 2ms and it still works.

Ok. Do you think we should still set MAX_WDTCNT for subaru, or leave it at 2ms and hope it always works ?

All my kernels included it, but I just tried without and it still works. So, not necessary.

Perfect. Probably having a wdt period of 2ms helps, the worst case interval on kernel startup would be 6.6 + 2ms ...

reported a version number, whilst the one I compiled locally still had 'x' as the version number. I presume that's somehow related to my build environment.

Hm. In your build dir, there should be an auto-generated "version.h" : #define NPK_COMMIT "137-001af1c"

that is produced by the script cmake/gitversion.cmake . Your source dir does need to be a git repo for this to work however.

I had put this in the master branch in my fork, shall I copy these across to a new branch? In my fork?

Yes please. I'll look at the nisprog situation later

rimwall commented 2 years ago

well, except for elm327, but there is no code for that

I have seen cheapo USB to OBD2 CAN cables based on a elm327. Without having a clue how much work would be required, I wonder whether expanding freediag to elm327 / CAN is a do-able challenge...? Maybe there is some existing open-source code somewhere that can be incorporated into freediag?

Do you think we should still set MAX_WDTCNT for subaru, or leave it at 2ms and hope it always works ?

Off memory, there is a very old forum discussion related to SH Boot mode saying that it should toggle at 125 Hz (8ms), but values between 120 Hz (8.3ms) and 150 Hz (6.7ms) should be ok. Seems like these were derived by trial and error. The ROM toggles at 6.6ms, although it does it without using an interrupt - instead it checks at various points in the code for overflow. Which means it won't toggle at exactly 6.6ms. I guess its possible that my bench ECU has a 'lazy' external watchdog but other generations of ECUs may have tighter tolerances. If it's a simple matter of moving the #define for WDT_MAXCNT to mfg.c then perhaps it is better to have it set to 4125 for Subaru ECUs. It will make it easier to change in the future if required. I can have a go at this if you like?

Your source dir does need to be a git repo for this to work however.

Ah, ok, that's why it remained as version 'x'

Yes please.

Done. The new nisprog branch is called 'ssmk_test'.

fenugrec commented 2 years ago

I have seen cheapo USB to OBD2 CAN cables based on a elm327.

Yes, and those usually have a partially-crippled ELM327 IC clone that barely follows the elm docs anyway. I'd have to be well paid before I spent any time trying to support those.

Maybe there is some existing open-source code somewhere that can be incorporated into freediag?

If you ever find cheap J2534 hardware that supports K+CAN, that might motivate me to look at j2534 support, but it's still a stretch. I find it difficult to justify spending tons of free time on this project.

perhaps it is better to have it set to 4125 for Subaru ECUs. It will make it easier to change in the future if required.

No problem, already done. I just moved stuff around in platf.h

I also merged all this in master . We can open new tickets for other discussions, I think this one is getting quite long enough and can be closed.

fenugrec / npkern

Revised lkr and start files for Subaru SH7058 #7