Closed VK6TT closed 8 months ago
It just gets weirder. I stripped out everything except the words to toggle the pin off and on. My test code was:
: b0
set_pins
slowclk
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' .0 ]M! [ ' .0 ]M! [ ' .0 ]M!
[ ' _TXoff ]M!
0 CLK_DIVR C! _TXoff
;
If I define this in RAM I get the following waveform:
It appears as if there is an extra NOP instruction after every second pulse, but the assembled opcodes show no such thing. I have a nice, repeating and consistent pattern of 1F (off) and 1E (on) bytes, and no extra NOPs ($9D) either:
AA CD 95 A7 CD 96 17 72 1F 50 A 9D 72 1E 50 A 9D __r_Pr_P__
BA 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 r_Pr_Pr_P__r
CA 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F _Pr_Pr_P_r
DA 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 Pr_Pr_P__r_P
EA A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A r_Pr_PrP
FA 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D _r_Pr_Pr_P
10A 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 r_P__r_Pr_Pr
11A 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F _P__r_Pr_Pr_
12A 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 P__r_Pr_Pr_P
13A A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A __r_Pr_PrP
14A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D _r_Pr_Pr_P__
15A 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 r_Pr_Pr_P__r
16A 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F _Pr_Pr_P_r
17A 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1E 50 Pr_Pr_P__r_P
18A A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A r_Pr_PrP
19A 9D 72 1E 50 A 9D 72 1F 50 A 9D CD 84 50 83 50 _r_Pr_P__P_P
1AA C6 CD 82 FC CD 94 6 81 54 1 4 64 75 6D 70 0 ____T_dump
Now the datasheet clearly says that BSET and BRES (the opcodes corresponding to 1E and 1F) both take 1 cycle.
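For anyone reading the dump by hand, here is a minimal sketch of how the second byte of each `72 1n 50 0A` group can be decoded. It assumes the STM8 long-address bit-operation encoding, where n = 2*bit for BSET and n = 2*bit+1 for BRES; the word names `op-bit` and `op-bres?` are made up for illustration and are not part of STM8 eForth. It is written in standard Forth (for example gforth on a PC), not for the target.

```forth
\ Sketch only: decode the second byte of a 72 1n MS LS sequence.
\ Assumption: n = 2*bit for BSET, n = 2*bit+1 for BRES.
hex
: op-bit   ( byte -- bit )   0F and 2/ ;   \ low nibble divided by 2 = bit number
: op-bres? ( byte -- flag )  1 and 0<> ;   \ odd low nibble = BRES (bit cleared)

1E dup op-bit . op-bres? .   \ bit number, then "is it BRES?" flag for 1E
1F dup op-bit . op-bres? .   \ same for 1F
decimal
```

Under that assumption both bytes address bit 7 of $500A, with 1E setting the bit and 1F clearing it, which agrees with the on/off reading of the dump above.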
Maybe I'm just dreaming. Let's try compiling B0 to NVM and try again: Now I've got the leading pulse in each pair staying high for an extra clock cycle.
To recap, executing from RAM results in something using a clock cycle every second low period. But when executed from NVM the same word now uses an extra clock cycle on alternating high periods.
The only thing that was consistent was when I patched the address of B0 into the boot address and bypassed Forth altogether. That gave me the same waveform as when I executed the NVM version of B0. So I'm ruling out any strange overhead issue.
Maybe I should ask ST?
Two things come to my mind:
I agree, Thomas, that the way the pipeline is filled could be the cause of this. The programming manual makes an offhand reference to the actual clock cycles needed for an opcode being longer than stated by pointing out "In some cases, depending on the instruction sequence, the cycle taken could be more than that number." Add to that the fact that the pipeline fills differently for RAM compared to Flash. I also note that "The instruction access from Flash Program memory is 32-bit wide and it is performed from an aligned address i.e. 0xXXX0, 0xXXX4, 0xXXX8, or 0xXXXC."
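To make the alignment point concrete, here is a small sketch that lists where each 5-byte unit (a 4-byte BSET/BRES followed by a 1-byte NOP) starts relative to a 4-byte fetch boundary. The start address $95CC is simply the first address of the NVM dump in this thread; `unit-size`, `fetch-offset` and `.units` are illustrative names, and the code is standard Forth meant for a PC Forth such as gforth, not for the target.

```forth
\ Sketch only: offset of each 5-byte unit within a 32-bit aligned fetch.
\ Assumptions: unit = 4-byte BSET/BRES + 1-byte NOP; start address from the dump.
hex
5 constant unit-size
: fetch-offset ( addr -- n )  3 and ;      \ position inside the aligned 4-byte word
: .units ( addr n -- )
  0 do
    cr dup u.                              \ start address of this unit
    dup fetch-offset .                     \ its offset from the fetch boundary
    unit-size +                            \ next unit starts 5 bytes later
  loop drop ;

95CC 8 .units
decimal
```

Because 5 is one more than 4, the offset steps through 0, 1, 2, 3 and then repeats, so the alignment pattern recurs every four units, that is every two on/off pairs. That is only arithmetic, not proof of the cause, but it is at least compatible with something odd happening on every second pulse.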
So I tried three things today:
None of these fixed the situation. Then I thought, instead of using a NOP, use a Jump instruction to force the pipeline to reload.
: _TXoff [ 0 PC_ODR ]C! [ $20 C, 0 C, ] ; \ Jump relative 0 bytes forward since program counter has already been incremented by 2 bytes
When I built up the long string of assembly op-codes using ]M! this worked a treat. I ended up reverting to the BSET BRES instructions since there was no advantage to writing a byte directly as shown above.
But then it was obvious. Dispense with the ]M! command in the transmitter code and use the inherent JP RET instructions in Forth to force the pipeline to keep refreshing.
This resulted in a consistent pattern being sent, though the high time was still greater than the low time. I couldn't quite get to the bottom of this, but I brute-forced it with a few NOPs and bumped up the CPU speed slightly so that my symbols were all in the desired speed range.
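For illustration, the call-based approach described above might look roughly like the following in STM8 eForth. This is a sketch and not the code actually used here: it assumes the pin is PC7 (read off the `72 1E/1F 50 0A` opcodes in the dumps), that `PC_ODR` is defined, and that the `]B!` compile-time word is available in the build. `TXhi`, `TXlo` and `PULSE` are hypothetical names, and the padding NOPs mentioned above are left out.

```forth
\ Sketch for STM8 eForth, under the assumptions stated above.
: TXhi  [ 1 PC_ODR 7 ]B! ;   \ assumed to compile a bit set on PC_ODR bit 7, then return
: TXlo  [ 0 PC_ODR 7 ]B! ;   \ assumed to compile a bit reset on PC_ODR bit 7, then return
: PULSE TXhi TXlo ;          \ each toggle is reached through a subroutine call and return
```

The point of the design is that every pin change is preceded by a change of program flow, so each toggle starts from a freshly loaded pipeline instead of depending on where it happens to sit in a long run of inline opcodes.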
The inherent clock rate in my bit banging is around 16% of the CPU clock speed. Most of the time no one clocks an STM8 so slowly, but I was on a quest to minimise current with the device I had to hand.
One of my rules of thumb now is that casual bit banging only works up to a clock rate approaching 5% of the CPU clock. Any faster and you need to check what is happening, since it can be quite different from what you expect.
A big learning curve this week for me! Thanks for your help.
I would never have noticed this if I hadn't slowed the CPU clock down to 15.625 kHz and tried to do everything one clock cycle at a time. Trying to save those last few ergs of battery life has cost me hours of frustration!
The symptom, shown on the following picture, is that at times something might be consuming CPU cycles behind the scenes or the clock skips forwards or backwards.
This picture shows the influence of a delay, denoted by the shaded region, in toggling the pin.
A pin is set high or low with:
which compiles correctly as shown in the comments above. The complete byte "pattern" is also correct:
00111 00111 001 01 001 01 001 01 001 01 0
95CC 72 1F 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72
95DC 1E 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72 1F
95EC 50 A 9D 72 1E 50 A 9D 72 1E 50 A 9D 72 1E 50
95FC A 9D 72 1F 50 A 9D 72 1F 50 A 9D 72 1E 50 A
960C 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D
961C 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A 9D 72
962C 1E 50 A 9D 72 1F 50 A 9D 72 1F 50 A 9D 72 1E
963C 50 A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50
964C A 9D 72 1F 50 A 9D 72 1E 50 A 9D 72 1F 50 A
965C 9D 72 1E 50 A 9D 72 1F 50 A 9D 81
So there are no extra opcodes that could explain why the pin toggled one CPU cycle later than it should have.
It's not related to any timer because I have turned all of those off after disabling them. Ditto every other peripheral. My current minimising code for the STM8003 is:
And just to be sure, and after removing any terminal input/output routines at the start of my main word, I effectively bypassed Forth and on reset jump directly to the start of my code. So that rules out background, idle and any other Forth influence. This problem still exists even running the pure assembly code.
The pattern shown above was captured as the first of 5 bytes. I was sending 3 on, 3 off codes on a repeating basis. Fortuitously, I captured an ON, OFF, OFF, OFF, ON sequence. The two ON sequences were identical in all respects, including the problem.
The three OFF bytes were also wrong, but identical. The first start pulse was missing the leading high period. And two of the one bits had an extra high period.
If the CPU had been running faster I would probably not have noticed this, because the code would have changed the pin state and then the additional delay needed to get the pulse length would have masked the problem.
Does anyone have any suggestions, please, on what might be going on?
Kind regards Richard
Full code