Reverse engineer and firmware rewrite

kablek commented 3 years ago

I must admit I was very sad when I received my S42Bs and they didn't work well (display issues, skewed prints, missed steps, ...). At the time, fixes for these problems were not available on this GitHub page (and even today some are not guaranteed to work).

The code is poorly documented, it is messy, and apparently it doesn't work too well. Would anyone be interested in cooperating on, making, or using fresh code and a decent open source firmware?

I have experience with STM but I am not ready to commit to such a project if there is not enough interest in making it happen.

swanepoeljan commented 3 years ago

I bought one of these just to experiment with; I haven't installed it in my printer and don't intend to (for the moment). The idea is more to see if I can use it for a basic robot arm project I am planning in the next weeks/months.

In general it seems to work okay, but there are lots of features missing for what I have in mind, such as reading back the angle through the UART or getting notified when it stalls. So as a starting point I forked the repo and started to add some of these features (on my local clone). While doing this I also completely changed the serial interface by adding TX and RX buffers and changing the package structure (made it smaller), and so far it works. Next I want to start cleaning up the rest of the code and, if I find any issues, try to fix them. So my goals are a bit different, but maybe it could still be beneficial to others using it in their printers.

So far I have not really fixed any bugs, just added custom features but will try to keep updating my repository with the latest changes.

kablek commented 3 years ago

I see, quite nice. But I believe it would be better to do a proper rewrite. With these functions in mind from the start, the code will run much better and be easier to maintain and extend. What do you think?

swanepoeljan commented 3 years ago

A rewrite sounds like fun, but before we go down that road it might be good to identify the problems with the current code. I agree the code structure is messy and all over the place, but from what I have seen so far some of the techniques they used seem to be not so bad. I just want to avoid a rewrite that produces nice clean code but does not solve any of the real issues.

I also had a look at the Mechaduino code and I think they took a lot of the ideas (and code) from there. Maybe it's also a good project to use as a reference.

kablek commented 3 years ago

Indeed, most of the problems seem to be related to the step input code (it seems like the MCU is not reading the input fast enough?). The drivers are also quite loud (PWM frequency? PID that is easy to set from the LCD, maybe even PID auto tune?). There are problems with the OLED display, etc.

Mechaduino code is much nicer looking, there are hardware similarities but I believe it is not the same.

swanepoeljan commented 3 years ago

I wonder how many of the issues are due to bad soldering, as for some people it works well and for others not so much. The unit I received actually works pretty well (on the bench though), never had any OLED issues and seems smooth and quiet. Think I compiled and flashed the firmware from this repository a day or two after receiving the unit so maybe there are some differences between the stock version and the one on here...

Looks like they use timer1 to count the steps and have the STEP input set as the clock source, so it should be very fast. Maybe there are some issues with the way they configured it (e.g. mode1 vs mode2)... Before the STEP pulse even reaches the MCU it first passes through an opto-coupler, so the signal should also be nice and clean... Seeing the inductor, I assume they have a switching regulator; maybe it's a possible noise source... I have a function generator which I could hook up to the step pin to see at which frequency and duty cycle it starts to fail, something I could try in the next days.

As for the PWM, seems that they use timer3 and if my math is correct it runs at 187.5kHz, which should be okay as far as audible noise goes. Maybe the PID is causing some oscillations...

PID auto tune would be nice!

Quas7 commented 3 years ago

PWM at 187.5kHz might explain why the OLED communication gets confused. The default setting is 380kHz (+/- tolerances), which is quite close to the second harmonic of the PWM signal at 375kHz. https://github.com/bigtreetech/BIGTREETECH-S42B-V1.0/issues/16

AbeFM commented 3 years ago

My board also came with the OLED seeming fine, after reflashing it seemed ok, but neither time was it on more than ten minutes.

I didn't see obvious bad solder joints, but I didn't look that hard. I am having issues programming it/STLink stuff.

I would be interested in helping with the code. I'm totally NOT a programmer, but with some support perhaps I could be of use?

Perhaps something just a bit higher level would be good for serial communications - these checksums and converting everything to hex, etc, could be done on the other end, just like Marlin.

I'd like to see PID autotune, I've fought it with car idle and all sorts of weird spots - it may be PID isn't the best approach for something with real delay like belt-driven masses, and some >1 kg prints will weigh the bed up to the point your settings may need to vary with bed mass (getting ahead of ourselves).

Oh - also, you're right: at first, I recompiled the printer firmware to send longer pulses to the board, as mentioned in the docs (Minimum Pulse Width, etc.).

kablek commented 3 years ago

alllll righty, took me some time due to other stuff (pretty much didn't touch the printer for the last few days). I did some testing and researching, also looking at other S42B related repositories and forks. Great job on cleaning out the code and making it somewhat readable! Especially the TrueStep project! I have pretty much reverse engineered the hardware out of the software and some other data I could find, and I have come to some conclusions/hypotheses:

PWM at 187.5kHz doesn't make much sense. That PWM frequency comes from the system clock (48MHz) divided by the value of the auto-reload register (ARR) of the STM MCU (which is set to 256). Thing is, from what I got from other data (I haven't yet traced the parts on the actual PCB, so this still needs confirmation), the PWM outputs are connected to the motor driver's VREF pins, which set the current through the coils, BUT there seems to be a low-pass filter on that connection. Some data from other repositories shows a 1kohm resistor and a 100nF capacitor (which still needs confirmation from actual hardware), meaning a cutoff frequency of about 1.6kHz. An ARR value of 256 also means there are only 256 distinct pulse widths available, which is a limiting factor in micro-stepping selection/accuracy. My proposal would be to raise the ARR value to 1024, which would drop the PWM frequency to 46.875kHz. I believe that would still work fine or even better with the filter, make far less EM noise and also increase micro-stepping performance. Note that changing the PWM frequency on the MCU will NOT change the PWM modulation frequency of the motor driver. That is fixed to approx. 20-40kHz I believe (fixed off time of 25us according to the datasheet).
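As a rough check of the numbers above, here is a minimal sketch in C. The function names are made up for illustration, and the 1 kOhm / 100 nF values are the unconfirmed ones mentioned in the thread. Note that on STM32 timers the counter period is ARR + 1 ticks, so "set to 256" is treated here as a period of 256 counts.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values taken from the discussion; the RC values are unconfirmed. */
#define TIMER_CLOCK_HZ 48000000u  /* system clock feeding the PWM timer */
#define RC_OHMS        1000.0     /* assumed VREF filter resistor */
#define RC_FARADS      100e-9     /* assumed VREF filter capacitor */

/* PWM frequency when one PWM period lasts `period_counts` timer ticks.
 * The same number is also the count of distinct pulse widths available. */
static uint32_t pwm_frequency_hz(uint32_t period_counts)
{
    assert(period_counts > 0);
    return TIMER_CLOCK_HZ / period_counts;
}

/* -3 dB corner of a first-order RC low-pass: f = 1 / (2 * pi * R * C). */
static double rc_cutoff_hz(double ohms, double farads)
{
    const double pi = 3.14159265358979323846;
    assert(ohms > 0.0 && farads > 0.0);
    return 1.0 / (2.0 * pi * ohms * farads);
}
```

With these assumptions, `pwm_frequency_hz(256)` gives 187500 and `pwm_frequency_hz(1024)` gives 46875, while `rc_cutoff_hz(RC_OHMS, RC_FARADS)` comes out just under 1.6 kHz, so both PWM frequencies sit well above the filter corner.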

OLED communication is a great example of bad coding! Hard-coded values everywhere, no control over timing, and bit-banging what could have been done by hardware peripherals. There have been proposals for changing the OLED clock speed which led to mixed results, but I guess most efforts failed. I believe the reason is that the OLED oscillator frequency has nothing to do with data interface input sampling; even the block diagram in the OLED controller's datasheet shows the internal oscillator connected only to the graphics RAM and output drivers. It also would not make much sense with a data clock of a few MHz, since the internal OLED oscillator runs at about 250-550kHz. What I do feel might aid performance (but I need to try it out once again first) is disabling the change of the D/C pin to high at the end of OLED_WR_Byte. After all, every data write in this case makes this pin go instantly high and back to low, which might write a command byte instead of data, making the screen go BRRRRR (OFC, this is just my theory).

What I also found out is that there seem to be multiple versions of the S42B board available, with differences going as far as a different MCU. Those also seem to have different pinouts, MAKING THEM INCOMPATIBLE WITH THIS CODE!!! And that is just another reason for a proper rewrite, producing source code that can run on all versions of the S42B board (maybe even some other closed loop stepper boards in the future).
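To illustrate the D/C idea, here is a minimal sketch in C of a bit-banged write that sets D/C once before a run of bytes and never touches it afterwards. It is not the firmware's OLED_WR_Byte; the pin structure and function names are hypothetical, and the "pins" are plain variables so the behaviour can be inspected on a PC. It assumes the SSD1306 4-wire SPI behaviour: data is shifted in MSB first on rising SCLK edges, and D/C is sampled together with the last bit of each byte (high for data, low for command).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pin model; on the real board these would be GPIO writes. */
typedef struct {
    bool     dc;         /* current level of the D/C line */
    bool     sdin;       /* current level of the data line */
    bool     sclk;       /* current level of the clock line */
    unsigned dc_edges;   /* number of D/C level changes, for inspection */
    uint8_t  last_byte;  /* byte as the display would sample it */
    bool     last_dc;    /* D/C level seen with the last bit of the byte */
} oled_pins_t;

static void set_dc(oled_pins_t *p, bool level)
{
    if (p->dc != level) {
        p->dc = level;
        p->dc_edges++;
    }
}

/* Shift one byte out MSB first: set up SDIN while SCLK is low, then raise SCLK. */
static void oled_shift_byte(oled_pins_t *p, uint8_t value)
{
    for (int bit = 7; bit >= 0; bit--) {
        p->sclk = false;
        p->sdin = ((value >> bit) & 1u) != 0;
        p->sclk = true;  /* rising edge: the display samples SDIN here */
        p->last_byte = (uint8_t)((p->last_byte << 1) | (p->sdin ? 1u : 0u));
        if (bit == 0)
            p->last_dc = p->dc;  /* D/C is latched with the eighth bit */
    }
    p->sclk = false;
}

/* Write a run of bytes of one kind, touching D/C only once up front. */
static void oled_write(oled_pins_t *p, bool is_data, const uint8_t *buf, size_t len)
{
    assert(p != NULL && buf != NULL);
    set_dc(p, is_data);
    for (size_t i = 0; i < len; i++)
        oled_shift_byte(p, buf[i]);
}
```

Because D/C only changes when the kind of transfer changes, a burst of display data produces a single edge on that line instead of one or two per byte.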

Quas7 commented 3 years ago

Somebody with more embedded coding skills and time than me might be able to track down the reason for this kind of rare SCLK glitches as they are likely throwing the OLED off track. https://user-images.githubusercontent.com/1175755/97092636-cf376280-1645-11eb-8711-302a26f9cb0b.png

kablek commented 3 years ago

I am also trying to address the problem with SCLK glitches. There can be MANY reasons: speed limitations (pin or bus), code optimization settings, etc. I suggest also rewriting the OLED code so that the clock is generated by a hardware timer peripheral and is interrupt driven. From what I see in the logic analyzer picture, it somehow looks like an incomplete transfer? Maybe there is a problem with an interrupt-driven function doing something unexpected.

@Quas7 do you have more data about glitches? maybe even in a way to see whole waveform, not just screenshot?

Quas7 commented 3 years ago

@kablek for my setups the OLED corruptions are very rare and I did not manage to capture any glitch that actually resulted in a corruption. Even the SCLK glitches are too rare to get them just by brute force triggering. I might take another shot at this by defining some timing-dependent trigger on SCLK, e.g. triggering on deviations from the normal rising-edge to rising-edge spacing. But I did not find this option for my analyzer yet.

But what do you mean with "whole waveform" - you mean an analog measurement with two channels for SCLK and DATA maybe? Maybe this helps: a longer logic analyzer trace showing the clock-generation timing issue for DATA-low I fixed with a few NOPs

kablek commented 3 years ago

With whole waveform I mean one full transfer before and after the corrupted SCLK. It would be good idea to check if transfer before and after are correct.

I have started rewriting the OLED code to use a timer for clock generation and completely change the way it works. I currently don't have much equipment at home (I am in the middle of moving to another apartment, so yeah), so I have to think stuff through pretty well. I might also try reducing the PWM frequency. It is still far from a full rewrite, but it is a start. If it works I will upload code/binaries for you to test it out.

Quas7 commented 3 years ago

Unfortunately, I did not save all the raw data streams and only snipped a few shots.

Also not sure if we would have to completely rewrite the SSD1306 communication, as there is already the SPI LL module included? https://documentation.help/STM32F0xx/ But I did not go down this rabbit hole to check whether this is possible in parallel to the SPI for the TLE angle sensor, nor whether it is software-SPI capable.

EDIT: Or is the OLED screen hooked up by any chance to hardware-SPI pins? No, it is not, as the hardware SPI2 is not available on this type of package (only on the 64pin package). And the SPI LL driver seems to be bound to hardware. So timer based SCLK it shall be. ;)

If you reduce the PWM frequency we might need to increase the tau of the RC network in hardware to keep the voltage stable at Vref. Otherwise the motion would get unstable. Maybe helpful for rewriting in general: https://github.com/makerbase-mks/MKS-SERVO42B/issues/8#issuecomment-703336861

kablek commented 3 years ago

Unfortunately, no. The OLED is not connected to SPI capable pins... So we are stuck with bit-banging it. But I completely rewrote it, and now I am reviewing my code. If I manage to move my ass downstairs to the printer, I might even test it out soon. I did use the TIM15 timer to generate PWM, which is the clock signal. Let me just see how it goes and I shall report.

kablek commented 3 years ago

Small update and findings: I have roughed in the concept for brand new code and wrote first version of it. Unfortunately it doesn't work (yet) and I currently have very limited set of debug tools. The idea behind my version of OLED code is to use PWM based output on SCLK pin (around 1MHz in current testing version) and trigger interrupt every time next bit has to be written and handle SPI protocol stuff. Interrupts being triggered at 1MHz is a bit sketchy and fast for 48MHz MCU, but I sure can try it, maybe drop speed later on.
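The interrupt-per-bit idea described above could be sketched roughly as below. This is not the code from the fork; the names, the queue size and the way the ISR would apply the returned level to the SDIN pin are all assumptions made for illustration. The main loop queues bytes, and the timer interrupt asks for one bit per SCLK period. At 1 MHz on a 48 MHz core that leaves only 48 clock cycles per interrupt, so the per-bit work has to stay very small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OLED_QUEUE_SIZE 64u  /* assumed size; must be a power of two */

typedef struct {
    uint8_t           queue[OLED_QUEUE_SIZE];
    volatile uint16_t head;       /* advanced by the main loop */
    volatile uint16_t tail;       /* advanced by the interrupt */
    uint8_t           shift;      /* byte currently being shifted out */
    uint8_t           bits_left;  /* bits of `shift` still to send */
} oled_tx_t;

/* Main-loop side: queue one byte; returns false if the queue is full. */
static bool oled_tx_push(oled_tx_t *tx, uint8_t value)
{
    assert(tx != NULL);
    uint16_t next = (uint16_t)((tx->head + 1u) & (OLED_QUEUE_SIZE - 1u));
    if (next == tx->tail)
        return false;
    tx->queue[tx->head] = value;
    tx->head = next;
    return true;
}

/* Interrupt side: call once per SCLK period while the clock is low.
 * Returns the level to drive on SDIN (0 or 1), or -1 when nothing is
 * queued, in which case the caller would stop the clock timer. */
static int oled_tx_next_bit(oled_tx_t *tx)
{
    if (tx->bits_left == 0) {
        if (tx->tail == tx->head)
            return -1;
        tx->shift = tx->queue[tx->tail];
        tx->tail = (uint16_t)((tx->tail + 1u) & (OLED_QUEUE_SIZE - 1u));
        tx->bits_left = 8;
    }
    int bit = (tx->shift & 0x80u) ? 1 : 0;  /* MSB first */
    tx->shift = (uint8_t)(tx->shift << 1);
    tx->bits_left--;
    return bit;
}
```

Since only the main loop writes `head` and only the interrupt writes `tail`, the queue needs no locking on a single-core MCU, and the main loop never blocks while a frame is being sent.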

I might also try creating a system where screen updates only where needed and not whole screen all the time. This could allow me to drop SCLK speed or even make it dynamic, according to actual needs. But first I need to make my version of driver working in baseline version.

As for the PWM frequency, it is a trade-off. I believe the RC filter consists of 1kOhm and 100nF (STILL UNCONFIRMED!!), which puts the filter at ~1.6kHz. I did some calculations of what would happen if we drop the PWM frequency; pretty much the only thing that happens is that the voltage ripple on the VREF pin should increase. By dropping the frequency we increase the ripple, but also the resolution: double the resolution -> double the ripple. The problem with increasing the tau of the filter network is that it increases the time needed for the voltage to reach the desired level. The acceptable level of ripple depends a lot on the desired target current level (motors running on higher current should be less affected by the ripple, since the ripple is a fixed value). There probably is some room for improvement here, but I am still "brainstorming it". I am focusing on OLED performance and the rewrite first, one step at a time. The next step is actually figuring out why the step input pin is so slow (I suspect overdone filtering on the step input pin).
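The ripple trade-off can be estimated with a small helper. This is only a sketch under stated assumptions: a first-order RC filter with the unconfirmed 1 kOhm / 100 nF values, a 3.3 V PWM swing, and the usual approximation that the PWM period is much shorter than R*C, so the capacitor charges almost linearly during the on-time. The function name is made up for illustration.

```c
#include <assert.h>

/* Approximate peak-to-peak ripple of a PWM signal after a first-order RC
 * low-pass: V * D * (1 - D) * T / (R * C).
 * Only meaningful while the PWM period T is much shorter than R * C. */
static double vref_ripple_volts(double supply_v, double duty, double pwm_hz,
                                double ohms, double farads)
{
    assert(duty >= 0.0 && duty <= 1.0);
    assert(pwm_hz > 0.0 && ohms > 0.0 && farads > 0.0);
    double period_s = 1.0 / pwm_hz;
    return supply_v * duty * (1.0 - duty) * period_s / (ohms * farads);
}
```

At 50% duty (the worst case) this gives roughly 44 mV at 187.5 kHz and roughly 176 mV at 46.875 kHz: four times the resolution costs four times the ripple, in line with the "double resolution -> double the ripple" rule of thumb. At the lower frequency the period is already about a fifth of R*C, so the estimate becomes rougher there.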

kablek commented 3 years ago

We are making PROGRESS! In the picture we are running completely new code for the OLED screen, based on interrupts/timer15. The OLED still has its problems from before, but it is much more stable in my case. I still have more testing and improvement to do, but I want to share and discuss some findings: I am mostly working with a 1MHz PWM output as SCLK. It seems our little OLED friend prefers longer setup times for pins (makes it quite a bit more stable), but dropping the clock frequency lower (I tried 500kHz SCLK and it was horrible) has adverse effects?

Using my fresh code + jumper wires pretty much makes OLED stable, this suggests maybe we could benefit also by trying to shield screen from EM stuff? maybe some sheet metal, aluminum foil? maybe some extra decoupling caps?

I have also tried dropping the motor PWM frequency by setting the divider to 4; it didn't have much effect on the OLED or motor movement. I have absolutely no means of looking at the waveform though: my cheap ass Chinese oscilloscope-ish thingy is in my tool bag, which is at a remote location, and I have no logic analyzer anyway. I don't really have money to buy diagnostic tools.

This is the code I have so far, I will open actual repository once I get stuff going even more into the right direction: firmware-OLED_REWRITE_PRELIMINARY.zip

I would really appreciate some feedback, maybe measurements and diagnostics for comparison.

Also worth playing with are SCLK PWM period and pulse width parameters (Autoreload and Comparevalue)

EDIT: Rewritten OLED code can be found in my fork. https://github.com/kablek/BIGTREETECH-S42B-V1.0

swanepoeljan commented 3 years ago

@Quas7

> No, it is not, as the hardware SPI2 is not available on this type of package (only on the 64pin package)

@kablek

> Unfortunately, no. OLED is not connected to SPI capable pins...

Are you guys sure about that? We have the STM32F030C8T6 (48 pin), which I think does have SPI2 available on:
PB12: NSS
PB13: SCK
PB14: MISO
PB15: MOSI

The footnotes say "3. This feature is available on STM32F030x8 devices only." and "5. For STM32F030xC devices only." which to me means that we do in fact have it?

I think what happened here is that BTT made a mistake when routing the pins, and therefore they had to bit-bang it in the end. They have it as:
PB12: CS
PB13: DC
PB14: D1 (SDIN)
PB15: D0 (Clock)

So SPI2 SCK is routed to the display DC and SPI2 MOSI to the display clock and SPI2 MISO to the display SDIN which would obviously not work.

This means that if we use a cable between the header and the display (which seems to help anyway) then we could actually use hardware SPI2 and fix the routing to the pins on the display.

kablek commented 3 years ago

@swanepoeljan you are in fact correct, they did connect it to SPI2 port pins, but they did it very wrong.

I am not sure however, why do cables help? Added inductance/capacitance? Better continuity? Since I rewrote whole oled interface and it does somewhat help, I am not really sure HW spi would make much difference. If two completely different codes show similar issues, I am starting to think it is something with hardware? Oled driver version I created seems to work more stable, but it is interesting how adding jumper cables makes instant difference (as far to make it 100 percent stable with my version, no artefacts even with enabled motors).

Original code is just horrible. Now I have found what I'm pretty sure is overflow in current regulation code.

Also a question I have, once motor moves/gets enabled, you can't enter menu. Is this normal behaviour?

swanepoeljan commented 3 years ago

The cable helping a lot is interesting; Till once mentioned he thinks it could be the additional inductance that helps. But maybe, like you mentioned, we could also add some decoupling caps or some shielding material in between the board and display to see if that helps. Maybe even just a series resistor to dampen the signal slightly...

Maybe you are already aware of it, but one thing that caught me while doing heavy code modification was that we only really have the first 31kB of flash available. The next page is used for parameter storage and the following 32kB is used for the calibration table. As I approached the 31kB limit I often started to get strange behaviour, as the parameters could overwrite program code.
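The layout described above can be written down as constants so that a build or a flashing tool can check the image size against it. This is a sketch only: the macro and function names are made up, and the region sizes come from the description in this thread (31 kB program, one 1 kB page of parameters, 32 kB calibration table on a 64 kB part), not from the project's linker script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, taken from the description in the thread. */
#define FLASH_BASE_ADDR    0x08000000u   /* start of STM32 main flash */
#define FLASH_TOTAL_BYTES  (64u * 1024u)
#define PROGRAM_MAX_BYTES  (31u * 1024u)
#define PARAM_PAGE_BYTES   (1u * 1024u)
#define CALIBRATION_BYTES  (32u * 1024u)
#define PARAM_PAGE_ADDR    (FLASH_BASE_ADDR + PROGRAM_MAX_BYTES)
#define CALIBRATION_ADDR   (PARAM_PAGE_ADDR + PARAM_PAGE_BYTES)

/* The three regions should exactly fill the device. */
_Static_assert(PROGRAM_MAX_BYTES + PARAM_PAGE_BYTES + CALIBRATION_BYTES
                   == FLASH_TOTAL_BYTES,
               "flash regions must add up to the device size");

/* True if an image of this size stays below the parameter page. */
static bool image_fits(uint32_t image_bytes)
{
    return image_bytes <= PROGRAM_MAX_BYTES;
}

/* Bytes left before the image would reach the parameter page. */
static uint32_t image_headroom(uint32_t image_bytes)
{
    assert(image_fits(image_bytes));  /* larger images overlap the parameters */
    return PROGRAM_MAX_BYTES - image_bytes;
}
```

With these assumptions the parameter page starts at 0x08007C00 and the calibration table at 0x08008000. Limiting the program region to 31 kB in the linker script would turn an oversized image into a link error instead of strange behaviour after flashing.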

> Also a question I have, once motor moves/gets enabled, you can't enter menu. Is this normal behaviour?

That's strange, I can definitely enter the menu while it's running.

kablek commented 3 years ago

Is that actually the case with code overwriting parameters? That seems like really bad programming then....

I will do some testing, trying to get oled stable without cable from hardware standpoint. I did as much as I could, you can also take a look at my version of OLED code, it is compatible with your fork of the code (TrueStep), just remove initialization stuff from main.c file and replace oled.c/.h.

Edit: Well, did some abuse on OLED module, no luck. I will focus on rewriting other code... MAYBE rewriting everything else could stabilize OLED?

swanepoeljan commented 3 years ago

> Is that actually the case with code overwriting parameters? That seems like really bad programming then...

Jip, the first sign is normally that after flashing it wants to run the calibration again, as the calibration flag gets overwritten by program code. Personally I would have used the first or last flash memory page for parameter storage to prevent these kinds of problems.

> Edit: Well, did some abuse on OLED module, no luck. I will focus on rewriting other code... MAYBE rewriting everything else could stabilize OLED?

Was also thinking that maybe something else is causing it, like maybe an interrupt at some critical stage. The strange thing is that some other guys on here have drivers which do not even last 10 seconds and others that run fine. So most likely we can only do so much from a software perspective and the rest would need to be hardware fixes/hacks.

"Unfortunately" I am one of the lucky guys and my driver works perfect (with any firmware). Have never had the OLED issue so it's hard for me to debug this issue.

kablek commented 3 years ago

Well... In that case it really calls for a complete rewrite... I am probably going to modify the OLED code to also support HW SPI, with the assumption that people use wires to fix the pinout. I believe, though, that we first need to rewrite other things. There is just too much wrong with the original code, and it is really badly written. There are some good concepts in the code, but the execution is terrible.

I believe it would be in our best interest to join "our forces" and start rewriting as much as we can.

Another thing I found is that in the function "Output", which sets the PWM outputs, there is potential for effort values above 32 to overflow? I have proposed a solution but I haven't tested anything yet. Also, I don't understand the use of float for the phase multiplier; wouldn't it be better to multiply by 25 and then divide by 2 for a 12.5 phase multiplier? I still have to test said fixes.

But I suggest scrapping most of the original code. I think it is not really worth fixing in the shape it is in.

EDIT: Expect a new version of the OLED code today (in approx. 5-6 hours from now) in my fork of the repo, compatible with the TrueStep fork. I have committed the changes but forgot to push them.

swanepoeljan commented 3 years ago

> I believe it would be in our best interest to join "our forces" and start rewriting as much as we can.

Sounds good, I think we have some good people on here, each with their own set of skills making very positive contributions (in both software and hardware). Like you said, would be nice at some point to combine all this into a brand new project. Maybe even with support for different boards using a Configuration header file (like Marlin for example) to enable/disable features you want, depending on your hardware.

> Another thing I found is that in the function "Output", which sets the PWM outputs, there is potential for effort values above 32 to overflow?

You mean due to the order in which it is evaluated? It would be better to write:

    v_coil_A = effort * (sin_coil_A / 1024);

instead of:

    v_coil_A = effort * sin_coil_A / 1024;

The interesting thing is that some time ago I did measure the voltage on one of the VRef pins with effort value of 80 which gave an output very close to 1V which suggests that it works. If I have to take a guess, I think due to the 1024 constant all variables are promoted to 32bit integers which makes it work. Or do you see some other issue there?

> Also, I don't understand the use of float for the phase multiplier; wouldn't it be better to multiply by 25 and then divide by 2 for a 12.5 phase multiplier?

This one is also unclear to me, why it's even there in the first place...

> EDIT: Expect a new version of the OLED code today (in approx. 5-6 hours from now) in my fork of the repo, compatible with the TrueStep fork. I have committed the changes but forgot to push them.

Nice, thanks! Will give it a try.

kablek commented 3 years ago

Well, I don't know what exactly is going on there. Promotion of the variables to 32bit might be the case, but it doesn't make much sense to me with all the others being 16bit integer values. As for the order of evaluation: v_coil_A = effort * (sin_coil_A / 1024); is a bad idea, because the sin value being divided first would result in it being 1 or 0, depending on the rounding operation. But this makes me wonder, what did you measure VRef with? Oscilloscope or meter? Was it an actual sine wave representation of the motor phase or was it a square wave of 1V? If it is a square wave of 1V, then it might be the case that the program is actually doing the division first and then the multiplication (compiler optimizations work in mysterious ways). We do want multiplication first, but we don't want the number to overflow. I will see if there is an option to look at the code disassembly and compare it to the original source.

EDIT: According to the internet, in the case of v_coil_A = effort * sin_coil_A / 1024; operations should be performed left to right. Effort times sin should be calculated first, which is desired and should produce a correct sine waveform (not square like division first would produce). However, when the peak value of sin is 1024, everything above 32 written in effort should produce an overflow if the calculation is done in 16bit integers. It might be the case that for some reason the calculation gets cast to a 32bit integer, but it doesn't make much sense to me. I believe literals have a variable bit length depending on the calculation they are used in, so all the values in the calculation should be handled as 16bit. I might be wrong and I am doing research on the subject.
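For reference, C's integer promotion rules can be tried out on a PC. In C, operands narrower than int are converted to int before arithmetic, and an unsuffixed constant such as 1024 has type int. On a target where int is 32 bits (which is the case for GCC on Cortex-M0), a 16-bit effort times a 16-bit sine value is therefore computed in 32 bits. The sketch below assumes the variable types (uint16_t effort, int16_t sine value); the real types in the firmware may differ, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply first, in C's promoted type. With a 32-bit int, both 16-bit
 * operands are converted to int before the multiplication happens. */
static int32_t scale_promoted(uint16_t effort, int16_t sin_value)
{
    assert(sin_value >= -1024 && sin_value <= 1024);  /* assumed table range */
    return effort * sin_value / 1024;
}

/* What the same expression would give if the product were cut to 16 bits. */
static int32_t scale_truncated(uint16_t effort, int16_t sin_value)
{
    uint16_t product = (uint16_t)(effort * sin_value);
    return product / 1024;
}

/* Dividing first throws away the sine shape: the quotient is only -1, 0 or 1. */
static int32_t scale_divide_first(uint16_t effort, int16_t sin_value)
{
    return effort * (sin_value / 1024);
}
```

With effort = 80 at the sine peak of 1024, `scale_promoted` returns 80, while `scale_truncated` returns 16 because 81920 does not fit in 16 bits. At half amplitude (512), `scale_divide_first` returns 0, which is the square-wave behaviour discussed above. A VRef reading that matches the expected level for effort 80 is consistent with the promoted calculation.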

About the phase multiplier I am also pretty clueless. It probably has to do with different formats of the angle being written? But with the code being written the ugly way it is, this will be hard to figure out. It should definitely be faster to multiply and divide with whole numbers than to multiply with floats.
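The integer version of the 12.5 multiplier could look like the sketch below. The function name is hypothetical and this is not taken from the firmware. The Cortex-M0 core in the STM32F030 has no floating-point unit, so float multiplication is done in software, which is why integer arithmetic is attractive here. Note that the integer division truncates toward zero, so odd inputs lose the half.

```c
#include <assert.h>
#include <stdint.h>

/* 12.5 * angle using integers only: multiply by 25, then halve. */
static int32_t phase_scale(int32_t angle)
{
    /* keep angle * 25 inside the 32-bit range */
    assert(angle > INT32_MIN / 25 && angle < INT32_MAX / 25);
    return (angle * 25) / 2;
}
```

For example, `phase_scale(64)` gives 800 and `phase_scale(1024)` gives 12800, while `phase_scale(3)` gives 37 rather than 37.5.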

Another note to my OLED code: the fact it is interrupt driven, making it non blocking makes key press detection much faster, double clicks are detected since there is no debounce/delay.

swanepoeljan commented 3 years ago

Ah ja true, I didn't think that one through properly, we definitely want to start from the left.

For the measurement I just used my meter but can redo it with my scope to get a better picture.

> Another note to my OLED code: the fact it is interrupt driven, making it non blocking makes key press detection much faster, double clicks are detected since there is no debounce/delay.

Had a similar thing when I rewrote the OLED menu, they use delays in the original code to do the debouncing which I didn't like and it slowed the main loop down a lot. In the end I used the SysTick timer to scan for key presses at a lower rate which works okay.
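A tick-based debounce of the kind described here can be sketched as follows. This is not the TrueStep implementation; the names and the threshold of 5 samples are assumptions. The function is meant to be called at a fixed rate, for example from a 1 ms SysTick handler, and never waits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEBOUNCE_TICKS 5u  /* assumed: samples the new level must persist */

typedef struct {
    bool    stable;  /* debounced state: true while pressed */
    uint8_t count;   /* consecutive samples that disagree with `stable` */
} button_t;

/* Call at a fixed rate with the raw pin state.
 * Returns true exactly once per debounced press. */
static bool button_sample(button_t *b, bool raw_pressed)
{
    assert(b != NULL);
    if (raw_pressed == b->stable) {
        b->count = 0;  /* any bounce back restarts the count */
        return false;
    }
    if (++b->count < DEBOUNCE_TICKS)
        return false;
    b->stable = raw_pressed;
    b->count = 0;
    return b->stable;  /* report only the released-to-pressed change */
}
```

Because the work per call is a couple of comparisons, the main loop is no longer slowed down by delay-based debouncing.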

kablek commented 3 years ago

Alright, I finally did push new version of OLED code to my fork. I cleaned it up quite a bit and did some planning for further changes to OLED driver code. I believe code should work fine for now and should be stable enough for use. In the future I will add support for hardware SPI over fixed pinout, but that is currently not top priority.

Button handling could also be done by interrupts, I believe, but that does come at a price of potentially slowing things down due to interrupt calls. Buttons are also not top priority IMO, since their implementation is not really difficult. I believe the area to focus on first is getting all important interfaces working correctly, if possible without anything happening in the main loop? I will probably look at the code for the magnetic encoder and bridge control (TIM3 PWM) next.

I would be very happy to see a comparison between the original code's waveform and my new version's waveform, so if you are willing to test it out, please do.

EDIT: There is definitely A LOT more to magnetic encoder than is in the code. That section will definitely benefit a rewrite. I am also starting to believe C++ is the way to go since just C might become limiting.

kablek commented 3 years ago

I believe I figured out where the 12.5 phase multiplier might come from:
1 full rotation of the motor is 360°
1 full step is 1.8°
360 / 1.8 = 200 -> represents the number of steps in one full rotation

1/32 step mode sets stepangle variable to value of 2.

Then we have this piece of code:

    s = LL_TIM_GetCounter(TIM1);
    if(s - s_1 < -32768)
      s_sum += stepangle * 65536;
    else if(s - s_1 > 32768)
      s_sum -= stepangle * 65536;

    r = s_sum + stepangle * s;
    s_1 = s;

    if(r == r_1)
    {
      hccount++;
      if(hccount>=1000)   //Delay to decrease current when stationary (after 1000 cycles idle it drops current to half)
        hccount=1000;
    }
    else
      hccount=0;

    if(hccount >= 1000)
      Output(r,UMAXOP/2);
    else
      Output(r,UMAXOP);
    r_1 = r;

Now, hccount is simply for current reduction after some time idle and has nothing to do with the angle.
r is what is passed forward to the Output function, to be then multiplied by 12.5.
r_1 is simply the previous value of the output angle; it is used for delay management together with hccount, so it doesn't do anything with the angle calculation.
s is the data from TIM1, the counter that is counting step pulses. It seems like the value saved in the TIM1 register is not getting cleared, so it acts like an absolute counter of steps in some direction.
s_1 is just the previous value of s.
s_sum is, I believe, used for some averaging function? Maybe overflow detection? I have no clue, but we can assume it is zero before the code increases/decreases it. That means we can put it away.

that would simplify code to:

    s = LL_TIM_GetCounter(TIM1);
    r = stepangle * s;
    Output(r,UMAXOP);

Now if on the TIM1 pin we need 32 pulses (which later get multiplied by 2) for 1 step and 200 steps for one full rotation: 64 * 200 = 12800 -> number of 1/64 usteps in one full rotation. This number divided by 12.5 results in 1024, which is very suspiciously 1024, which is quite a nice number considering 1024 * 4 = 4096, which is the number of elements in the sine table? Now this last bit kind of confuses me, but I guess the reason is for steps from the step input to be scaled to the theta table.
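One possible reading of the s_sum lines in the original snippet is that they extend the 16-bit TIM1 count beyond its range: when the reading jumps by more than half the range between two loops, the counter has wrapped, and 65536 steps are added or subtracted. Under that reading, s_sum would be needed as soon as the count passes 65535 or goes below zero. The sketch below shows the same technique in a compact form; it is not the firmware code, and the names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t last;   /* previous raw counter reading */
    int32_t  total;  /* accumulated signed count */
} step_counter_t;

/* Fold a new 16-bit counter reading into a 32-bit position.
 * Correct as long as fewer than 32768 counts pass between two calls. */
static int32_t step_counter_update(step_counter_t *c, uint16_t raw)
{
    assert(c != NULL);
    int32_t delta = (int32_t)(uint16_t)(raw - c->last);  /* 0..65535 */
    if (delta >= 32768)
        delta -= 65536;  /* the counter moved backwards, or wrapped downwards */
    c->last = raw;
    c->total += delta;
    return c->total;
}
```

The angle passed on would then be `stepangle * total`. For example, starting from a raw reading of 65530, a new reading of 4 is seen as 10 steps forward rather than 65526 steps back.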

swanepoeljan commented 3 years ago

Could be that you are on to something here...

Last time I tried to figure it out I was looking at the OneStep() function. This one is used during the calibration and should move 1.8° per step. Here they used:

Output(81.92f*stepnumber,80);

When applying the 12.5 multiplier we get 12.5 * 81.92 = 1024, which kind of makes sense since for full steps we would read the sine table in 90° (1024) intervals. What throws me off though is, if that is the case, why when we use 1/2 steps (stepangle = 32) and make 2 steps do we pass 64 to the same Output() function and also expect to move 1.8°?

Quas7 commented 3 years ago

> Well... In that case it really calls for a complete rewrite... I am probably going to modify the OLED code to also support HW SPI, with the assumption that people use wires to fix the pinout. I believe, though, that we first need to rewrite other things. There is just too much wrong with the original code, and it is really badly written. There are some good concepts in the code, but the execution is terrible.
>
> I believe it would be in our best interest to join "our forces" and start rewriting as much as we can.
>
> Another thing I found is that in the function "Output", which sets the PWM outputs, there is potential for effort values above 32 to overflow? I have proposed a solution but I haven't tested anything yet. Also, I don't understand the use of float for the phase multiplier; wouldn't it be better to multiply by 25 and then divide by 2 for a 12.5 phase multiplier? I still have to test said fixes.
>
> But I suggest scrapping most of the original code. I think it is not really worth fixing in the shape it is in.
>
> EDIT: Expect a new version of the OLED code today (in approx. 5-6 hours from now) in my fork of the repo, compatible with the TrueStep fork. I have committed the changes but forgot to push them.

For the OLED, it is maybe more robust to switch to a bit-banged I2C implementation instead of the 2-wire SPI. The bandwidth is not required for any animation/graphics. And the register shifting used in SPI might be the root cause, as a too-fast clock edge cycle (faster than in my captured glitch) probably kicks the display out of sync.

kablek commented 3 years ago

Well it might be worth a try with I2C, that does require slight hardware modification of OLED board. However, I think it would be even worse with I2C, since it is much more subject to glitches from my experience. It might be worth a try to lower GPIO speed setting a bit, since that alters rise speed of edge. I am more and more starting to believe there is something else going on with whole board/MCU that is causing glitches. I will try what happens if I load fresh code just for OLED, with everything else being turned off at some point.

swanepoeljan commented 3 years ago

But this makes me wonder, what did you measure VRef with? Oscilloscope or meter? Was it an actual sine wave representation of the motor phase, or was it a square wave of 1V?

Quickly tried again but this time with the scope. Placed the driver in open loop mode and measured both VRef pins:

So at least they look like sine waves and the level also seems good.

Output(r,UMAXOP);

UMAXOP is defined as 160, which translates to roughly (3.3V / 256) * 160 = 2.0625V.
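The same arithmetic as a small helper, assuming an 8-bit PWM (256 steps) filtered down to a DC level from a 3.3V supply. `pwm_to_millivolts` and the two constants are illustrative names, not firmware identifiers.

```c
#include <stdint.h>

#define VDD_MILLIVOLTS 3300u  /* assumed supply voltage of the PWM output */
#define PWM_STEPS      256u   /* assumed 8-bit PWM resolution */

/* Average voltage of a filtered PWM signal, in millivolts (truncated). */
static inline uint32_t pwm_to_millivolts(uint8_t duty)
{
    return (VDD_MILLIVOLTS * duty) / PWM_STEPS;
}
```

With the UMAXOP value of 160 this returns 2062, which is the 2062.5 mV result truncated by the integer division.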

kablek commented 3 years ago

That is some great news! It does look like a reasonably clean waveform too! I wonder what effect dropping the PWM frequency has. Does anyone have any info on how well the V2 version of the S42B works? Did they fix the bugs? They did use a better MCU and added the CAN protocol. The firmware seems to be in a different repo, and just... different.

Should I open new repository where we start writing fresh code? if @Quas7 and @swanepoeljan are interested in joining forces, we could have better fresh code quite fast.

Quas7 commented 3 years ago

I am not sure, if we should put too much effort in v1 boards before we know what v2 brings. In my experience, old revisions will just vanish from the supply chain. And there are no layout/gerber files provided so far to work on the hardware level on v1 either.

I would buy a v2 as soon as I see it somewhere but currently I only find v1.1 (pictures still show STM32F030 and same ICs as v1). @kablek Can you link the v2 repo or more infos here? Did not find this either.

I would rate myself more as a hardware/sensor guy and far less capable on the software side. Not sure, if you would rate my code as good because I "fix" things most of the time but do not write from scratch. What I can provide to our pool would be testing/debugging and a few diagnostics and validation setups.

I also think that Jan's TrueStep fork is already providing some very interesting features and he has already done some rewriting. Not sure if he would like to get support in this early phase though, as this would bind quite some time for alignment in the beginning.

For the product itself, I see the usability and compatibility of the magnetic encoder approach in such a highly magnetic noise environment as a general issue. Providing more diagnostic possibilities for the average hacker/maker would give this device more traction and applications. I mean self-diagnostics, a calibration check like the MKS Servo42, in-system PID autotuning, etc.

On the other hand, for a bit more money (like 2x) you already get the Nema17 industrial grade optically-closed-loop motors (but closed source...).

EDIT: https://github.com/bigtreetech/BIGTREETECH-S42B-V1.0/issues/16#issuecomment-725560325 I think, I found a clue why some boards are much worse than others. The buck converter has at least a much smaller inductance for one of the bad performing boards (both seem to have v1.0 stated - needs confirmation).

swanepoeljan commented 3 years ago

I would buy a v2 as soon as I see it somewhere but currently I only find v1.1

Didn't even know there was a v1.1 out. Interestingly they did move the switching regulator away and looks like they added an additional AMS1117 3.3V regulator, maybe as a post regulator to clean the supply up a bit? I guess it works with the same firmware as v1.

CAN bus support is really something I would like to have, maybe it's not so popular for printers yet but for other robotics and industrial uses it really makes sense. I have some experience with CAN (UAVCAN) on drones and even build some modules in the past. On the side I have actually also started drafting a board based on the S42B but with the beefier STM32F303 (has DAC so no more PWM on VRef) and CAN driver hardware. Also has an FPU which could come in handy... Will let you know if it works out.

Should I open new repository where we start writing fresh code? if @Quas7 and @swanepoeljan are interested in joining forces, we could have better fresh code quite fast.

Sounds good to me. I think if we can keep it highly configurable so it's easy to port to other boards it would be great. Like @kablek also mentioned before, maybe this time around it makes more sense to use C++. It would just allow faster and cleaner development and I don't think there would really be a performance hit.

I also think that Jan's TrueStep fork is already providing some very interesting features and he has already done some rewriting.

We could also port some of these features to the new project, maybe it helps to get a first version up and running quicker.

On the other hand, for a bit more money (like 2x) you already get the Nema17 industrial grade optically-closed-loop motors

Sounds interesting, do you have more info on these?

Quas7 commented 3 years ago

@swanepoeljan I think, I was mistaken on the optical feedback nema17 motors as I mixed this with the original nanostepper price tag (~40€). But this one is optical hybrid for less than 100€ https://www.upload.sorotec.de/doku/manuals/DS_iHSS42en_190426_soro.pdf

The Trinamic products are likely very sound magnetic solution as they are the industrial/medical solutions: https://www.trinamic.com/products/drives/details/pd42-x-1140/ Of course, higher end, higher price.

But much cheaper you get one of these: https://cdn-reichelt.de/documents/datenblatt/X200/ACT_17HS44172802_DB-EN.pdf Motor included for around 30€ so not too far away the S42B kit.

As soon as you add CAN to the requirements you are in another league. ;)

kablek commented 3 years ago

I am not sure, if we should put too much effort in v1 boards before we know what v2 brings.

I would buy a v2 as soon as I see it somewhere but currently I only find v1.1 (pictures still show STM32F030 and same ICs as v1). @kablek Can you link the v2 repo or more infos here? Did not find this either.

I present to you the BIQU online shop link to V2 boards: https://www.biqu.equipment/products/bigtreetech-s42b-v1-0-closed-loop-driver-control-board-42-stepper-motor-oled-3d-printer-parts

As soon as you add CAN to the requirements you are in another league. ;)

Wellllll, as you can see, V2.0 does support the CAN protocol! Making universal firmware and giving the drivers CAN control opens up whole new options! And for boards that don't support CAN, we can implement some UART/SPI/I2C communication with the host system.

And I even more proudly present to you this repository: https://github.com/bigtreetech/BIGTREETECH-Stepper-Motor-Driver There you'll find some interesting folders, namely S42A, which credits code and schematics to https://github.com/jcchurch13/Mechaduino-Firmware and https://github.com/Misfittech/nano_stepper (according to the readme, the S42 series is inspired by those two projects and shares a lot of parts with them).

Next interesting folder is S42B, Which looks very similar to this repository we are talking in. but, there are sub folders V1.0 and V2.0 -> Those are actual different versions we are talking about. Sub folder firmware also contains separate sub sub folders with same names and purpose of different board versions.

Now the last interesting folder is S57B. It doesn't contain firmware, but it does contain info on what I believe, from the pin-map prints, is a scaled-up version of the S42B for the Nema57 stepper motor format -> bigger, more suitable for CNC. This, with properly written firmware and communication to the printer main board over selectable CAN/UART/SPI/I2C, does sound tempting..

EDIT:

https://github.com/bigtreetech/BIGTREETECH-S42B-V1.0/issues/16#issuecomment-725560325

I think, I found a clue why some boards are much worse than others. The buck converter has at least a much smaller inductance for one of the bad performing boards (both seem to have v1.0 stated - needs confirmation).

That is very interesting! Is there a potential hardware fix for the bad boards? Can you do an oscilloscope comparison of measurements on the power rail, with AC coupling and high gain, between good and bad boards, to compare the voltage ripple on the output of the switching converter? I see some potential in this info.

Didn't even know there was a v1.1 out. Interestingly they did move the switching regulator away and looks like they added an additional AMS1117 3.3V regulator, maybe as a post regulator to clean the supply up a bit? I guess it works with the same firmware as v1.

Maybe? I would love to see BTT release their motor driver boards schematic. This would end big heap of guessing game and would really aid development of amazing community made firmware for their boards. I understand releasing schematic could result in other factories to copy their design. But, I don't believe they are worth cloning in this state of operation. Creating something more out of this firmware could make those boards desirable to clone ->but also to sell more original boards. So, just an idea to @bigtreetech to release S42 series schematics, maybe even board prints.

Quas7 commented 3 years ago

Ok, so I would conclude that BTT sees quite market potential in going forward with this product family. Maybe, they invest also in a senior programming team to get something more solid than the rush to market code of S42B. At least I expect they can also provide decent code as they also commit to Marlin code.

Regarding the different inductances I checked the two boards I have currently available here and both have 220uH and the plots for that I posted in the OLED thread. I can check a third board maybe this weekend at another place but so far I measured nothing out of the ordinary on the supply rails with 220uH. But, with 6.8uH and core saturation this can end in issues or it just runs on a higher operating frequency that also might impact the OLED again.

I still had no luck finding the datasheet covering the step down converter "BN0R" or "BN06" (8 pin).

They rearranged the layout quite a bit. STM32 close to the OLED pins as well.

kablek commented 3 years ago

I believe 06 and 0R are just date codes? What I find weird is that the IC is so close to the OLED. Is it just powering the oled and nothing else? Why is it not placed closer to actual power input? I tried to find out what kind of switching regulator it is exactly, no luck. It seems they did move the voltage regulator closer to the input. We are finally getting on to something!

swanepoeljan commented 3 years ago

I present to you the BIQU online shop link to V2 boards:

Awesome! I just placed an order for one! Hopefully it arrives in the next 2 to 3 weeks. Also briefly looked at the V2 firmware, maybe slightly better structured than the previous version but still feels very experimental. At least they have some basic CAN code in there also so seems like the hardware was tested.

Looks like this time they used the "cmsis" framework where previously it was "stm32cube". This would also be something we would need to decide on when creating new firmware. I guess if we only care about stm32 based boards (which is fine by me) then cube would be the way but if we want to support other boards then cmsis would be better?

I still had no luck finding the datasheet covering the step down converter "BN0R" or "BN06" (8 pin)

Same here, also couldn't find anything. Looks like a SOT23-6 package so also added that to my search. I guess they would reuse these regulators in other products as well, so maybe we could try to find circuit diagrams for their other products and have a look there. They would also probably use a Chinese supplier like lcsc.com which might have devices which is hard to find in the rest of the world, so might be worth seeing what they have...

Is it just powering the oled and nothing else? Why is it not placed closer to actual power input?

Well and also the MCU and TLE5012 off course, I think because of the opto-couplers they probably didn't have any board space available close to the connectors anymore. The middle would probably also be a bad idea as it would then be directly above the encoder so next best thing was probably where it is now. In V2 I think the first switching regulator provides the 5V needed by the CAN transceiver and then the AMS1117 gives 3.3V to the rest of the system again.

kablek commented 3 years ago

Looks like this time they used the "cmsis" framework where previously it was "stm32cube".

I believe that is incorrect. I believe both sources were generated by the STM32CubeMX software, which is a great tool I also use (I have reverse engineered the hardware of 1.0 into CubeMX, it just makes everything sooo much easier). They did use slightly different settings to generate the code into multiple files instead of one. There are some slight differences in folder structure, but it should be pretty interchangeable between the two MCUs. CMSIS is just a standard for writing low-level Cortex core code, and it is actually used in the CubeMX-generated code.

I guess if we only care about stm32 based boards

Well it is also fine by me, but how can we be sure they will not expand into other MCUs? it seems a bit unlikely, I would start with STM32 in focus, but I believe it is good idea to keep cross-compatibility of code in mind when writing it.
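One way to keep that portability in plain C is to route all hardware access through a small table of function pointers, so the control code never touches MCU registers directly. A rough sketch follows; `board_ops_t`, `sample_and_drive` and the fake board are made-up names for illustration, and the fake stands in for a real STM32 port.

```c
#include <stdint.h>

/* Hardware access points a board port would have to provide (hypothetical set). */
typedef struct {
    uint16_t (*read_encoder)(void);              /* raw angle from the sensor */
    void (*set_coil_pwm)(uint8_t a, uint8_t b);  /* duty for the two VRef channels */
} board_ops_t;

/* Board-independent code only sees the table, never the registers.
 * The body is placeholder logic: read the angle, write the two duties. */
uint16_t sample_and_drive(const board_ops_t *board, uint8_t duty_a, uint8_t duty_b)
{
    uint16_t angle = board->read_encoder();
    board->set_coil_pwm(duty_a, duty_b);
    return angle;
}

/* A fake board for host-side tests; a real port would wrap the MCU drivers. */
static uint16_t fake_angle = 1234;
static uint8_t fake_duty_a, fake_duty_b;

static uint16_t fake_read_encoder(void) { return fake_angle; }
static void fake_set_coil_pwm(uint8_t a, uint8_t b) { fake_duty_a = a; fake_duty_b = b; }

static const board_ops_t fake_board = { fake_read_encoder, fake_set_coil_pwm };
```

Supporting another MCU would then mean writing one more `board_ops_t` instance rather than touching the control code, and the fake board allows testing on a PC without any hardware.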

They would also probably use a Chinese supplier like lcsc.com which might have devices which is hard to find in the rest of the world, so might be worth seeing what they have...

That is VERY true. It is pretty urgent we figure out what the hell is going on with those Switching supplies. Marking of 220uH vs. 6.8uH between different boards is very suspicious. It is order of magnitude of difference, do boards with 220uH work at different switching frequency than 6.8uH? that doesn't make much sense with same chip being used on both boards, we have to investigate this first and figure out if it is actually power supply issue.
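For a rough feel of what the inductor value alone does, the textbook ripple estimate for an ideal buck converter in continuous conduction is delta_I = (Vin - Vout) * (Vout / Vin) / (L * f). A sketch of that formula follows; the 24V input, 3.3V output and 500kHz switching frequency used in the note below are placeholder values, since the actual regulator and its frequency are still unknown.

```c
/* Peak-to-peak inductor ripple current of an ideal buck converter in
 * continuous conduction mode. All arguments are in SI units (V, H, Hz). */
double buck_ripple_amps(double vin, double vout, double inductance_h, double fsw_hz)
{
    double duty = vout / vin;  /* ideal duty cycle */
    return (vin - vout) * duty / (inductance_h * fsw_hz);
}
```

With those placeholder values, `buck_ripple_amps(24.0, 3.3, 220e-6, 500e3)` and `buck_ripple_amps(24.0, 3.3, 6.8e-6, 500e3)` differ by the ratio of the inductances, 220 / 6.8, so roughly 32 times more ripple with the smaller part at the same frequency.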

In V2 I think the first switching regulator provides the 5V needed by the CAN transceiver and then the AMS1117 gives 3.3V to the rest of the system again

Agree, we do need confirmation on that tho. Using a switcher to regulate to 5V and then stepping down with a 1117 series regulator is great practice, since linear regulators like the 1117 really do produce a clean output.

kablek commented 3 years ago

In my fork of this code I have added folder with fresh blank code doing only OLED stuff. Despite removing everything else, so MCU only does OLED pretty much, there are glitches and artifacts on screen. My board has 220 labeled inductor and BN06 labeled chip. I see there have been new findings on DC/DC chip. Mayhaps I shall do some hacking up to see if external clean power source improves things? Maybe we are chasing red herring, maybe there is something about power supply.

arrowcircle commented 3 years ago

@kablek Hey! If you going with full rewrite for v2 with can bus and other interfaces it looks like a good idea to go with RTOS. There are a lot of bad options, like freertos bundled with cube, but not of them allow to make reliable system, that will work with high-load buses. Next thing is that STMCube HAL is horrible, but there are other HAL options. Some RTOSes do have own hal implementation (like ChibiOS). There are also commercial solutions, but not of them available for free for open source projects. Another thing to consider is build system. Platformio looks like easy solution for everyone, but it does not support most of viable options. Also, do you consider switching to c++ to easily isolate code and adding tests?

What do you think about this problems?

swanepoeljan commented 3 years ago

@arrowcircle You made some interesting remarks. To better understand the issues and solutions you raised could you please give some context to your statements. These are topics I often wonder about and would be interesting to also hear your reasoning.

looks like a good idea to go with RTOS

What advantages would an RTOS bring in this case? Would it not be an overkill (eating up precious cycles, SRAM and FLASH) for such a small and focused application?

STMCube HAL is horrible

In which way do you mean?

Platformio looks like easy solution for everyone, but it does not support most of viable options.

Which options are you referring to?

arrowcircle commented 3 years ago

@swanepoeljan

What advantages would an RTOS bring in this case? Would it not be an overkill (eating up precious cycles, SRAM and FLASH) for such a small and focused application?

RTOS will help managing priorities managing display, doing work with driver, receiving commands from STEP DIR / CAN bus. Super loop with ISRs is a mess. Using RTOS will make code much cleaner and separated from each other. It will eat resources, but I don't think this will be a problem for f103 chip.

In which way do you mean?

It's inconsistent. The current version from ST is not compatible with the current version from platformio (2 or 3 years old). There were times when the docs and examples were out of sync, so the examples from the docs did not work. If you look into embedded-hal from Rust you will see why the HAL from ST is not so good. And there is no way they have something good with such badly generated code from CubeMX, which forces you to use comments to mark the places where your code goes in the file.

Which options are you referring to?

For f103 platformio does support these frameworks:

Arduino - not an option in any way, does not support RTOS
mbedOS - does have RTOS, but little outdated and CAN bus support is buggy + nobody cares about that. It starts to miss messages even on 10 messages/sec. That's unacceptable.
CMSIS - Very low level and need integration with RTOS. I dont think this will help making clean solution
CubeMX - Outdated version of ST libs and HAL. Need to integrate RTOS manually (easy thing, there are a lot of examples on github)
opencm3 - Opensource hal for cortex cores, does not have RTOS, needs manual integration. Didn't test this thing, but looks promising.
Zephyr RTOS - it's more about IoT, not industrial automation or something. I think it does not fit because of IoT focus.

None of these frameworks have perfect fit. Only great thing is that they are supported by platformio, and this means easy development and customization. But if you look around and leave platformio there are a lot of variants to use.

Rust. Support of can bus for F1 is almost merged into master branch. I tested it and it works fine. There is RTIC "RTOS" and there are production projects based on these platform, but it's highly experimental. It's easy to setup and have super tooling and environment comparing to anything else.
ChibiOS/RT or NIL - open source RTOS, super reliable, used in PX4 autopilot software, does have own HAL, does support CAN. Build on windows is possible with chibiStudio, but on other systems make work fine.
Micrium uCOS 2/3 was recently open sourced by Silicon Labs. It's a mature and certified RTOS, but not everything is open sourced. There are a lot of examples for different boards available on the Micrium website after registration.

There are a lot of other options available, but I showed 3 principally different things: experimental rust, OSS ChibiOS and enterprise uCOS.

Quas7 commented 3 years ago

Maybe we should really first check that the actually implemented F030 with only an M0 core @48MHz is RTOS capable, as it is unfortunately not an F103 with an M3 core. :) Any known examples of mechanical PIDs on an F030 with an RTOS?

kablek commented 3 years ago

it looks like a good idea to go with RTOS.

Weeeellll I did think about RTOS multiple times but I believe it is not as good idea as it sounds. I did work with mbedOS and FreeRTOS and another one I forgot name of... I used it on far more powerful platform, with lower latency needs than we are having here.

like freertos bundled with cube

Right I see you mention FreeRTOS as bundled with cube and being bad. TBH in my experience FreeRTOS is not bad, it was in fact the one I used the most. As for cube being bad, I don't think intention with cube is to use it to actually generate code, modify and regenerate etc. Yes, it happens few times in the early code development, it is tool for early code generation, not full development tool imo.

STMCube HAL is horrible

Yes, but we are currently not using it, we are using low level drivers, they are doing updates on cube and HAL. HAL is not as efficient as LL drivers, actual implementation of this is a bit questionable still.

RTOS will help managing priorities managing display, doing work with driver, receiving commands from STEP DIR / CAN bus. Super loop with ISRs is a mess. Using RTOS will make code much cleaner and separated from each other. It will eat resources, but I don't think this will be a problem for f103 chip.

Yes, RTOS does bring that to table, along with performance hit, complexity with variable access, thread priorities can lead to bigger mess in the end.

Platformio looks like easy solution for everyone, but it does not support most of viable options.

Agree on that, however, it does bring to the table compatibility, ease of use for pretty much any hardware platform (beyond STM32) and it is currently used. I am not sure fixing what ain't broken is the way to go. Software-wise it doesn't limit us, it does limit debug capabilities tho.

Also, do you consider switching to c++ to easily isolate code and adding tests?

I am considering C++ a lot! Object oriented goes along really well with portability and code readability.

Maybe we should really first check that the actually implemented F030 with only an M0 core @48MHz is RTOS capable, as it is unfortunately not an F103 with an M3 core.

M0 is capable of running an RTOS, but I don't think it is powerful enough that we should be willing to risk performance for it. After all, any latency due to software could affect the driver's performance while printing. I believe performance should be top priority, because having steppers operate in closed loop but not as well as they should seems worse than just using an open loop setup with Trinamic drivers.

As for CAN, it does not yet bring too much to the table. Most 3D printer motherboards don't really support it, most of 3D printers firmwares have limited support I believe. Note that there are also UART and I2C (with no connector, just two accessible pads) on V1.0 board. It would be good to implement it, but I believe basic functionality is more of a priority in this moment.

I can't tell much about V2.0 hardware, since I haven't looked into it and I do not have V2 board. It would be nice to see BTT have some interest in better firmware for all their drivers also and releasing some more data and actually support open source development for THEIR hardware.

arrowcircle commented 3 years ago

@kablek

As for CAN, it does not yet bring too much to the table. Most 3D printer motherboards don't really support it, most of 3D printers firmwares have limited support I believe. Note that there are also UART and I2C (with no connector, just two accessible pads) on V1.0 board. It would be good to implement it, but I believe basic functionality is more of a priority in this moment.

Closed loop steppers could be used not only in 3d printers, but in industrial automation. And can bus is one of the popular and cheap buses (compared to industrial ethernets). Also, duet 3 does support CAN bus in 3d printer world.

If the code is reusable, then using it on both M0 and M3 cores will be easy. But adding different input sources will make it painful without an RTOS. And of course it's about sanity and tests. Maybe the performance loss caused by an RTOS will be invisible, but it will allow a cleaner architecture.

Quas7 commented 3 years ago

Above it is stated that one could achieve superior print results. I understand better reliability, but I am totally unaware of any quality improvement of closed loop vs open loop in 3D printing. I understand the benefits of closed loop in milling/routing CNC applications with heavy load changes, but where does this happen in 3D printing? 10k mm/sec^2 acceleration maybe?

dzid26 commented 3 years ago

@kablek I am not sure if that is something you guys would be looking at, but the software from MKS is quite decent. Now that the processors match between MKS and V2.00 (STM32F103), it would be easy to make it work. It is based partially on low-level registers and on the Standard Peripheral Library, which is not the same as what CubeMX generates nowadays. I have created a branch for MKS42B that handles CAN. I am definitely interested in CAN for robotics. Currently MKS uses 5K of global RAM (25% of the F103's) and the F030 has 8K, so it should be enough.

While it would be easier to use mks code, I think starting from scratch would be probably better long term.

I was also thinking of FreeRTOS too but not sure if there would be much benefit. For control loop it would need use interrupts anyway, as normal tasks are >1ms. I do think it would add cleanness to the code though.
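A sketch of the interrupt-plus-background-loop split described above, without an RTOS: the time-critical control step runs in a timer interrupt, and slow work such as the display is deferred to the main loop through a flag. `control_loop_isr`, `background_poll` and `DISPLAY_DIVIDER` are hypothetical names, and the hardware-specific parts are left as comments.

```c
#include <stdbool.h>
#include <stdint.h>

#define DISPLAY_DIVIDER 1000u  /* hypothetical: one refresh per 1000 control ticks */

static volatile uint32_t control_ticks;     /* written only by the ISR */
static volatile bool display_refresh_due;   /* set by the ISR, cleared by the main loop */

/* Would be called from a hardware timer interrupt at the control rate. */
void control_loop_isr(void)
{
    /* time-critical work goes here: read encoder, run PID, update coil outputs */
    control_ticks++;
    if (control_ticks % DISPLAY_DIVIDER == 0u) {
        display_refresh_due = true;
    }
}

/* Called repeatedly from the main loop; returns true when slow work was done. */
bool background_poll(void)
{
    if (!display_refresh_due) {
        return false;
    }
    display_refresh_due = false;
    /* slow work goes here: redraw the OLED, handle UART commands */
    return true;
}
```

The main loop would simply call `background_poll()` forever, so a slow display update can be preempted by the control interrupt at any time but never the other way around.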

As for the IDE, I am using platformio with STM32 right now and I keep thinking it is a big mess without proper support for the latest HAL.

I would say STM32Studio would be more professional.
Or pure makefile/openocd(GDB) - more flexible (can work with VSCode too). Perhaps a combo of the two above depending on someone workflow.

kablek commented 3 years ago

I will try and see what the MKS code has to offer. I agree CAN does give a lot of options and the board is not only for 3D printing, but 3D printing is kind of the main targeted market. As far as benefits for print quality go... yeah, agree, we don't really know how much it brings to the table, if much at all (there is probably more benefit in using better open loop drivers than in closing the loop). Self PID calibration could have the benefit of reducing artifacts and producing smoother movement, assuming we can do it in real time with great timing accuracy.

I will take a look at MKS code for sure.
