motor does not reach max speed like old Gen2.x binary

RoboDurden commented 1 year ago

continued from https://github.com/Candas1/Split_Hoverboard_SimpleFOC/issues/4#issuecomment-1604941688

In the past, I had problems with SimpleFOC due to the limited speed of the encoder readings. The Arduino interrupt handling code has a lot of overhead, made worse by the various processor layers. If the code handling the hall sensors is not fast enough, speed is limited by that. And we are using a 48MHz processor, pretty slow all considered.

I don't have time today, though. Over the weekend, I'll look at ways to speed up the interrupt handler and see if that improves anything.

I might be easier to slow down the hall sensor reading artifically and see if that has an effect on the motor max speed :-)

Candas1 commented 1 year ago

I am still not sure if the sensor alignment is good.

One way to check this is described here

Do you see/hear differences in how the motor spins in each direction with both binaries? A wrong offset could lead to phase advance in one direction, and phase delay in the other direction.

RoboDurden commented 1 year ago

you can hear in the video for yourself but no the sound is the same. The current however is about 540 mA spinning "left" and about 580mA spinning "right". So there indeed is some asymmetry

Good night ! i am tired now and i still want to finish the rewatch of https://en.wikipedia.org/wiki/The_Hunt_(2012_film) Good dvd...

robcazzaro commented 1 year ago

@RoboDurden can you please do a quick test

Add the following to the plaformio.ini file

build_unflags = -Os
build_flags = -Ofast

It compiles the binary with the highest level of optimizations possible, and if my theory is correct (slow interrupt), it should speed up the motor by a small but measurable amount.

Meanwhile I'm looking at the interrupt code

RoboDurden commented 1 year ago

@robcazzaro , I would rather do that tomorrow when I have arrived at my solar warehouse. I have already packed the test setup in a box. And my notebook is occupied with 3d printing..

RoboDurden commented 1 year ago

If the effect is small i will need UART rpm logging and I have yet to setup that again.

robcazzaro commented 1 year ago

@robcazzaro , I would rather do that tomorrow when I have arrived at my solar warehouse. I have already packed the test setup in a box. And my notebook is occupied with 3d printing..

Of course, I didn't mean "right now" 😉 whenever you have time, no hurry

I saw a few Android apps that can measure RPMs if you stick a white tape to the wheel, maybe that is easier than enabling UART. I mean, UART is good to have, but an RPM meter app might be a quick way to see if there are differences

robcazzaro commented 1 year ago

@RoboDurden I understand the interrupt handling code pretty well now, and there are a few hacks that would speed things up. But it requires a few changes to a few files and possibly breaking compatibility with the official GD32 core.

If, whenever you have time, you can confirm that compiling with -Ofast (the platformio.ini changes above) makes a difference, I will send you the modified files. Otherwise there's no point in having you waste time with the changes.

RoboDurden commented 1 year ago

Okay that were my changes:

;build_flags = -D __PIO_DONT_SET_CLOCK_SOURCE__
;   -D __SYSTEM_CLOCK_48M_PLL_IRC8M_DIV2=48000000
build_unflags = -Os
build_flags = -Ofast

I did not hear a higher speed and android apps do not use camera but inernal IMU and you have to put your phone on the turntable..

But current was sligtly less. before 590mA and 550mA (left/right) and with your requested changes: 550mA and 500mA

So i recorded the audio: www.robosoft.de/forums/rpmsound.mp3 first you hear simpleFOC-old left/right, than your new ini file left/right, and last the gen2.x left/right And i made a fft analysis from the constant part of all three left(?) sounds: rpmsounds

So yes, your requested ini changes result in a 476 Hz / 456 Hz = 4.3% speed increase, but the gen2.x still has another 547 / 476 = 15% rpm increase on top of your changes.

This was a long sunday for me. Got 3.5 kW old solar panles for free: IMG_20230625_203115_copy_2024x1518 IMG_20230625_204106_copy_1518x2024

Good night and good luck :-)

robcazzaro commented 1 year ago

That is really good news @RoboDurden. We can confirm it's a IRQ overhead issue. I will send you a couple of file to try before my end of day (overwriting the existing one for now). Thanks for doing this, and clever use of the audio, well done!

Just FYI: the GD32 Arduino core has some weird behavior with the interrupts, I opened an issue in their repository to be sure https://github.com/CommunityGD32Cores/ArduinoCore-GD32/issues/105.

TLDR: The GD32 interrupt handler (at a hardware level) groups GPIO 0-1, GPIO 2-3 and GPIO4-15 as separate interrupt source. So the EXTI01 handler must check if GPIO0 or 1 fired. Same for EXTI23. The problem is that any interrupt for pins 4 to 15 fires only one interrupt, and the handler must scan all 12 possible sources to see which one fired. So the HALL sensor on PB11 takes a long time, and the one on PC14 even more. The easy hack is to only check for pin=11 or pin=14 in the EXTI4-15 handler (which makes the code much, much faster), but I wanted to fully understand the GD32 logic for multiple ports and pins before messing with it.

Also opened https://github.com/CommunityGD32Cores/ArduinoCore-GD32/issues/106 to see if we can speed up the IRQ handler without having to change the GD32 Arduino files

Bottom line: as long as we know what interrupts we need for GPIO pins change, we can easily speed things up, and I have working code. Just want to finish testing it

robcazzaro commented 1 year ago

Ok, here are 3 files to replace whenever you have time @RoboDurden

C:\Users[yourname].platformio\packages\framework-arduinogd32\cores\arduino\gd32\gpio_interrupt.c [project_dir].pio\libdeps\GD32F130C8\Simple FOC\src\sensors\HallSensor.h [project_dir]..pio\libdeps\GD32F130C8\Simple FOC\src\sensors\HallSensor.cpp

Please note that one is in the platformio GD32 platform directory, the other 2 in the project directory. And please check that the changes are compiled. Incidentally, the VS Code UI sometimes gets confused by all the nested defines, so in the gpio_interrupt.c file, the portion following `#elif defined(GD32F3x0) || defined(GD32F1x0)" is shown as disabled (greyed out), but it's active and gets compiled (GD32F1x0 is defined). It's a VS code UI bug. That one is the file I hope not to have to change thanks to this

Please also keep the -Ofast optimization enabled for now. It will make debugging harder, but a 4% gain is non-trivial. There are further optimizations possible in the Hall sensor code (updateState()), but those require more thinking. The ones I did for now are pretty safe.

If anything is unclear, just ask. I tested everything and it seems to work as expected, but it would not be the first time I screw up 😊

hall_IRQ.zip

robcazzaro commented 1 year ago

BTW: I have been looking at other ways to potentially speed things up, but would require changes here and there. If the interrupt enhancements are not enough, I have a few more things to try. But the GD32 at 48MHz is slower than most other processors recommended for SimpleFOC, so we might have to hand-optimize more. To be fair, the FOC algorithms are more compute-heavy than the type of control implemented by most hoverboards

Candas1 commented 1 year ago

Don't be pessimistic, I am sure we will get there. We are not doing foc yet, but I assume that the boards with current sensing had foc implemented.

We for sure will have to optimize because of the arduino-gd32 and simplefoc overhead.

RoboDurden commented 1 year ago

no, the new three file do not make a difference: rpmsounds2

The gen2.x also has a speed difference with left/right, but only 0.7%. the simpleFOC has 1.5%.

I think the gen2.x has no interrupts for the three hall sensors at all but simply reads the state in the bldc.c 16 kHz CalculateBLDC():

    // Read hall sensors
    hall_a = gpio_input_bit_get(HALL_A_PORT, HALL_A_PIN);
    hall_b = gpio_input_bit_get(HALL_B_PORT, HALL_B_PIN);
    hall_c = gpio_input_bit_get(HALL_C_PORT, HALL_C_PIN);

So i do not really think that the three hardware interrupts from HallSensor class take longer or are less accurate ?

The EFeru FOC is faster than the Gen2.x anyway and with field weakening it can get even faster (20%). So maybe we should procceed with current sensing to get simpleFOC foc working and then have new options to tune speed and power consumption.

Candas1 commented 1 year ago

Foc is actually slower than trapezoidal or sinepwm, even with EFeru's firmware.

robcazzaro commented 1 year ago

Bummer. And thanks @RoboDurden for your additional tests.

Clearly, even if the interrupt code was inefficient, it did not consume enough cycles to be a meaningful difference. Was a low hanging fruit, but not the right one.

There might be other opportunities (for example the sensor.updateState() function is over complex and calls bloated functions (like micros() that is needlessly complex, would be easier to re-implement using DWT_CYCCNT and knowing that the core clock is 48MHz). @RoboDurden if you try following the hall sensor code, you will see that a simple 3-line GPIO read in the gen2.x software is more than 50 lines, called each hall interrupt (open and close). SimpleFOC abstraction makes things more complicated (SimpleFOC can for example use SPI encoders for angle control, and the same code works with hall sensors)

As for the speed difference between boards, that is 0.7%. We are running the GD32 using the internal oscillator, which has a 1% accuracy, so that could explain the small difference you are seeing.

The next step unfortunately it's profiling the code. The best way is to use SWD/SWO, but if you have a cheap STLink Chinese clone, the SWO pin is not enabled. It can be enabled with https://lujji.github.io/blog/stlink-clone-trace/ and it's well worth it, but time consuming and you need to be good at soldering on the small STM32 pins. With that enabled, you can ten use the STM32CubeIDE from ST to run the SWV Statistical Profiling (it's under Window, Show View, Other Capture

That would get you a complete profiling profile, but might be too slow over SWO and pretty hard to get running.

Otherwise use macros and DWT_CYCCNT to measure execution cycles by function, which is probably the best way to go for us here. There are even available implementations like https://github.com/Serj-Bashlayev/STM32_Profiler. DWT_CYCCNT is a standard M3 core register, so available on the GD32. That library prints using SWO (once again), but it could also print to a debug output after a fixed time profiling.

Or it would be easy to enable DWT_CYCCNT once at the beginning, create a structure for the various functions we use, store the DWT value upon entering the function, subtract the value at the end from the start, and add the instruction count to the right element of the structure for that function

//the following code needs to run only once when enabling the profiling

// enable DWT
CoreDebug->DEMCR |= 0x01000000;
// Reset cycle counter
DWT->CYCCNT = 0;
// enable cycle counter
DWT->CTRL |= 0x1;

// for each function
uint32_t start = DWT->CYCCNT;

// code to be measured here

structure.this_element += DWT->CYCCNT-start;

Then stop the code after, say, 1000 loops and print the values in the structure. the DWT counter can go up to 4294967295, so at 48MHx, as long as the profiling session ends within 89 seconds of the DWT->CTRL |= 0x1 instruction, we won't have to worry about handling the counter overflow

Alas, I can't profile on the Bluepill dev board. Without a motor and hall sensors, the code won't work as expected. If you think you can run the profiler, I'd be happy to then analyze the code and see where we can gain. If you can't, I'll have to set up my only board and hope for the best

RoboDurden commented 1 year ago

Well all the many lines you wrote here about such a profiler does not make me wanting to do it at all :-/ My dso single channel did show that the simpleFOC and gen2.x waveforms are quite alike. So.why not find a clever way to see what is different... I think candas mentioned a different alignment. I think I have a cheap multi channel logic analyser in my train station but I forgot to take it with me to my solar warehouse. Logging the three hall inputs and some gate outputs might shed some light in what simpleFOC is doing differently or badly.

If there are some delay parameters we could program an Auto-Tuning that optimizes rpm with a steepest descent over all parameters..

I would prefer some direct approach instead of blindly increase analyzing..

Good night for today's

robcazzaro commented 1 year ago

Profiling is not blindly increasing analyzing 😉. It's looking for inefficient areas of the code that are holding back performance. If we are performance limited now, it's only going to get worse once we add the more complex elements of the SimpleFOC code. Insufficient processor performance is one of the most common causes of problems reported in the SimpleFOC forums.

You can't examine external signals to see where the problem is. Let's assume we find that the PWM waveform is 3% delayed compared to other code. How would we go about changing that, if we don't know what's causing it?

The main take away from my previous comment is that measuring performance is as easy as adding 2 instructions, one at the beginning, one before the exit of each function, plus half a dozen at the end of setup(). If we use macros for the beginning/end of each function, we can enable/disable perf measurement during development. And that will benefit everything we do and help optimize in the future.

That is the clever way to see what's happening, IMHO. But I'd be happy to be proven wrong, if you can find something else that works. And, yes, it might be just a hall sensor alignment, that would be great. I'll hold on doing anything wrt profiling until you fully evaluate that option.

Candas1 commented 1 year ago

Just a reminder. Gen 2.x is running at 72Mhz, the simpleFOC one at 48Mhz.

RoboDurden commented 1 year ago

I simply am the totally wrong guy to analyze bad code instead of writing good code right from the beginning :-/ ?? The GD32 runs at 72 MHz, Gen2.x and simpleFOC both run on the same hardware ??

Speaking of bad code, putting a time critical function in to loop() should be an absolute no-go. It only needs a delay(1); to crush the motor output:

void loop()
{
  // main FOC algorithm function
  // the faster you run this function the better
  // Arduino UNO loop  ~1kHz
  // Bluepill loop ~10kHz 
  motor.loopFOC();
  delay(1);

rpmsounds3

I do not really understand The Gen2.x code comments. bldc.c:

// Calculation-Routine for BLDC => calculates with 16kHz
void CalculateBLDC(void)

Which is called by an interrupt handler in it.c:

// This function handles DMA_Channel0_IRQHandler interrupt
// Is called, when the ADC scan sequence is finished
// -> ADC is triggered from timer0-update-interrupt -> every 31,25us
//----------------------------------------------------------------------------
void DMA_Channel0_IRQHandler(void)
{
    // Calculate motor PWMs
    CalculateBLDC();

1s / 31,25us = 32 kHz, not 16 kHz. But nice idea to trigger the CalculateBLDC() when all adc input are ready to be read. I guess that handler is initialized in setup.c::ADC_Init() :

    // Configure ADC clock (APB2 clock is DIV1 -> 72MHz, ADC clock is DIV6 -> 12MHz)
    rcu_adc_clock_config(RCU_ADCCK_APB2_DIV6);
    // Interrupt channel 0 enable
    nvic_irq_enable(DMA_Channel0_IRQn, 1, 0);

But i do not understand how RCU_ADCCK_APB2_DIV6-> 12 Mhz translates to 32 kHz.

As all three phase channels should be absolute in sync, i also do not understand why three different timers are needed. But maybe all 4 (or 5 for the C8 variant) 16-bit timers are in sync as they are based on one internal counter..

What i like about the Gen2.x code is that it is plain simple. One CalculateBLDC() that runs at whatever kHz and everything is in sync.

I guess we are free to overwrite some simpleFOC classes and replace code the object oriented way. Instead of modififying simpleFOC or GD32-Core files. I am the wrong guy for analyzing code without first having understood what it is intended to do.

RoboDurden commented 1 year ago

okay, max speed is even higher (545 Hz left and 555 Hz right) then Gen2.x when setting these two init values to 20 instead of 26 (Volt):

  driver.voltage_power_supply = 20;
  driver.voltage_limit = 20;

But the current high rises to 850-900 mA instead of the Gen2.x with less than 500 mA :-/ I have the feeling that the problem is not code optimization but wrong pwm alignment and therfore inefficient motor control. But with my 1 channel dso i fear i can not dig deeper.

Update: 22 Volt lead to 537 Hz and 544 Hz at about 700 mA. 24 volt give 502 Hz and 502 Hz (left/right) at about 600 mA.

It think these two parameters get used here:

// Set voltage to the pwm pin
void BLDCDriver6PWM::setPwm(float Ua, float Ub, float Uc) {
  // limit the voltage in driver
  Ua = _constrain(Ua, 0, voltage_limit);
  Ub = _constrain(Ub, 0, voltage_limit);
  Uc = _constrain(Uc, 0, voltage_limit);
  // calculate duty cycle
  // limited in [0,1]
  dc_a = _constrain(Ua / voltage_power_supply, 0.0f , 1.0f );
  dc_b = _constrain(Ub / voltage_power_supply, 0.0f , 1.0f );
  dc_c = _constrain(Uc / voltage_power_supply, 0.0f , 1.0f );
  // hardware specific writing
  // hardware specific function - depending on driver and mcu
  _writeDutyCycle6PWM(dc_a, dc_b, dc_c, phase_state, params);
}

RoboDurden commented 1 year ago

short note to @Candas1 , that add_RTT_task.py needs "\"$PROJECT_PACKAGES_DIR/tool-openocd/bin/openocd\"" + to handle path names with " " spaces.

Nice to have log output without another cable. But the sensor.getVelocity() output is not smooth at all:


GD32: 26500     0: 0    1: 1    2: 1    angle: 103.81   speed: 41.65
GD32: 26600     0: 1    1: 0    2: 0    angle: 107.79   speed: 38.98
GD32: 26700     0: 0    1: 1    2: 0    angle: 111.70   speed: 41.68
GD32: 26800     0: 0    1: 0    2: 1    angle: 115.61   speed: 35.01
GD32: 26900     0: 1    1: 1    2: 0    angle: 119.59   speed: 40.85
GD32: 27000     0: 0    1: 1    2: 0    angle: 123.43   speed: 41.02
GD32: 27100     0: 1    1: 0    2: 1    angle: 127.41   speed: 36.84
GD32: 27200     0: 0    1: 1    2: 0    angle: 131.39   speed: 41.31
GD32: 27300     0: 0    1: 1    2: 1    angle: 135.23   speed: 41.83```

Do not think that the rpm at max speed changes by 40.85/35.01 = 17% within 100ms

Maybe we should indeed implement/derive our own HallSensor class :-/

robcazzaro commented 1 year ago

@RoboDurden the sensor is not filtered in SimpleFOC and very "spiky". There is a new smoothedsensor class being developed for that reason.

The folks running the forum recommend using motor.shaftVelocity() instead of sensor.getVelocity(). That is filtered and smoothed

As for your question on the clocks, the GD32 (like STM32) have multiple buses and clocks, each with PLLs and dividers, and it's a puzzle trying to understand how fast each bus runs. for STM32, there is a function in the STM32CubeIDE that helps calculate each and every PLL and divisor and gives you an idea in MHz or kHz. I haven't checked, but if Gen2.x is running at 72MHz, all bets are off with clocks and timing accuracy and I doubt it's. But if you want to truly understand the actual values, you need to look at the clock initialization and note each and every value being configured, then you can know the actual speed of a specific bus or component (in some cases, the speed can be half of the clock, depending on the edge polarity). Just to give you a sense, here's the clock tree for the GD32F130 (page 79, User manual)

robcazzaro commented 1 year ago

@RoboDurden one more comment re:

void loop()
{
  // main FOC algorithm function
  // the faster you run this function the better
  // Arduino UNO loop  ~1kHz
  // Bluepill loop ~10kHz 
  motor.loopFOC();
  delay(1);

Adding a delay(1) is really a lot in what should be a tight loop. A loop() with only a single instruction of delay(1) runs at less than 1kHz, which is way too low for SimpleFOC. So adding a huge delay of 1msec in that loop will never work. For a more relevant test you might want to use delayMicroseconds(1); If that causes problems, that means that the processor is not fast enough with the current code. If even something like delayMicroseconds(10) causes no problems, that means that the code is fast enough and the problem is not code efficiency (but settings and hall alignment)

The goal is to run the loop at at least 10kHz to have a smooth performance. Alas, SimpleFOC is not timer-driven, but runs in a tight loop which is a bad way to write time-critical code.

From the Gen2.x code I can't understand how the GD32 clock is initialized. Usually there is a call to SystemClock_Config() to initialize the GD32 clock, but the Gen2.x code uses a non standard way.

    //SystemClock_Config();
  SystemCoreClockUpdate();

I don't have Keil, so can't quickly test. I could port the code to VSCode, but it will take some time.

@RoboDurden If you have spare time, could you print the value of SystemCoreClock ? That variable is the GD32 clock (48000000 or 72000000). Knowing it will help making comparisons to the gen2.x code

Candas1 commented 1 year ago

We are free to move loopfoc and move functions to interrupts if we want to

https://community.simplefoc.com/t/simplefoc-theory-timing/2310/2?u=candas1

robcazzaro commented 1 year ago

Good point @Candas1. Just a note that if we do that, we need to change how the hall sensor code works and disable those interrupts. Basically read the hall sensor upon receiving the timer interrupt like Gen2.x does (which would also make the velocity calculation much simpler and faster compared to the current one using the time between changes to calculate velocity, which as Robodurden noticed is very spiky). And make sure that the ADC code doesn't rely on interrupts either (Gen2.x uses ADC interrupts, but there are ways to start the ADC without interrupts). And look at how USART uses the interrupts in the Arduino code

Otherwise we have to deal with re-entrant interrupts and that causes all sorts of problems.

Candas1 commented 1 year ago

Yes we will trigger inserted adc at timer update event.

Candas1 commented 1 year ago

Some good advices were shared just now https://community.simplefoc.com/t/hall-sensor-bldc-motor-does-not-turn/3046/10?u=candas1

RoboDurden commented 1 year ago

In honor of @flo199213 who ported the good old Niklas Fauth hoverboard firmware to the split boards, we should refer to "Gen2" as my "Gen2.x" only added odometer, better serial and different board layouts.

Our main loop of the simpleFOC fimrware normaly takes about 110 us but max 1122 micro seconds, i guess when debug output gets sent:

  long long iMicrosNow = _micros();
  long iMicros = iMicrosNow - iMicrosLast;
  if (iMicrosMax < iMicros) iMicrosMax = iMicros;
  iMicrosLast = iMicrosNow;
  motor.loopFOC();
...
  if (iTimeSend > iNow) return;
  iTimeSend = iNow + TIME_SEND;
  DEBUG( 
    OUT2T("SystemCoreClock",SystemCoreClock ) 
    //for (int i=0; i<HALL_Count; i++)  OUT2T(i,aoHall[i].Get())
    OUT2T("angle",sensor.getAngle()) 
    OUT2T("speed",sensor.getVelocity())
    OUT2T( iMicros , iMicrosMax )
    OUT2T( 1000.0f/iMicros , 1000.0f/(float)iMicrosMax )
    OUTLN()
  )
  iMicrosMax = 0;

SystemCoreClock: 72000000       angle: 284.77   speed: 0.00     116: 307        8.62: 3.26
SystemCoreClock: 72000000       angle: 269.76   speed: -36.21   118: 294        8.47: 3.40
SystemCoreClock: 72000000       angle: 250.98   speed: -40.05   115: 1116       8.70: 0.90
SystemCoreClock: 72000000       angle: 230.66   speed: -44.78   114: 1114       8.77: 0.90
SystemCoreClock: 72000000       angle: 210.49   speed: -40.10   114: 301        8.77: 3.32
SystemCoreClock: 72000000       angle: 190.38   speed: -38.98   113: 1115       8.85: 0.90
SystemCoreClock: 72000000       angle: 170.27   speed: -42.86   113: 1115       8.85: 0.90
SystemCoreClock: 72000000       angle: 150.17   speed: 0.00     113: 301        8.85: 3.32
SystemCoreClock: 72000000       angle: 129.92   speed: -44.98   116: 1114       8.62: 0.90
SystemCoreClock: 72000000       angle: 109.75   speed: -40.80   125: 1122       8.00: 0.89
SystemCoreClock: 72000000       angle: 89.71    speed: -37.17   125: 308        8.00: 3.25
SystemCoreClock: 72000000       angle: 69.60    speed: -37.45   127: 312        7.87: 3.21
SystemCoreClock: 72000000       angle: 49.50    speed: -41.19   126: 313        7.94: 3.19
SystemCoreClock: 72000000       angle: 30.23    speed: -35.66   126: 311        7.94: 3.22
SystemCoreClock: 72000000       angle: 14.59    speed: -27.01   125: 316        8.00: 3.16
SystemCoreClock: 72000000       angle: 3.21     speed: -17.61   123: 308        8.13: 3.25
SystemCoreClock: 72000000       angle: -3.63    speed: -8.89    125: 313        8.00: 3.19
SystemCoreClock: 72000000       angle: -5.59    speed: -0.79    123: 310        8.13: 3.23
SystemCoreClock: 72000000       angle: -4.19    speed: 6.77     126: 1114       7.94: 0.90
SystemCoreClock: 72000000       angle: 1.75     speed: 16.40    117: 302        8.55: 3.31
SystemCoreClock: 72000000       angle: 12.36    speed: 23.44    116: 303        8.62: 3.30
SystemCoreClock: 72000000       angle: 27.09    speed: 33.13    114: 302        8.77: 3.31
SystemCoreClock: 72000000       angle: 45.52    speed: 37.23    113: 302        8.85: 3.31
SystemCoreClock: 72000000       angle: 65.35    speed: 40.35    115: 301        8.70: 3.32
SystemCoreClock: 72000000       angle: 85.24    speed: 41.09    112: 302        8.93: 3.31
SystemCoreClock: 72000000       angle: 105.07   speed: 36.76    113: 300        8.85: 3.33
SystemCoreClock: 72000000       angle: 124.97   speed: 39.38    115: 302        8.70: 3.31
SystemCoreClock: 72000000       angle: 144.65   speed: 39.49    114: 296        8.77: 3.38
SystemCoreClock: 72000000       angle: 164.41   speed: 36.42    113: 302        8.85: 3.31
SystemCoreClock: 72000000       angle: 184.17   speed: 38.72    112: 301        8.93: 3.32
SystemCoreClock: 72000000       angle: 203.99   speed: 41.02    115: 301        8.70: 3.32
SystemCoreClock: 72000000       angle: 223.82   speed: 40.83    114: 1113       8.77: 0.90
SystemCoreClock: 72000000       angle: 243.72   speed: 37.37    115: 304        8.70: 3.29
SystemCoreClock: 72000000       angle: 262.57   speed: 32.56    125: 312        8.00: 3.21
SystemCoreClock: 72000000       angle: 277.93   speed: 26.89    124: 310        8.06: 3.23
SystemCoreClock: 72000000       angle: 289.17   speed: 18.26    125: 311        8.00: 3.22
SystemCoreClock: 72000000       angle: 295.94   speed: 8.62     123: 308        8.13: 3.25
SystemCoreClock: 72000000       angle: 297.89   speed: 0.00     116: 313        8.62: 3.19

So 0.9 kHz some times is not acceptable and normal Arduino users will load the loop() with lots of i2c modules and even wlan or bluetooth anyway.

I think we should move on with low-sde current sensing because that will come with better timing than the hall sensors anyway ? When porting that low-side current sensor to GD32 we might want to introduce a new timer interrrupt for adc sampling anyway. Like letting it "called, when the ADC scan sequence is finished" = https://github.com/RoboDurden/Hoverboard-Firmware-Hack-Gen2.x/blob/main/HoverBoardGigaDevice/Src/it.c#L137 When we have such a 16kHz (or 32 kHz) interrupt we can overwrite the HallSensor class and call motor.loopFOC(); from there. That might simplify the alignment and we better understand what the difference to the good old Gen2 code is.

Candas1 commented 1 year ago

Yes we can use the adc interrupt, but we don't need the dma.

Let's implement, we will see what needs to be optimized.

I think segger rtt shouldn't introduce any delay, it's just reading from memory.

robcazzaro commented 1 year ago

Sounds like a good plan. Please assign to me any low level issue and I'll take care of it.

@RoboDurden thanks for verifying that Gen2 actually runs at 72MHz, which is a very high overclocking (50% above rated). While that might be ok for basic core math and some GPIO, more complex functions like PWM and especially ADC will be impacted. We should keep running at the rated 48MHz, even if that has a performance impact

@Candas1 RTT does have a small performance impact when sending data, but much less than any other channel (like UART), since it uses the existing SWD protocol. Having no debug is the best option for speed, but if any output is needed, RTT is by far the best option

RoboDurden commented 1 year ago

According to the manual the GD32F13x is a 72 Mhz chip: GD32F13x specs

robcazzaro commented 1 year ago

@RoboDurden not sure where you found that, but the datasheet has only 48MHz for the GD32F130xx https://www.gigadevice.com.cn/Public/Uploads/uploadfile/files/20230314/GD32F130xxDatasheetRev3.7.pdf and in this one https://www.gigadevice.com.cn/Public/Uploads/uploadfile/files/20230209/GD32F1x0_User_Manual_EN_Rev3.6.pdf it says Note:(1) The maximum operating frequency of GD32F150xx is 72 MHz, and the maximum operating frequency of GD32F130xx is 48 MHz.

It looks like 72MHz was corrected in a revision 3.2 1. Modify 72MHz system frequency to 42MHz. Jul.25, 2019. The GD32F150 runs at 72MHz, so maybe that's where the confusion comes from. It's entirely possible that most of the times it will work at 72MHz, but given how critical the correct working of the firmware is, I would not like to take that risk

RoboDurden commented 1 year ago

GD32F130K6U6_C81552.pdf I found this datasheet after the others did not have the K6 package pinout..

robcazzaro commented 1 year ago

Well, as you can see, that is an old revision from 2018. After 2019 all versions have the 48MHz clock

RoboDurden commented 1 year ago

@Candas1 i will delete my original Split_Hoverboard_SimpleFOCrepo so i can fork your repo, add the LowsideCurrentSensecode and make a pull request. Then @robcazzaro if he wish can start with a SimpleFOC/src/current_sense/hardware_specific/gd32/gd32_mcu.cpp. If not, i would try to port the Gen2 analog read of

typedef struct
{
  uint16_t v_batt;  
  uint16_t current_dc;
} adc_buf_t;

by adding the two low-side currents like

typedef struct
{
  uint16_t v_batt;  
  uint16_t current_dc;
  uint16_t current_B;  
  uint16_t current_C;  
} adc_buf_t;

As the Gen2 code does read v_batt and current_dc at 16 kHz (or 32 kHz?), i think that should also be good for the LowsideCurrentSense class.

Candas1 commented 1 year ago

I think we need to do something different. We would sample currents with inserted adc, and read anything else separately. We don't need battery voltage 16000 times a second, it will only delay the current sampling and foc calculation.

RoboDurden commented 1 year ago

google gives me no hits for "inserted adc" or "inserted adc stm32" or "inserted adc gd32" :-/ That Gen2-FOC-port you host also adds the low-side currents to the one adc_buffer:


typedef struct
{
  uint16_t v_batt;  
  uint16_t current_dc;
  uint16_t phase_a;  
  uint16_t phase_b;
  uint16_t mcu_temp;
} adc_buf_t;

https://github.com/Candas1/Hoverboard-Firmware-Hack-Gen2/blob/master/HoverBoardGigaDevice/Inc/defines.h#L49C1-L57C1

Here the GD32 init code: https://github.com/Candas1/Hoverboard-Firmware-Hack-Gen2/blob/master/HoverBoardGigaDevice/Src/setup.c#L368

With simpleFOC i think, the current_sense is synced to the motor which is simply triggered in the loop() :-( What might work is to not motor.linkCurrentSense(&current_sense); at all and instead do the motor.loopFOC(); from our LowsideCurrentSenseimplementation when the ADC scan sequence is finished and DMA_Channel0_IRQHandler(void) gets called :-)

That would be my approach without knowing what "inserted adc" is.

robcazzaro commented 1 year ago

@RoboDurden Gigadevice calls it inserted ADC, STM calls it injected ADC. It's a way to have multiple channels start ADC when an event (e.g. a timer) triggers it, without software assistance. SImpleFOC uses it to ensure that the current sensing is always time-aligned with the PWM. Adding interrupt servicing to the code will make it non deterministic (compared to a purely HW-based solution like injected ADC).

Here's a relevant snippet from the SimpleFOC forum


So at the moment, to avoid potential problematic situations due to the ADC conversion lasting to long and entering the high-side activation of certain mosfets we suggest using the PWM frequencies that do not exceed 20kHz. This in most cases results in behavior.

I'll leave it to you two to decide the best way to structure the high level code, you both have a lot more experience with motor control. I'll be ready to help on anything once the main structure is in place.

Candas1 commented 1 year ago

Injected adc is also what the simpleFOC current sense driver is using. I have experience with injected adc and know what to do, so please leave it to me, I will check when I am back. I think there is a lot more in the to-do list.

And let's discuss adc in a new thread.

Are we done with this thread? Now we are able to go as fast as gen2.X, but not at maximum target value?

RoboDurden commented 1 year ago

max speed like Gen2 is reached but at the cost of 850-900 mA instead of the Gen2.x with less than 500 mA :-/ If the Gen2 code has a motor efficiency of 80%, then with simpleFOC we have 42% :-( I wonder how max speed (with no load) can be reached with such a low motor efficiency. But when moving from the NiklasFauth (that Gen2 is based on) to EFeruFOC it was like having new motors in my solar vehicle: less sound, more torque, less current (indeed i never tested for max speed). So maybe Gen2 is below 80% efficieny.

So no, we are not really able to go as fast as gen2.

P.S. i deleted my original Split_Hoverboard_SimpleFOC, did fork your version here, cloned my fork to Github-Desktop, made some code improvements and added current_sensor code and now Github-Desktop hangs and i can not push my changes to my fork (in order to make a pull request).

I guess deleting my original repo to replace it with your fork of my original now causes an infinite loop :-( Ideas welcome :-)

Candas1 commented 1 year ago

I am still betting on sensor alignment. Current sensing/foc won't solve this issue. Current sensing/foc won't be beneficial without the precise angle the next release will bring with sensor smoothing.

RoboDurden commented 1 year ago

I wrongly thought that current sensing somehow replaces the hall sensor angle and gives more precise angle ? But the lowside current sensing will be worth nothing without a precise angle derived from a hall sensor smoothing ?

With the current HallSensor and it's three channel callbacks, the hall position is immediately updated i guess - but the pwm calculating motor.loopFOC(); is not called peridically but from every 112 ms up to 1122 ms :-( I fear that sensor alignment can not really work that way and we better sync Hall + Current + Pwm to one 16 kHz timer.

But you say that with hall sensor smoothing, the motor.loopFOC(); will get more precise angle and it only matters that the loopFOC is called often enough.

Then hall sensor alignment (to phase pwm) is not a problem because only one precise angle is important ? Good to hear that the simpleFOC community is already working on such a hall sensor smoothing.

But current-sensor to phase pwm alignment will be very important, so we could continue with issue #6

Candas1 commented 1 year ago

Without smoothing, simplefoc only has one angle per hall position. With smoothing, the angle will be extrapolated between 2 hall positions. FOC needs that angle, and phase currents.

Smoothing wasn't available before because simplefoc was mainly used with encoders, they are much more precise than hall sensors already. I think the hall sensor support in simplefoc is not very old, so there is for sure room for improvement and contributions.

Sensor alignment initialization doesn't seem very accurate so we need some manual finetuning. This is not happening in the loop, and just gives you a zero electrical angle offset.

RoboDurden commented 1 year ago

I still do not know what you mean with "sensor alignment" :-/ You mean "hall sensor alignment" ? But alignment to what ? As the motor.loopFOC() is not aligned to anything, we can not align the hall sensor changes to motor.loopFOC() -> setPhaseVoltage() -> driver->setPwm(Ua, Ub, Uc) And the offset of the three pwm do only need to be aligned to each others, not to the hall sensor changes. And the three phase pwm are outputs not sensors.

So what do you mean with "(hall) sensor alignment" ?

Candas1 commented 1 year ago

A motor is an electric magnet, using the 3 phases, you generate a 360 degrees magnetic angle. This angle doesn't match with the angle you get from hall sensors.

RoboDurden commented 1 year ago

So you say that the angle doesn't match (you call it "not aligned") because the hall sensor is discrete and we need a smoothed hall sensor output. Then such a precise angle will really describe the current motor position and the motor.loopFOC will work better. So you do not mean an alignment of digital rising/falling edges but simply a more precise motor position. That has nothing to do with timers or interrupts :-)

But then still i wonder why Gen2.x is doing it so much better ??? Gen2 is not even updating the hall inputs with rising edge interrupts but simply samples them and does not smooth them either.

Candas1 commented 1 year ago

Your question was about the alignment, the precise angle is another topic. Forget it, there is a lot of litterature about that, I can't summarize it all here.

The gen2.x firmware is only for hoverboards so it's already tuned.

RoboDurden commented 1 year ago

The gen2.x firmware is only for hoverboards so it's already tuned.

Sorry Candas, i don't want to spoil your holidays but that sounds nonsense to me. There is no tuning at all in the Gen2 firmware.

// Calculation-Routine for BLDC => calculates with 16kHz
void CalculateBLDC(void)
{
...
  // Read hall sensors  
  hall_a = gpio_input_bit_get(HALL_A_PORT, HALL_A_PIN); 
  hall_b = gpio_input_bit_get(HALL_B_PORT, HALL_B_PIN);
  hall_c = gpio_input_bit_get(HALL_C_PORT, HALL_C_PIN);

  // Determine current position based on hall sensors
  hall = hall_a * 1 + hall_b * 2 + hall_c * 4;
  pos = hall_to_pos[hall];
...
 // Update PWM channels based on position y(ellow), b(lue), g(reen)
  blockPWM(bldc_outputFilterPwm, pos, &y, &b, &g);

The only paramater (that could be fine tuned) i find is in config.h: #define DEAD_TIME 60 and is used to setup TIMER_BLDC, just like you did with the same 60 in drivers/.../gd32_mcu.h

Maybe it would be nice to have a FOCModulationType::BlockCommutation in BLDCMotor::setPhaseVoltage(float Uq, float Ud, float angle_el) that is a stupid simple as the blockPWM(int pwm, int pwmPos, int *y, int *b, int *g) of Gen2 Because for that the inacurate motor angle from the 3 hall sensors would matter just as less as it does for Gen2

I guess your
motor.foc_modulation = FOCModulationType::SinePWM; // choose FOC modulation (optional) is very sensitive to the

float HallSensor::getMechanicalAngle() { return ((float)((electric_rotations 6 + electric_sector) % cpr) / (float)cpr) _2PI ; }

and we need some smoothing there before SinePWM makes any sense. As i said a long time ago, i could quickly add a quadruatic interpolation one liner to that function.

RoboDurden commented 1 year ago

btw, i introduced the BLDC_POLE_PAIRS 15 only after a short google search: "Standard 6.5 inch hoverboard hub motors have 30 permanent magnet poles, and thus 15 pole pairs.". Hope that is correct.

RoboDurden commented 1 year ago

Okay i quickly added a linear interpolation which should also have an effect (on max speed) if sensor smooth would be the problem - yes i see a higher max speed

4° coarse angle: 459 Hz / 471 Hz with linear interpolation 477 Hz / 486 Hz

That is a 3.5% increase in max speed. Current consumption is also about 50m less and quite like the Gen2. But Gen2 reaches 547 Hz at the same current consumption.

199: 538        angle: -16.48   speed: 0.00     51325: -0.07
215: 501        angle: -15.50   speed: 5.36     13013: 0.07
200: 1215       angle: -10.40   speed: 0.00     4477: 0.07
216: 497        angle: -1.12    speed: 23.55    2965: 0.07
215: 1215       angle: 12.01    speed: 31.04    2249: 0.07
213: 543        angle: 28.00    speed: 30.01    2326: 0.07
227: 526        angle: 44.12    speed: 33.99    2054: 0.07
227: 547        angle: 60.18    speed: 31.40    2223: 0.07
228: 545        angle: 76.24    speed: 32.53    2146: 0.07
228: 1228       angle: 92.29    speed: 34.16    2044: 0.07
227: 549        angle: 108.49   speed: 35.21    1983: 0.07
229: 1226       angle: 124.41   speed: 30.82    2265: 0.07
228: 562        angle: 140.60   speed: 35.05    1992: 0.07
227: 544        angle: 156.66   speed: 32.44    2152: 0.07
228: 1228       angle: 172.86   speed: 31.25    2234: 0.07
226: 550        angle: 188.91   speed: 33.06    2112: 0.07
216: 540        angle: 204.90   speed: 29.80    2343: 0.07
215: 532        angle: 218.59   speed: 0.00     2732: 0.07
201: 1217       angle: 228.57   speed: 15.07    4634: 0.07
216: 532        angle: 234.36   speed: 7.86     8882: 0.07
201: 538        angle: 235.83   speed: 0.00     41090: 0.07
215: 486        angle: 234.92   speed: -6.10    11438: -0.07
217: 538        angle: 229.76   speed: -15.66   4459: -0.07
213: 1228       angle: 220.40   speed: -21.47   3252: -0.07
217: 1228       angle: 207.21   speed: -28.45   2454: -0.07

The first numbers of the last colum are the micro seconds between each change of the hall sensors. Even at max speed, there is a 2 ms gap that gets filled with my linear interpolation. And as the motor speed changes slowly, the linear inerpolation should already give a far more precise angle then the coarse 0.07 deg steps.

The 0.07 degree is 2 * pi / (4°/360°). and as our hoverboard motors generate 90 ticks per revolution, that is 4° per hallsensor callback :-)

Here my simple linear interpolation:


float HallSensor::getMechanicalAngle() 
{
  long iTimeSinceLastUpdate = _micros() - aiTimeLastUpdate[0];
  float fGradient1 = (float)(aiAngleLastUpdate[0]-aiAngleLastUpdate[1]) / (aiTimeLastUpdate[0]-aiTimeLastUpdate[1]);
  long fAngleSteps = aiAngleLastUpdate[0] + (long)(fGradient1 * iTimeSinceLastUpdate);
  return ((float)(fAngleSteps % cpr) / (float)cpr) * _2PI ;

  return ((float)((electric_rotations * 6 + electric_sector) % cpr) / (float)cpr) * _2PI ;
}

Maybe i could now "fine tune" by adding a 0.00x angle offset so the motor.loopFOC() -> setPhaseVoltage() -> driver->setPwm(Ua, Ub, Uc) sees a slightly advanced motor angle.. But i think their FOC algorithms should be exact and not need such a tweaking

I am not that eager to advance from my linear interpolation to a more general https://de.wikipedia.org/wiki/Spline-Interpolation

Candas1 / Split_Hoverboard_SimpleFOC

motor does not reach max speed like old Gen2.x binary #5