repetier / Repetier-Firmware

Firmware for Arduino based RepRap 3D printer.
STM32F4/RUMBA32 Performance optimizations #1000

Closed by AbsoluteCatalyst 4 years ago

AbsoluteCatalyst commented 4 years ago

I mirrored the raw timer IRQHandler performance tweaks I did for the STM32F1 onto the STM32F4. This cuts down the time spent processing timer IRQs by skipping all the heavy STM HAL layers and HardwareTimer. I do this by renaming the original IRQs with a custom CMSIS startup file, then using them in our HAL.cpp.
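To illustrate the general idea of a raw handler (not the actual code from this change), here is a minimal host-side sketch. Everything in it is an assumption made for illustration: `MockTimer` stands in for the CMSIS timer register block, and `stepperTimerRawHandler` and `doStep` are hypothetical names. On real hardware the handler would be reached directly from the vector table instead of going through the HAL dispatch and HardwareTimer callbacks.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for the one timer register the handler touches.
// On a real STM32 this would be the CMSIS timer register block.
struct MockTimer {
    volatile uint32_t SR; // status register
};

// The update interrupt flag is bit 0 of the status register on STM32 timers.
constexpr uint32_t UPDATE_FLAG = 1u << 0;

static MockTimer stepperTimer{0};
static uint32_t stepCount = 0;

// Hypothetical stand-in for the firmware's stepper routine.
inline void doStep() { ++stepCount; }

// Hypothetical raw handler: one flag check, one flag clear, one call.
extern "C" void stepperTimerRawHandler() {
    if (stepperTimer.SR & UPDATE_FLAG) {
        // On real STM32 timers the status bits are cleared by writing 0,
        // and writing 1 leaves a bit unchanged. In this mock the write
        // simply overwrites the whole value.
        stepperTimer.SR = ~UPDATE_FLAG;
        doStep();
    }
}

// Simulates the hardware raising the update event, then runs the handler.
inline void simulateTimerOverflow() {
    stepperTimer.SR = UPDATE_FLAG;
    stepperTimerRawHandler();
    assert((stepperTimer.SR & UPDATE_FLAG) == 0); // flag must be cleared
}
```

Calling `simulateTimerOverflow()` once increments `stepCount` by one. Calling the handler again without a new event does nothing, since the flag has already been cleared.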

Slightly experimental, as I don't actually own a RUMBA32 but a Nucleo-f466re. I examined and debugged it with its ST-Link and my Saleae though, and everything appears to be in working order.

Using the DeltaTower Rumba32 configuration as a base and moving the X, Y, Z axes at maximum speed:

Previous IRQ exec time: 2.81us @ 192KHz stepper frequency
New IRQ exec time: 796.14ns @ 600KHz stepper frequency

(Duty cycle/CPU usage @ 600KHz matches up with the previous 192KHz.)
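As a rough check of the duty-cycle remark, the share of CPU time spent in the interrupt can be estimated as execution time multiplied by interrupt frequency. The sketch below uses only the figures quoted above. The helper name `irqLoad` is made up for this example, and the estimate ignores interrupt entry/exit overhead.

```cpp
#include <cassert>

// Estimated fraction of CPU time spent inside the stepper interrupt:
// (seconds per interrupt) * (interrupts per second).
inline double irqLoad(double execTimeNs, double frequencyKHz) {
    assert(execTimeNs >= 0.0 && frequencyKHz >= 0.0);
    return (execTimeNs * 1e-9) * (frequencyKHz * 1e3);
}

// Figures quoted in the comment above.
inline double previousLoad() { return irqLoad(2810.0, 192.0); } // 2.81us @ 192KHz
inline double newLoad()      { return irqLoad(796.14, 600.0); } // 796.14ns @ 600KHz
inline double newLoadAt400() { return irqLoad(796.14, 400.0); } // same handler @ 400KHz
```

With these numbers the estimate comes out at roughly 54% for the previous handler at 192KHz, 48% for the new one at 600KHz, and 32% at 400KHz.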

600KHz is probably a bit unnecessary, 400KHz might be good enough for general purpose.

This unfortunately makes the STM32F4 HAL just slightly more complicated. Though I think it's worth it for the gain. Even at lower speeds, there's more cpu time for other tasks.

Important bits:

Try it out when you get a chance! :) (Or anyone else with a RUMBA32 and fast feedrates for that matter.)

repetier commented 4 years ago

I did not think that the STM handling was wasting so much time. With your timings, that means about 2000ns spent on extra checks etc. That will surely give a good performance gain. And I don't think the limit previously was 200kHz. The real limit was hit when homing, since the extra tests for end stops cost additional time. But now we have plenty of headroom for any printer with that processor.

One problem is that you include new .c files in boards. It works now because it is the only one added to boards/STM32F4, but I expect to need to add a new board soon, and then I'd have two sets of the same variant files. Of course I can exclude one board when the other is used, but with an increasing number of boards this will get unwieldy.

So what do you think about making a new folder boardfiles and moving all the files with variants etc. there, one dir per board, and adding that to src_filter as well? That way there is no need to fix all existing boards when a new one gets added.

repetier commented 4 years ago

Oh, I forgot: the delta in the sample definition is the printer I have put the RUMBA32 on, so I will test how it works. The only problem is that it is currently not well calibrated, so printing needs some adjustments. But moving and heating should work fine.

AbsoluteCatalyst commented 4 years ago

Yeah, I was a little surprised too when I found out how much time those STM HAL handlers were taking... they were doing a massive number of non-inlined function calls and checks last I went through it.

> So what do you think about making a new folder boardfiles and moving all the files with variants etc.

This would be a really good idea! I did find my variant files setup a little messy! I think I might not actually need to have PinNamesVar.h either.

repetier commented 4 years ago

Ok, I will then move these files and adjust platformio.ini to include them correctly.

AbsoluteCatalyst commented 4 years ago

PS. Do you have an updated Todo list for V2 somewhere? (besides the V2 readme)

repetier commented 4 years ago

Ok, I have moved the files and renamed .s to .S, because otherwise I get an undefined -x option error. Maybe it is no problem under Windows, but on Mac, and I think Linux as well, it seems to be a problem.

No, I have no written todo, and the readme might be a bit outdated. I need to recheck its contents.

Currently on my todo is:

That is more or less what I need to achieve to make it officially a stable version. And of course I want a cool interactive config tool, or users will ask too many support questions. I am thinking of making each module a block with inputs you need to satisfy, so you can insert a block only after its inputs are defined, with some predefined structure based on good defaults.

AbsoluteCatalyst commented 4 years ago

Weird, the normal CMSIS files in the library are .s too. It compiles fine with .S on Windows though, so I guess no issue here.

Aye, the readme is a little outdated.

I think RAMPS might basically be the last/only 8-bit board people still use at this point tbh. (Edit: forgot about the original RUMBA too.) Amazing that RFW V2 might fully fit in its flash even with so many features turned on. E3 mini had the same flash size. (Trying to get BTT's Marlin fork to fit with barely anything turned on was impossible for whatever reason. Haha.)

I do wonder how it'll perform at such a low CPU speed though.

By interactive config tools you mean your online configurator? The module system has worked really well for us in V2. Something similar as you mention on the configurator would be pretty neat. (BTW thanks for clearing up the main configuration.h file)