pixelmatix / SmartMatrix

SmartMatrix Library for Teensy 3, Teensy 4, and ESP32
http://docs.pixelmatix.com/SmartMatrix

Major Refactoring - SmartMatrix 3.x #25

Closed embedded-creations closed 9 years ago

embedded-creations commented 9 years ago

https://github.com/pixelmatix/SmartMatrix/tree/sm3.0

I'm working on a major revision to the SmartMatrix Library that simplifies the SmartMatrix Class into just refreshing the display, and moves the foreground and background layers into separate Layer classes. Instead of using a unique header file to describe the hardware, the configuration is set in the user's sketch, so you can have one version of the library and use it with different sizes of displays. You can create your own Layer outside of the library - for example I made a Fadecandy Layer class that I used to convert the buffers from a port of Fadecandy to SmartMatrix - and you can make your own stackup of layers, with multiple Foreground layers for example.
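The split described above (a refresh class that only walks a stack of layers, with each layer responsible for its own pixels) can be sketched roughly as follows. This is a desktop C++ illustration only: `Rgb`, `Layer`, `SolidLayer`, `DotLayer`, and `Refresher` are hypothetical names, not the classes in the sm3.0 branch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// All names here are hypothetical; this is not the SmartMatrix API.
struct Rgb { uint8_t r, g, b; };

// A layer draws its contribution to one row when the refresh code asks for it.
class Layer {
public:
    virtual ~Layer() = default;
    virtual void fillRow(uint16_t row, Rgb* out, uint16_t width) = 0;
};

// Opaque background: overwrites every pixel in the row.
class SolidLayer : public Layer {
public:
    explicit SolidLayer(Rgb c) : color(c) {}
    void fillRow(uint16_t /*row*/, Rgb* out, uint16_t width) override {
        for (uint16_t x = 0; x < width; ++x) out[x] = color;
    }
private:
    Rgb color;
};

// Foreground: draws a single pixel on top of whatever is already in the row.
class DotLayer : public Layer {
public:
    DotLayer(uint16_t x, uint16_t y, Rgb c) : px(x), py(y), color(c) {}
    void fillRow(uint16_t row, Rgb* out, uint16_t width) override {
        if (row == py && px < width) out[px] = color;
    }
private:
    uint16_t px, py;
    Rgb color;
};

// The refresh class knows nothing about drawing; it walks the stack bottom-up.
class Refresher {
public:
    static constexpr size_t kMaxLayers = 4;
    void addLayer(Layer* layer) {
        assert(layer != nullptr && count < kMaxLayers);
        layers[count++] = layer;
    }
    void composeRow(uint16_t row, Rgb* out, uint16_t width) const {
        for (size_t i = 0; i < count; ++i) layers[i]->fillRow(row, out, width);
    }
private:
    Layer* layers[kMaxLayers] = {};
    size_t count = 0;
};
```

Adding a second foreground layer is then just another `addLayer()` call, which is the kind of custom stackup the post describes.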

This requires a lot more code to set up the classes in the sketch, which I was trying to minimize with the initial release of the library, but I think this more flexible approach supports more use cases. The code is hidden away by using a couple #defines that support the most common use cases, and we can add more. I chose to declare buffers in the sketch instead of using malloc in the classes as I wanted the large buffers to show up in memory usage at the end of compilation, and there's no malloc support for creating DMAMEM buffers.
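A minimal sketch of the "declare the buffers in the sketch, hide the boilerplate behind a #define" idea, assuming a made-up `FrameStore` class and `ALLOCATE_FRAME_STORE` macro (the library's real macros take different arguments). Because the storage is a static array, its size is fixed at compile time and appears in the memory usage report, which is not true of a malloc'd buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in: the class does not own memory, it is handed
// a statically declared buffer by the sketch.
template <uint16_t W, uint16_t H>
class FrameStore {
public:
    explicit FrameStore(uint8_t* storage) : buf(storage) {}
    static constexpr size_t bytes() { return size_t(W) * H * 3; }
    void fill(uint8_t v) {
        for (size_t i = 0; i < bytes(); ++i) buf[i] = v;
    }
    uint8_t& at(uint16_t x, uint16_t y, uint8_t channel) {
        assert(x < W && y < H && channel < 3);
        return buf[(size_t(y) * W + x) * 3 + channel];
    }
private:
    uint8_t* buf;
};

// Hypothetical macro: declares the static storage and the object in one line.
// On a Teensy the array could additionally be tagged DMAMEM.
#define ALLOCATE_FRAME_STORE(name, w, h)                 \
    static uint8_t name##Storage[size_t(w) * (h) * 3];   \
    static FrameStore<(w), (h)> name(name##Storage)

// What the user's sketch would contain: one line per buffer.
ALLOCATE_FRAME_STORE(frame, 32, 32);
```

Changing the panel size means changing the two numbers in the sketch; the class itself has no dependency on a global width or height define.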

After separating into classes, refactoring to be more generic (no dependencies on MATRIX_WIDTH/HEIGHT defines) and testing on both 16x32 and 32x32 panels, I found the classes worked with 32x64 with barely any changes. I didn't get a chance to use any code from the >32x32 size pull requests from @ncortot, @GaryBoone, and @mrwastl, though maybe I will when adding support for C-shape chaining.

I have the library in a decent state, and FeatureDemo has been tested on 16x32, 32x32, and 32x64 panels. There's an occasional flicker on the 32x64 that I haven't tracked down yet. Only the FeatureDemo example is updated to use the new library, though it should be easy to update the rest (except for FastLED_Controller, which will need an update to the FastLED Library with SmartMatrix 3.x support).

It's probably a month or more away from an official release as I'm going to be traveling without a laptop for a couple weeks, and want to add support for more display configurations (especially C-shape paneling), to do more testing, and make some efficiency improvements after I get back.

If there are any other features you'd like to see in the next major release, let's start discussing.

Not yet supported:

mrwastl commented 9 years ago

@embedded-creations the only small thing that comes to mind: scrolling at rotation 90 / 270 isn't very smooth (don't know how to describe it: it looks like 'stepping', or a little bit like tearing).

it's noticeable but i wouldn't count it as a 'showstopper bug'.

if you're done with defining the API i will try to add different colour spaces (16bit and 8bit) - if i get used to the template-stuff ... and also port the extended spectrum analyzer and my teensy-rtc based MatrixClock3 to SM3.0 (don't know yet if i should create a separate git-project for each of these or a tools-project that includes all these sketches)

embedded-creations commented 9 years ago

@mrwastl Yes, I'm done defining the API.

Now that templates are in place, I hope adding more color spaces will be pretty easy, though I may be overlooking something. MatrixCommon.h has the definitions for color spaces. Each new color space needs a struct, operators to convert to other color spaces, and a colorCorrection function. The color correction in SMLayerBackground::fillRefreshRow may cause some trouble. Consider replacing lookups in backgroundColorCorrectionLUT with one of the static lightPowerMap tables for now, and I will think more about how to improve background color correction in a future release so it can be used with other color spaces.
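As a rough illustration of the three pieces listed above (a struct, conversion operators, and a colorCorrection function), here is a sketch of a packed 16-bit 5-6-5 color space. `Rgb24` and `Rgb565` are hypothetical types, the `colorCorrection` signature is an assumption, and the squared curve is only a placeholder for the library's lookup tables; the real definitions in MatrixCommon.h will differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 24-bit type standing in for the library's own.
struct Rgb24 { uint8_t red, green, blue; };

// Hypothetical packed 16-bit (5-6-5) color space.
struct Rgb565 {
    uint16_t value;

    Rgb565() : value(0) {}

    // Conversion from 24-bit: keep the top 5/6/5 bits of each channel.
    explicit Rgb565(const Rgb24& c)
        : value(uint16_t(((c.red & 0xF8) << 8) |
                         ((c.green & 0xFC) << 3) |
                         (c.blue >> 3))) {}

    // Conversion to 24-bit: replicate the high bits into the low bits
    // so that full scale maps back to 255.
    operator Rgb24() const {
        const uint8_t r5 = (value >> 11) & 0x1F;
        const uint8_t g6 = (value >> 5) & 0x3F;
        const uint8_t b5 = value & 0x1F;
        return Rgb24{uint8_t((r5 << 3) | (r5 >> 2)),
                     uint8_t((g6 << 2) | (g6 >> 4)),
                     uint8_t((b5 << 3) | (b5 >> 2))};
    }
};

// Placeholder correction curve (v * v / 255); an assumption made for
// this sketch, not the library's correction tables.
inline Rgb24 colorCorrection(const Rgb565& in) {
    const Rgb24 c = in;
    auto curve = [](uint8_t v) { return uint8_t((uint16_t(v) * v) / 255); };
    return Rgb24{curve(c.red), curve(c.green), curve(c.blue)};
}
```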

If you would like the extended spectrum analyzer and teensy-rtc based MatrixClock sketches to be included in a future SmartMatrix release as replacements/improvements to the existing examples, then you could work on them as branches of SmartMatrix. If not, separate repos would probably be better than one repo.

embedded-creations commented 9 years ago

@mrwastl I unfortunately can't reproduce the drawHardware* crashing. I did see a crash during DEMO_DRAWING_INTRO but I'm not sure what caused that as so many drawing methods are tested there. Can you upload a GIST of your FeatureDemo.ino that is crashing? If possible, just enable one or a few of the demos that are causing crashing so it's quick to reproduce.

I reviewed all the code related to fillScreen, drawFast*Line and drawHardware and only found one mistake: fillScreen tries to fill the size of the screen plus one line on each side, but drawFastHLine limits the size to the actual screen coordinates, so that's not a source of crashing.
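The reason the oversized fillScreen request is harmless can be shown with a small clamping helper. This is a hypothetical function written for illustration (`fillSpanClamped` is not a library method): a span that starts one pixel before the screen and ends one pixel after it is trimmed to the valid range before any memory is written, and rows outside the screen are skipped entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: write value v to the horizontal span [x0, x1] on row y,
// clamped to a width x height buffer. Returns the number of pixels written.
int fillSpanClamped(uint8_t* buf, int width, int height,
                    int x0, int x1, int y, uint8_t v) {
    assert(buf != nullptr && width > 0 && height > 0);
    if (x0 > x1) std::swap(x0, x1);
    // Reject spans that lie entirely off-screen.
    if (y < 0 || y >= height || x1 < 0 || x0 >= width) return 0;
    // Trim the span to the visible columns.
    x0 = std::max(x0, 0);
    x1 = std::min(x1, width - 1);
    for (int x = x0; x <= x1; ++x) buf[y * width + x] = v;
    return x1 - x0 + 1;
}
```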

embedded-creations commented 9 years ago

@mrwastl

rotation 90 / 270 scrolling doesn't go very smooth

I didn't see any obvious bugs when I changed featureDemo to run at 90/270 rotation, but the text did look like it was jumping a bit.

I think this is related to the movement of the scrolling and the multiplexing of the panels. As the text is moving, the scanlines are also moving in either the same or opposite direction (depending on 90/270). At the middle of each panel, the top half and bottom half meet. Row 15 is refreshed at the end of the frame, row 16 is refreshed at the beginning of the frame, so text may have moved in the very short time between row 15 being visible and row 16 being visible. There's not much I can do about this. I tried randomizing or using a pattern to go through the rows instead of just scanning from top to bottom, but this looked much worse, especially when moving your head or at lower refresh rates.

mrwastl commented 9 years ago

@embedded-creations jumping text: yes, that's what i meant. only a 'visual' flaw, nothing more.

crashes: now THAT was a WTF-experience:

i played around with my fillScreen-method and narrowed the crash to this (with kRefreshDepth = 24):

void VSSDCP_smartmatrix::fillScreen (uint32_t col) {
  col2RGB(col);
  //backgroundLayer.fillScreen(tempcol);
  backgroundLayer.fillRectangle(0, 0, width-1, height-5, tempcol);
  backgroundLayer.drawFastHLine(0, width/2+24, height-4, tempcol);
  backgroundLayer.drawFastHLine(width/2+26, width-1, height-4, tempcol);
  backgroundLayer.fillRectangle(0, height-3, width-1, height-1, tempcol);
  // backgroundLayer.drawPixel(width/2+25, height-4, tempcol);   // enabling this one: crash
}

(= the whole screen is filled except one pixel)

kRefreshDepth = 36 seems to crash on a different position.

that was a real big WTF!

so i scanned through my protocol library again: there is one malloc (it allocates 256 bytes) that assigns an array to a member variable that is defined in the super-class. so i added a static array to the class and assigned this one to the super-class member var (to avoid malloc) -> no more crash.

memory shortage should not be a problem (around 12k still free). and the malloc was there for quite a long time ... but suddenly (from one version to another): crash. i don't get it. some DMA / malloc interference?

embedded-creations commented 9 years ago

@mrwastl the call to malloc is only made once? at what point in the program flow?

mrwastl commented 9 years ago

@embedded-creations yes, in the class-constructor of the super-class (VSSDCP_base -> VSSDCP_serial -> VSSDCP_smartmatrix; malloc was in the class-constructor of VSSDCP_serial).

update: to be more precise: the buffers / layers are 'allocated' outside of any class, directly in the main sketch (through the macros SMARTMATRIX_ALLOCATE_BUFFERS and SMARTMATRIX_ALLOCATE_BACKGROUND_LAYER). the order is therefore: layer/buffer-memory -> class-constructors -> initialisation of SmartMatrix -> matrix.begin()

embedded-creations commented 9 years ago

@mrwastl Unfortunately I don't have any insight here. It sounds like a bug in either the Teensy code, maybe in the linker script, or possibly GCC, but I think you'd want to have a simplified example to prove to someone that there's a problem that needs to be fixed. Maybe you could make a simple sketch that allocates some large DMAMEM buffers, fills them with a pattern, uses malloc, and shows that the pattern is changed after. I'm not sure what to suggest right now other than don't use malloc.
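The reproduction suggested above could look something like the following. It is written as plain C++ so it runs on a desktop; on a Teensy the static buffer would presumably carry the DMAMEM attribute and the check would run from setup(). The function names, buffer size, and pattern are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Large static buffer; on a Teensy this would be declared with DMAMEM.
static uint8_t bigBuffer[8192];

// Fill with a position-dependent pattern so any overwrite is detectable.
void fillPattern(uint8_t* buf, size_t n) {
    for (size_t i = 0; i < n; ++i) buf[i] = uint8_t(i * 31 + 7);
}

// Returns the index of the first byte that no longer matches, or n if intact.
size_t firstMismatch(const uint8_t* buf, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (buf[i] != uint8_t(i * 31 + 7)) return i;
    }
    return n;
}

// Fill the buffer, allocate and write to heap memory, then re-check the buffer.
bool patternSurvivesMalloc(size_t allocBytes) {
    fillPattern(bigBuffer, sizeof(bigBuffer));
    void* p = std::malloc(allocBytes);
    if (p != nullptr) std::memset(p, 0xAA, allocBytes);
    const bool intact = firstMismatch(bigBuffer, sizeof(bigBuffer)) == sizeof(bigBuffer);
    std::free(p);
    return intact;
}
```

If `patternSurvivesMalloc(256)` ever returned false on the target, the index from `firstMismatch()` would show where heap and static memory overlap.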

edit: changed "don't think" to "think"

embedded-creations commented 9 years ago

@mrwastl I'm currently tracking down crashing from fillScreen in the Bitmaps example at 64x64 resolution. It seems to crash when fillScreen gets to the second half of backgroundBuffer. If I replace the drawHardwareHLine direct buffer writes with calls to drawPixel, it still crashes. If I have fillScreen call memset() it doesn't crash. Not sure what's going on, it's hard to track down, but it's likely this is what's causing your crashes, not a malloc/DMA conflict.

embedded-creations commented 9 years ago

@mrwastl Traced it down to the rgb24 assignment operator. After modifying drawHardwareHLine to call drawPixel, and making this modification to drawPixel, fillScreen works without crashing:

    currentDrawBufferPtr[(hwy * this->matrixWidth) + hwx].red = color.red;
    currentDrawBufferPtr[(hwy * this->matrixWidth) + hwx].green = color.green;
    currentDrawBufferPtr[(hwy * this->matrixWidth) + hwx].blue = color.blue;

This original code crashes:

    currentDrawBufferPtr[(hwy * this->matrixWidth) + hwx] = color;

I'm guessing the assignment operator is stepping on other memory. Need to figure out how assignment operators work...

edit: not copy constructor, assignment operator

embedded-creations commented 9 years ago

@mrwastl Pushed the fix - I can't explain it but the implicit assignment operator for rgb24 seems to be stepping on other memory. Added my own definition and it works. Hopefully you can verify this fixes your issue with malloc.
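For readers unfamiliar with the idea, a user-defined assignment operator that copies exactly the three channel bytes might look like the sketch below. `Pixel24` and `storePixel` are hypothetical stand-ins; the operator actually committed for rgb24 is not shown in this thread and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 3-byte pixel type standing in for rgb24.
struct Pixel24 {
    uint8_t red, green, blue;

    Pixel24() : red(0), green(0), blue(0) {}
    Pixel24(uint8_t r, uint8_t g, uint8_t b) : red(r), green(g), blue(b) {}
    Pixel24(const Pixel24&) = default;

    // Explicit member-by-member copy: touches only this pixel's three bytes,
    // whatever the compiler would have generated for the implicit version.
    Pixel24& operator=(const Pixel24& col) {
        red = col.red;
        green = col.green;
        blue = col.blue;
        return *this;
    }
};

// Pixels are packed back to back, so a wider-than-3-byte store would
// spill into the neighboring pixel (or past the end of the buffer).
static_assert(sizeof(Pixel24) == 3, "Pixel24 is expected to be 3 bytes");

// Same shape as the drawPixel line quoted above, using the operator.
void storePixel(Pixel24* buf, uint16_t width, uint16_t x, uint16_t y,
                const Pixel24& color) {
    buf[size_t(y) * width + x] = color;
}
```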

mrwastl commented 9 years ago

@embedded-creations yep, this one did the trick. no more crash with fillScreen() and malloc.

mrwastl commented 9 years ago

@embedded-creations as you're polishing the code for release: one tiny typo: the comments for width and height seem to be swapped.

embedded-creations commented 9 years ago

@mrwastl Good catch, thanks!

gregfriedland commented 9 years ago

I haven't been following this discussion fully so apologies if the following question is obvious.

I'm trying to see how many frames per second I can get with a 64x32 array (with latest on sm3.0) and am trying various parameters. COLOR_DEPTH of 48 is slower than 24 which makes sense but changing kRefreshDepth from 24 to 48 seems to have no effect. Does that make sense and if so, then what's the advantage of the lower refresh depths?

On Thu, Sep 10, 2015 at 11:18 AM, Louis Beaudoin notifications@github.com wrote:

@mrwastl Good catch, thanks!


embedded-creations commented 9 years ago

@gregfriedland It's not obvious. The difference is CPU time (though it sounds like this is minor), and memory usage: 24 uses a smaller temp buffer than 36 and 48 and the buffers set by kDmaBufferRows scale with refreshDepth.

The more bits you have the dimmer the display at 100% brightness. This comment might explain more: https://github.com/pixelmatix/SmartMatrix/issues/25#issuecomment-137146116

gregfriedland commented 9 years ago

Those other differences make sense, although I'm curious why the cpu time isn't drastically different. I think you are probably using Bit Angle Modulation (is that right?), so shouldn't it be possible to refresh the display at refreshDepth = 24 bits twice as fast as refreshDepth = 48 bits?

embedded-creations commented 9 years ago

Yes, using Binary Code Modulation (BCM) (aka BAM), but the limitation on refresh rate is CPU time not bit depth. Each row is computed as needed, once per frame, and after a certain point the CPU is spending 100% of the time just preparing rows for refresh.

shouldn't it be possible to refresh the display at refreshDepth = 24 bits twice as fast as refreshDepth = 48 bits

SmartMatrix assigns a number of timer ticks per row based on the refresh rate and number of rows to refresh. It takes the row time and divides it up into the number of BCM bits, keeping in mind there's a minimum time to shift out data, so each bit has a minimum time. It's a little extra CPU time to prep 36-bit or 48-bit vs 24-bit, loading a little more data into the buffers, but the rest is taken care of by DMA with little effect on CPU.

Here's an example of refreshing, focused on a single row with 36-bit color. You can see 12 bits all updated sequentially. If it were 24-bit color, the time between the "1" and "2" flags would be the same, but there would only be 8 latches and each bit would be a bit wider. Sorry, this probably isn't a good explanation. I need to do a long writeup on how everything works; it's hard to explain in a single paragraph with one picture.

[Image: refresh of a single row with 36-bit color]
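The arithmetic in that explanation can be sketched as follows. This is an illustrative model only, assuming ideal binary weighting of the row time and a simple per-bit floor; the function names, the weighting formula, and the way the minimum is applied are assumptions, not the library's actual timing code. It does show why the row time (and so the refresh rate) is the same at 24-bit and 36-bit: the same number of ticks is just split into 8 or 12 slices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Timer ticks available for one row: fixed by refresh rate and row count,
// independent of refresh depth.
uint32_t ticksPerRow(uint32_t timerHz, uint32_t refreshHz, uint32_t rowsPerFrame) {
    assert(refreshHz > 0 && rowsPerFrame > 0);
    return timerHz / (refreshHz * rowsPerFrame);
}

// Split the row time across the BCM bits of one channel (refreshDepth / 3),
// each bit getting twice the time of the one below it, but never less than
// minTicks (the time needed to shift out one row of data).
// Note: with a large floor the slices can add up to more than rowTicks;
// how the real library resolves that is not covered by this sketch.
std::vector<uint32_t> bcmBitTicks(uint32_t rowTicks, unsigned bitsPerChannel,
                                  uint32_t minTicks) {
    assert(bitsPerChannel > 0 && bitsPerChannel <= 16);
    std::vector<uint32_t> out(bitsPerChannel);
    const uint64_t total = (1ull << bitsPerChannel) - 1;  // sum of all weights
    for (unsigned i = 0; i < bitsPerChannel; ++i) {
        const uint64_t ideal = (uint64_t(rowTicks) << i) / total;
        out[i] = uint32_t(std::max<uint64_t>(ideal, minTicks));
    }
    return out;
}
```

For example, `bcmBitTicks(rowTicks, 24 / 3, minTicks)` yields 8 slices and `bcmBitTicks(rowTicks, 36 / 3, minTicks)` yields 12 for the same `rowTicks`, matching the "8 latches" versus "12 bits" description above.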

embedded-creations commented 9 years ago

Beta version is released: https://github.com/pixelmatix/SmartMatrix/releases/tag/3.0-b1

I'm going to close this conversation as it's gotten extremely long, thanks for all the contributions, feedback, and testing!