hzeller / rpi-rgb-led-matrix

Controlling up to three chains of 64x64, 32x32, 16x32 or similar RGB LED displays using Raspberry Pi GPIO
GNU General Public License v2.0
3.67k stars 1.17k forks source link

occasional OverflowError with SetImage from Python #277

Closed partofthething closed 7 years ago

partofthething commented 7 years ago

Hi. I'm having an amazing time with this library and have really brought my screen to life (really, I have pet Giraffes on it now). Anyway, I was adding a new scene to my library of scenes from Python that is intended to just show a static image. It's in an animation loop (I may have it move later) so it's all contained in a double-buffered loop like some demos. The relevant code is roughly:

class FullImage(Scene):
    """Full screen bitmap image."""
    def __init__(self, disp, fname):
        Scene.__init__(self, disp)
        self.image = Image.open(fname)
        self.image.thumbnail((self.disp.matrix.width, self.disp.matrix.height), Image.ANTIALIAS)
        self.image = self.image.convert('RGB')

    def draw_frame(self):
        self.disp.canvas.Clear()
        self.disp.canvas.SetImage(self.image)
        self.disp.canvas = self.disp.matrix.SwapOnVSync(self.disp.canvas)

Now, half the time, this works and the image shows nicely. But the other half, the whole thing crashes and I get this traceback:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/pi/led-infopanel/rgbinfopanel/__main__.py", line 4, in <module>
    display.run()
  File "/home/pi/led-infopanel/rgbinfopanel/display.py", line 188, in run
    display.run()  # main thread
  File "/home/pi/led-infopanel/rgbinfopanel/display.py", line 50, in run
    self.draw_frame()
  File "/home/pi/led-infopanel/rgbinfopanel/display.py", line 72, in draw_frame
    self.scene.draw_frame()
  File "/home/pi/led-infopanel/rgbinfopanel/scenes.py", line 49, in draw_frame
    self.disp.canvas.SetImage(self.image)
  File "rgbmatrix/core.pyx", line 22, in rgbmatrix.core.Canvas.SetImage (rgbmatrix/core.cpp:1605)
  File "rgbmatrix/core.pyx", line 46, in rgbmatrix.core.Canvas.SetPixelsPillow (rgbmatrix/core.cpp:2339)
OverflowError: can't convert negative value to size_t

Not sure if it matters, but this is a multithreaded application with an MQTT server to pull down live info for display from a server.

hzeller commented 7 years ago

What happens if you print the size of the picture ?

As long as you make sure that setting the image only happens in one thread and never concurrently with multiple threads, you should be fine.

Also paging @chmullig , who provided the fast image implementation, to have a look (I only have limited Python experience).

hzeller commented 7 years ago

What is self.disp.canvas, did you initialize it with the result of a call to self.disp.matrix.CreateFrameCanvas() before outside of the provided code-snippet when setting up the matrix ?

(I suspect so, because otherwise Clear() would've failed before. Just double-checking)

partofthething commented 7 years ago

Yes, that is initialized within the disp class with self.canvas = self.matrix.CreateFrameCanvas(). It totally works sometimes but not others. If I just run the code twice, it works once and then crashes the second time. All the other scenes work fine, it's just the SetImage that exhibits this randomness.

Printing the size seems right: (64x32).

You know what, this is probably my fault. I was trying to disable the frame refreshing during a solid image and may have been calling SetImage in some cases before calling Clear(). It worked sometimes because my application randomly chooses scenes, and if it started with the image it was fine (because it was Cleared) but if it came to the image later it crashed. I misinterpreted this as random behavior of the library but now that I'm ensuring the clear happens, it seems pretty solid.

Yeah, and I confirmed this by forgetting to clear again. Same error. So I guess this is solved as invalid.

This library is so sweet though. I'm still super excited even though I did a false bug report. Have a good one!

partofthething commented 7 years ago

Hmmm, it may not be so simple. I'm still somewhat randomly seeming to be getting this error. Maybe I have to clear both of the double buffers?

chmullig commented 7 years ago

Sorry to hear its crashing, I'll take a look, but it might be a bit before I can really dive in.

Can you confirm which version of PIL/Pillow you have installed? Looks like python 2.7? I assume you didn't rerun the cython build, right?

For now there's an optional argument to disable the fast set image, so you should be able to get the old slow/safe behavior that way.

hzeller commented 7 years ago

If it is an intermittent problem it very much sounds like a threading issue. Did you make absolutely sure that you only access operations on a particular canvas (Clear(), SetPixel(), SetImage()) in one thread at a time ? (in contrast, access to SwapOnVSync() should be safe doing in parallel).

It sounds like you have some independently running scenes, so if you run these in multiple threads sharing one canvas variable (because it looks like you have them in the 'disp' struct which seems to be shared between Scenes?), you're in trouble.

However, if each scene owns their own canvas (separately created with CreateFrameCanvas() - you can have an arbitrary number of off-screen canvases. So create a private canvas in each Scene constructor for its own use), then you should be fine, as you only manipulate a canvas that nobody else has access to in parallel (the return-value of SwapOnVSync() will then still essentially swap them around different Scenes, but that is safe because it internally is protected by a mutex on the c++ side).

chmullig commented 7 years ago

I've tried to replicate this in Raspbian on a Raspberry Pi 3, including doing some work in other threads (fetching data, generating other images with PIL, etc). I wasn't able to replicate. Could you post more details on your setup, and if possible source code to replicate?

partofthething commented 7 years ago

It seems to have settled down with my code. I have not had a crash since my previous comment. I must have been doing something wrong. Sorry for the trouble.

Full code is now at https://github.com/partofthething/infopanel in case you want to see exactly what I was doing.