dkogan / mrcal

Next-generation camera-modeling toolkit
http://mrcal.secretsauce.net
Apache License 2.0

mrcal-show-residuals and friends hang/don't show plots #13

Closed mcm001 closed 9 months ago

mcm001 commented 9 months ago

OS: Ubuntu 22.04
Compositor: X11
mrcal version: 2.3-1jammy1

Script output:

(venv) matt@photonvision:~/Documents/GitHub/photonvision/photon_calibration_Microsoft_LifeCam_HD-3000_800x600$ mrcal-show-residuals --histogram --set 'xrange [-2:2]' --unset key --binwidth 0.1 camera-0.cameramodel
Traceback (most recent call last):
  File "/usr/bin/mrcal-show-residuals", line 220, in <module>
    mrcal.show_residuals_histogram(optimization_inputs,
  File "/usr/lib/python3/dist-packages/mrcal/visualization.py", line 3166, in show_residuals_histogram
    plot.plot(*data_tuples)
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 2558, in plot
    plot_process_footer()
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 2391, in plot_process_footer
    self._checkpoint('printwarnings')
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 1718, in _checkpoint
    raise GnuplotlibError(
gnuplotlib.GnuplotlibError: Gnuplot process no longer responding. This shouldn't happen... Is your X connection working?

I've installed nothing beyond mrcal itself, as far as I know. Naively running the gnuplot command directly does work, though (screenshot attached).

I've attached a zip of my calibration data below as well. It's not great calibration data; I just wanted to try some Python stuff out, so the intrinsics don't really make sense. photon_calibration_Microsoft_LifeCam_HD-3000_800x600.zip

dkogan commented 9 months ago

Yes. Several people have reported this issue, which boils down to "gnuplotlib uses select(), which doesn't work on Windows". Nobody has offered to fix it, however, so it remains unfixed. If you care to look at it, I can't imagine it would be all that much work to fix. There's an attempt here: https://github.com/dkogan/gnuplotlib/pull/15 although I have no idea how "done" it is, or how sound the approach is.

mcm001 commented 9 months ago

Hi! I'll give that PR a shot, but just wanted to confirm you would expect this to happen on Ubuntu? This is on my laptop running Ubuntu 22.04 (native, not in a VM or WSL or anything)

dkogan commented 9 months ago

Oh. Yeah, I was definitely thinking about Windows, as you surmised. It should all work on machines with the Linux kernel. We can see what's going on. First, try a vanilla gnuplotlib program:

import numpy      as np
import gnuplotlib as gp

x = np.arange(101) - 50
gp.plot(x**2, wait=True)

Presumably this will fail in the same way. Then let's ask to see why it failed:

import numpy      as np
import gnuplotlib as gp

x = np.arange(101) - 50
gp.plot(x**2, wait=True, log=True)

This will make a communications log, and you'll be able to see why it failed. Post the log here, if the cause of the failure isn't clear.
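
Before reading through the log, it can help to rule out the basics that the error message hints at. The following is a minimal sketch, not part of gnuplotlib or mrcal: `preflight` is a hypothetical helper name, and it assumes an X11 session (as reported above), where gnuplot needs to be on `PATH` and `DISPLAY` needs to be set.

```python
import os
import shutil

def preflight(environ=os.environ, which=shutil.which):
    """Return a list of human-readable problems; empty if the basics look OK.

    Hypothetical helper for illustration only. `environ` and `which` are
    parameters so that the check can be exercised without touching the
    real system.
    """
    problems = []
    # gnuplotlib talks to a gnuplot child process, so the binary must be found
    if which("gnuplot") is None:
        problems.append("gnuplot executable not found on PATH")
    # Assumption: an X11 session, where interactive terminals need DISPLAY
    if not environ.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    return problems
```

Calling `preflight()` with no arguments checks the current environment. An empty list does not prove that plotting will work; it only rules out these two causes.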

mcm001 commented 9 months ago

Yep can confirm the first snippet fails in the same way:

Traceback (most recent call last):
  File "/home/matt/Documents/GitHub/photonvision/photon-lib/py/photonlibpy/gnuplottest.py", line 5, in <module>
    gp.plot(x**2, wait=True)
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 2660, in plot
    globalplot.plot(*curves)
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 2558, in plot
    plot_process_footer()
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 2391, in plot_process_footer
    self._checkpoint('printwarnings')
  File "/usr/lib/python3/dist-packages/gnuplotlib.py", line 1718, in _checkpoint
    raise GnuplotlibError(
gnuplotlib.GnuplotlibError: Gnuplot process no longer responding. This shouldn't happen... Is your X connection working?

And the output of the second snippet is attached as a file below. Nothing jumps out at me, though.

out.txt

dkogan commented 9 months ago

Thanks. I'm a bit mystified how your gnuplot test worked at all. If I combine all the single characters from your log (latest gnuplotlib release does this already) the punchline is:

/usr/lib/gnuplot/gnuplot_qt: symbol lookup error: /snap/core20/current/lib/x86_64-linux-gnu/libpthread.so.0: undefined symbol: __libc_pthread_init, version GLIBC_PRIVATE

So you have some snap that provides its own incompatible libpthread, or something. Strong recommendation is to get rid of snaps: uninstall snap*, kill the snapd daemon and delete /snap. You can also simply use Debian instead of Ubuntu, since Ubuntu quite literally is Debian + extra crap that nobody understands that sometimes breaks things.
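
One way a snap's libpthread could end up loaded into a non-snap gnuplot_qt is through environment variables (such as `LD_LIBRARY_PATH`) inherited from a snap-packaged parent process. That is a guess, not something established in this thread. A minimal sketch to check for it, run from the same Python session that fails, with `snap_env_vars` being a hypothetical helper name:

```python
import os

def snap_env_vars(environ=os.environ):
    """Return {name: value} for environment variables whose value mentions /snap/.

    Hypothetical helper for illustration; it only inspects the environment
    and does not prove which libraries the dynamic loader actually picks.
    """
    return {name: value for name, value in environ.items() if "/snap/" in value}

# Print any suspicious variables in the current process environment
for name, value in snap_env_vars().items():
    print(f"{name}={value}")
```

If this prints nothing, the environment is probably not the route by which the snap library is being picked up, and the cause lies elsewhere.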

If you want to debug a bit more deeply, you can probably reproduce this failure by trying your simple gnuplot test inside Python. What if you run

import os
os.system("gnuplot -p -e 'p sin(x)'")

mcm001 commented 9 months ago

If I run that, I get this output:

pmatt@photonvision:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("gnuplot -p -e 'p sin(x)'")
sh: 1: gnuplot: not found
32512
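
For reference, the 32512 above is the raw wait status that `os.system()` returns on Unix, not the exit code itself. A short sketch of decoding it, assuming Python 3.9 or later on a Unix system (for `os.waitstatus_to_exitcode`):

```python
import os

status = 32512  # value returned by os.system() in the session above

# Convert the raw wait status into the child's exit code (Python 3.9+, Unix)
exit_code = os.waitstatus_to_exitcode(status)
print(exit_code)  # 127
```

Exit code 127 is the shell's conventional "command not found" status, which matches the `sh: 1: gnuplot: not found` message.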

And yeah, I know Ubuntu snaps are bad. This is getting close to the last straw for Ubuntu for me; I'm close to jumping ship to Debian.

mcm001 commented 9 months ago

Oh I'm stupid! I'd uninstalled mrcal, which took gnuplot with it. With that fixed, I see a graph + this output:

matt@photonvision:~$ python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system("gnuplot -p -e 'p sin(x)'")
0
>>> 

mcm001 commented 9 months ago

After manually installing gnuplot with apt, and then mrcal with apt, I'm no longer able to recreate this issue? Happy to chalk this up to "ubuntu snaps are stupid, don't use ubuntu" for now. I'll reopen this if it happens again tho!

dkogan commented 9 months ago

Great. If this issue comes back, we can debug more deeply to pinpoint the problem, if you want to. In the meantime, I'm going to close this. Feel free to reopen.