texstudio-org / texstudio

TeXstudio is a fully featured LaTeX editor. Our goal is to make writing LaTeX documents as easy and comfortable as possible.
http://www.texstudio.org/
GNU General Public License v3.0
2.79k stars 344 forks source link

Frequent crashes with `Assertion 'numsize == 1 && n < d' failed` in libpoppler #1409

Closed astoeckel closed 3 years ago

astoeckel commented 3 years ago

Environment

Expected behavior

TexStudio should not crash.

Actual behavior

Since ~an update some time during the Fedora 32 cycle~ upgrading to my AMD Threadripper workstation due to WFH (I think about five to six months ago), TeXstudio started crashing frequently (about every five to fifteen minutes) with SIGABRT due to Assertion 'numsize == 1 && n < d' failed. Crashes are intermittent whenever an updated PDF is loaded in the integrated document viewer (e.g., after recompiling the document).

The crash is caused by passing an empty string into atof in line 349 of fofi/FoFiType1.cc in Poppler (fontMatrix[j] = atof(p);).

This is an upstream bug in Poppler.

~I guess that this is most likely an upstream bug, maybe caused by Poppler parsing an incomplete document. While this should be fixed upstream, maybe TeXstudio can work around this by only loading updated files into Poppler once the generated PDF file has been finalised (i.e., only once the build process has finished).~

How to reproduce

~Unclear.~ Cannot reproduce on my ancient laptop running Fedora 33. Cannot reproduce when running TeXstudio via X11 forwarding over LAN (which is how I predominantly use TeXstudio). In other words, the slower the system, the less likely this bug is being triggered. This suggests that this is a timing related bug. Crashes seem to be more frequent in more complex documents that include at least one matplotlib-exported PDF.

Could be the same bug as reported in TeXstudio SourceForge bug 2387 (https://sourceforge.net/p/texstudio/bugs/2387/)

Update:
See https://github.com/texstudio-org/texstudio/issues/1409#issuecomment-753247183 for a more reliable method for reproducing the bug. Reproducing the bug may require a high core count CPU.

Update 2: See https://github.com/astoeckel/texstudio_poppler_crash_bug/ for test cases.

Let me know if you need more specific information, I'll see what I can gather.

sunderme commented 3 years ago

txs has already mechanisms to avoid loading incomplete pdf. Without test case, there is nothing we can do.

astoeckel commented 3 years ago

Thank you for your response! I'll try to compile TeXstudio and libpoppler myself and will see whether I can reliably reproduce and/or fix the problem.

astoeckel commented 3 years ago

Quick update. First of all, I can reproduce the bug on my machine with the latest TeXstudio master branch (1b62ecb37a38eb9ac9e4e8e8bdbcafc65e7f8b73) and a locally compiled version of poppler (release 20.12.1).

Of course, this seems to be somewhat of a heisenbug: when running in gdb with debug information I rarely get crashes, without gdb I sometimes get crashes, even during the initial load of the PDF on startup.

The latter observation suggests that the bug has nothing to do with the compilation process, but rather is a race condition within the rendering thread itself.

Full backtrace:

texstudio: strtod_l.c:1497: ____strtod_l_internal: Assertion `numsize == 1 && n < d' failed.
--Type <RET> for more, q to quit, c to continue without paging--

Thread 2434 "PDFRenderEngine" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffece7fc640 (LWP 246097)]
0x00007ffff609b9d5 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff609b9d5 in raise () at /lib64/libc.so.6
#1  0x00007ffff60848a4 in abort () at /lib64/libc.so.6
#2  0x00007ffff6084789 in _nl_load_domain.cold () at /lib64/libc.so.6
#3  0x00007ffff6094026 in annobin_assert.c_end () at /lib64/libc.so.6
#4  0x00007ffff60a5bc7 in ____strtod_l_internal () at /lib64/libc.so.6
#5  0x00007ffff5d21bb9 in atof (__nptr=0x7fffcd7f8acc "\361\331\035\353\r") at /usr/include/bits/stdlib-float.h:27
#6  FoFiType1::parse() (this=0x7ffe343890c0) at /home/andreas/source/poppler-20.12.1/fofi/FoFiType1.cc:339
#7  0x00007ffff5d21f35 in FoFiType1::getName() (this=this@entry=0x7ffe343890c0) at /home/andreas/source/poppler-20.12.1/fofi/FoFiType1.cc:92
#8  0x00007ffff5d90faa in Gfx8BitFont::Gfx8BitFont(XRef*, char const*, Ref, GooString*, GfxFontType, Ref, Dict*)
    (this=0x7ffe340065f0, xref=<optimized out>, tagA=<optimized out>, idA=..., nameA=<optimized out>, typeA=<optimized out>, embFontIDA=..., fontDict=0x7ffe342f8010)
    at /home/andreas/source/poppler-20.12.1/poppler/GfxFont.cc:1157
#9  0x00007ffff5d94d9f in GfxFont::makeFont(XRef*, char const*, Ref, Dict*) (xref=xref@entry=0x7ffe34385020, tagA=0x7ffe3402a3b0 "F8", idA=..., idA@entry=..., fontDict=0x7ffe342f8010)
    at /home/andreas/source/poppler-20.12.1/poppler/GfxFont.cc:214
#10 0x00007ffff5d94fde in GfxFontDict::GfxFontDict(XRef*, Ref*, Dict*) (this=this@entry=0x461cd70, xref=0x7ffe34385020, fontDictRef=fontDictRef@entry=0x0, fontDict=0x461ccb0)
    at /home/andreas/source/poppler-20.12.1/poppler/GfxFont.cc:2378
#11 0x00007ffff5d75913 in GfxResources::GfxResources(XRef*, Dict*, GfxResources*) (this=0x7ffe343db2b0, xrefA=<optimized out>, resDictA=0x7ffece7fb290, nextA=0x0)
    at /home/andreas/source/poppler-20.12.1/poppler/Gfx.cc:255
#12 0x00007ffff5d815d9 in Gfx::Gfx(PDFDoc*, OutputDev*, int, Dict*, double, double, PDFRectangle const*, PDFRectangle const*, int, bool (*)(void*), void*, XRef*)
    (this=this@entry=0x7ffe343f9980, docA=<optimized out>, outA=outA@entry=0x7ffece7fb600, pageNum=17, resDict=0x7ffe343416f0, hDPI=hDPI@entry=72, vDPI=vDPI@entry=72, box=box@entry=0x7ffece7fb3c0, cropBox=0x0, rotate=0, abortCheckCbkA=0x0, abortCheckCbkDataA=0x7ffece7fb780, xrefA=0x7ffe34385020) at /home/andreas/source/poppler-20.12.1/poppler/Gfx.cc:469
#13 0x00007ffff5dd951d in Page::createGfx(OutputDev*, double, double, int, bool, bool, int, int, int, int, bool, bool (*)(void*), void*, XRef*)
    (this=this@entry=0x4bafca0, out=out@entry=0x7ffece7fb600, hDPI=72, vDPI=72, rotate=0, useMediaBox=<optimized out>, crop=<optimized out>, sliceX=<optimized out>, sliceY=-1, sliceW=-1, sliceH=-1, printing=false, abortCheckCbk=0x0, abortCheckCbkData=0x7ffece7fb780, xrefA=0x7ffe34385020) at /home/andreas/source/poppler-20.12.1/poppler/Page.cc:550
#14 0x00007ffff5dd9fbc in Page::displaySlice(OutputDev*, double, double, int, bool, bool, int, int, int, int, bool, bool (*)(void*), void*, bool (*)(Annot*, void*), void*, bool)
    (this=0x4bafca0, out=0x7ffece7fb600, hDPI=<optimized out>, vDPI=<optimized out>, rotate=<optimized out>, useMediaBox=<optimized out>, crop=<optimized out>, sliceX=<optimized out>, sliceY=-1, sliceW=-1, sliceH=-1, printing=false, abortCheckCbk=0x0, abortCheckCbkData=0x7ffece7fb780, annotDisplayDecideCbk=0x0, annotDisplayDecideCbkData=0x0, copyXRef=true)
    at /home/andreas/source/poppler-20.12.1/poppler/Page.cc:571
#15 0x00007ffff5ddfa96 in PDFDoc::displayPageSlice(OutputDev*, int, double, double, int, bool, bool, bool, int, int, int, int, bool (*)(void*), void*, bool (*)(Annot*, void*), void*, bool)
    (this=0x4c4f690, out=<optimized out>, page=<optimized out>, hDPI=3.9210725524632019e-316, vDPI=72, rotate=<optimized out>, useMediaBox=<optimized out>, crop=<optimized out>, printing=<optimized out>, sliceX=<optimized out>, sliceY=<optimized out>, sliceW=<optimized out>, sliceH=<optimized out>, abortCheckCbk=<optimized out>, abortCheckCbkData=<optimized out>, annotDisplayDecideCbk=<optimized out>, annotDisplayDecideCbkData=<optimized out>, copyXRef=true) at /home/andreas/source/poppler-20.12.1/poppler/PDFDoc.cc:658
#16 0x00007ffff7e00faa in Poppler::Page::renderToImage(double, double, int, int, int, int, Poppler::Page::Rotation, void (*)(QImage const&, QVariant const&), bool (*)(QVariant const&), bool (*)(QVariant const&), QVariant const&) const
    (this=this@entry=0x7ffe343b40a0, xres=72, yres=72, xPos=-1, yPos=-1, w=-1, h=-1, rotate=Poppler::Page::Rotate0, partialUpdateCallback=0x0, shouldDoPartialUpdateCallback=0x0, shouldAbortRenderCallback=0x0, payload=...) at /home/andreas/source/poppler-20.12.1/qt5/src/poppler-page.cc:537
#17 0x00007ffff7e01641 in Poppler::Page::renderToImage(double, double, int, int, int, int, Poppler::Page::Rotation, void (*)(QImage const&, QVariant const&), bool (*)(QVariant const&), QVariant const&) const
    (this=this@entry=0x7ffe343b40a0, xres=<optimized out>, yres=<optimized out>, x=<optimized out>, y=<optimized out>, w=<optimized out>, h=<optimized out>, rotate=Poppler::Page::Rotate0, partialUpdateCallback=0x0, shouldDoPartialUpdateCallback=0x0, payload=...) at /home/andreas/source/poppler-20.12.1/qt5/src/poppler-page.cc:459
#18 0x00007ffff7e0167e in Poppler::Page::renderToImage(double, double, int, int, int, int, Poppler::Page::Rotation) const
    (this=this@entry=0x7ffe343b40a0, xres=<optimized out>, yres=<optimized out>, x=<optimized out>, y=<optimized out>, w=<optimized out>, h=<optimized out>, rotate=Poppler::Page::Rotate0)
    at /home/andreas/source/poppler-20.12.1/qt5/src/poppler-page.cc:453
#19 0x00000000009266e7 in PDFRenderEngine::run() (this=0x4e077d0) at src/pdfviewer/pdfrenderengine.cpp:87
#20 0x00007ffff6703df0 in QThreadPrivate::start(void*) () at /lib64/libQt5Core.so.5
#21 0x00007ffff657b3f9 in start_thread () at /lib64/libpthread.so.0
#22 0x00007ffff615f903 in clone () at /lib64/libc.so.6
astoeckel commented 3 years ago

I've modified TeXstudio to perform the PDF page rendering in a dense loop by deactivating the renderedPages cache. I can now reproduce the bug quite quickly, even when running in GDB. I simply open one of my PDFs in pdf-viewer-only mode and zoom in and out (causing the 20 pages to be rendered in parallel), while scrolling up and down at the same time.

One factor leading to frequent crashes on my machine may be the relatively high hardware thread count (32), which probably leads to a larger ~Qt~ thread pool used for rendering, thus making threading issues more visible.

I had a look at the PDF rendering code, but the object lifetimes and ownerships are not very clear to me, and the strewn in magic Qt macros such as emit do not help either.

Command Line:

gdb texstudio --ex "run --no-session --pdf-viewer-only <DOCUMENT>"

Demonstration Video (best open the URL in VLC):
https://somweyr.de/stuff/texstudio_crash_screencast.webm

Document: http://compneuro.uwaterloo.ca/files/publications/stoeckel.2020c.pdf

Diff:

diff --git a/src/pdfviewer/pdfrenderengine.cpp b/src/pdfviewer/pdfrenderengine.cpp
index 22d6eb4c..74870afd 100644
--- a/src/pdfviewer/pdfrenderengine.cpp
+++ b/src/pdfviewer/pdfrenderengine.cpp
@@ -83,6 +83,7 @@ void PDFRenderEngine::run()
                if (!document.isNull() && command.pageNr >= 0 && command.pageNr < cachedNumPages) {
                        Poppler::Page *page = document->page(command.pageNr);
                        if (page) {
+                               qWarning() << this << " Thread: " << QThread::currentThreadId() << " Render page begin " << command.pageNr << " at " << command.ticket << priorityThread << "x/y" << command.x << command.y << " res "<<command.xres << ", " << command.w << command.h;
                                QImage image = page->renderToImage(command.xres, command.yres,
                                                                   command.x, command.y, command.w, command.h, command.rotate);
                                QSizeF pageSize = page->pageSizeF();
@@ -105,10 +106,12 @@ void PDFRenderEngine::run()
                     }

                                delete page;
-                               if (!queue->stopped) //qDebug() << command.ticket << " send from "<<QThread::currentThreadId(),
+                               if (!queue->stopped) {
+                                       qWarning() << this << " Thread: " << QThread::currentThreadId() << " Render page emit " << command.pageNr << " at " << command.ticket << priorityThread << "x/y" << command.x << command.y << " res "<<command.xres << ", " << command.w << command.h;
                                        emit sendImage(image, command.pageNr, command.ticket);
+                               }
                        }
-                       //qDebug() << this << " Render page " << command.pageNr << " at " << command.ticket << priorityThread << "x/y" << command.x << command.y << " res "<<command.xres << ", " << command.w << command.h;
+                       qWarning() << this << " Thread: " << QThread::currentThreadId() << " Render page done " << command.pageNr << " at " << command.ticket << priorityThread << "x/y" << command.x << command.y << " res "<<command.xres << ", " << command.w << command.h;
                }
        }
        //queue->deref();
diff --git a/src/pdfviewer/pdfrendermanager.cpp b/src/pdfviewer/pdfrendermanager.cpp
index 21b84db6..9409da90 100644
--- a/src/pdfviewer/pdfrendermanager.cpp
+++ b/src/pdfviewer/pdfrendermanager.cpp
@@ -251,7 +251,7 @@ QPixmap PDFRenderManager::renderToImage(int pageNr, QObject *obj, const char *re
        CachePixmap img;
        qreal scale = 10;
        bool partialImage = false;
-       if (renderedPages.contains(pageNr + kMaxPageZoom) && xres > kMaxDpiForFullPage) { // try cache first
+       /*if (renderedPages.contains(pageNr + kMaxPageZoom) && xres > kMaxDpiForFullPage) { // try cache first
                CachePixmap *cachedPix = renderedPages[pageNr + kMaxPageZoom];
                if (cachedPix->getCoord() == QPoint(x, y) && cachedPix->getRes() < 1.01 * xres && cachedPix->getRes() > 0.99 * xres && cachedPix->width() == w && cachedPix->height() == h) {
                        img = *cachedPix;
@@ -261,7 +261,7 @@ QPixmap PDFRenderManager::renderToImage(int pageNr, QObject *obj, const char *re
        }
        if (img.isNull() && renderedPages.contains(pageNr)) { // try cache first
                img = *renderedPages[pageNr];
-       }
+       }*/

     // delayTimeout = -1 means it's NOT been called by delayedUpdate
     // delayTimeout >= 0 means it's been called called by delayedUpdate and delayedUpdate wants to force an update after delayTimeout
diff --git a/texstudio.pro b/texstudio.pro
index 72a51bf6..e6719e35 100644
--- a/texstudio.pro
+++ b/texstudio.pro
@@ -441,3 +441,6 @@ exists(./.git)  {
 *-g++:equals(QT_MAJOR_VERSION, 5):lessThan(QT_MINOR_VERSION, 13) {
   QMAKE_CXXFLAGS += -Wno-deprecated-copy
 }
+
+QMAKE_CFLAGS_RELEASE += -g
+QMAKE_CXXFLAGS_RELEASE += -g
astoeckel commented 3 years ago

Some more evidence that this is most likely a threading issue. I am not able to evoke any crashes (for now) when setting the ~global~ PDFRenderManager thread pool size to one.

~This workaround seems to work great for me for now:~

diff --git a/src/main.cpp b/src/main.cpp
index 5c807cc3..9100d08e 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -189,6 +189,9 @@ int main(int argc, char **argv)
 #if QT_VERSION >= QT_VERSION_CHECK(5,6,0)
        QApplication::setAttribute(Qt::AA_DisableHighDpiScaling);
 #endif
+
+       QThreadPool::globalInstance()->setMaxThreadCount(1);
+
        // This is a dummy constructor so that the programs loads fast.
        TexstudioApp a(appId, argc, argv);
        bool startAlways = false;

Edit: Scrap that, TeXstudio just crashed with the same error and only a single thread pool thread.

Edit 2: Now trying the following patch (in addition to the above). I didn't notice that PDFRenderEngine was derived from SafeThread and thus PDFRenderManager did not implicitly use QThreadPool

--- a/src/pdfviewer/PDFDocument.cpp
+++ b/src/pdfviewer/PDFDocument.cpp
@@ -3074,7 +3074,7 @@ retryNow:
         renderManager = nullptr;
        }

-       renderManager = new PDFRenderManager(this);
+       renderManager = new PDFRenderManager(this, 1);
        renderManager->setCacheSize(globalConfig->cacheSizeMB);
        renderManager->setLoadStrategy(int(globalConfig->loadStrategy));
        PDFRenderManager::Error error = PDFRenderManager::NoError;

Edit 3: I've used TeXstudio with the last patch for a while now and didn't have any crashes.

Edit 4: Used TeXstudio the entire day and still no crashes. The above workarounds are fine for me; rendering the PDFs is of course slower now, but not unbearably so.

Myridium commented 3 years ago

I am having this issue constantly and it is making the program difficult to use. Mine crashes maybe ~30% of the time I hit compile & view, with the same error. I also have a 16 core, 32 thread CPU. I do not have this issue on my 2-core machine. Using texstudio 3.0.4. I am working on a ~100 page document.

To clarify, what is the diff you made in your previous comment? Was that just for debugging purposes?

Myridium commented 3 years ago

@astoeckel thanks for your investigation. I find that your patch resolves the issue on my machine. I have created a pull request for this hotix at #1463 .

Strangely, I do not notice any increase in render time. It may even be a little faster...? I wonder whether the threads were even divvying up the work properly.

astoeckel commented 3 years ago

Good to see that I'm not the only person running into this problem! PDF rendering is more than fast enough for me as well with a single thread, despite me being on a first generation Threadripper CPU, which has a relatively low single thread performance.

To clarify things a little for other people stumbling over this issue, the long patch here was only for debugging. The actual patch is the second single-line patch in this comment

astoeckel commented 3 years ago

See #1498 for a PR implementing a stress test that evokes this bug.

astoeckel commented 3 years ago

See https://github.com/astoeckel/texstudio_poppler_crash_bug for a stand-alone program reproducing this bug.

astoeckel commented 3 years ago

I'm now 100% convinced that this a an upstream bug in Poppler. I suggest leaving this issue open until upstream has fixed the bug.

astoeckel commented 3 years ago

Fixed upstream: https://gitlab.freedesktop.org/poppler/poppler/-/commit/31734dcb365e2b69d88c86ddaee3bd62af97fade

ghost commented 3 years ago

@astoeckel, I'm curious. Does poppler-qt5's stress test for mutlithreaded rendering crash on your machine prior to the fix? if not, their test suite is seriously broken.

https://cgit.freedesktop.org/poppler/poppler/tree/qt5/tests/stress-threads-qt5.cpp

astoeckel commented 3 years ago

Good call! stress-threads-qt5 does indeed crash, and does so almost immediately. So their test case is fine; it is probably the combination of the specific PDF and the high core count CPU causing the issue.

My command line:

./stress-threads-qt5 10 16 16 stoeckel.2020c.pdf
sunderme commented 3 years ago

Thanks to Andreas for providing the test case and communicating with poppler.

sunderme commented 3 years ago

by the way, ./stress-threads-qt5 10 16 16 stoeckel.2020c.pdf fails on my machine as well.