qutebrowser / qutebrowser

A keyboard-driven, vim-like browser based on Python and Qt.
https://www.qutebrowser.org/
GNU General Public License v3.0
9.65k stars 1.01k forks source link

"No such file or directory" download error when opening PDF via PDF.js #6366

Open The-Compiler opened 3 years ago

The-Compiler commented 3 years ago

As reported in #6335 by @Aster89. I can't reproduce, but might be some kind of race condition with the temporary file being deleted before it's read or something?

Relevant logs:

16:40:04 DEBUG    webelem    webelem:click:416 Clicking <qutebrowser.browser.webengine.webengineelem.WebEngineElement html='<a class="result__a" rel="noopener" href="http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf"><span class="result__type">PDF</span> Purely Functional Data Structures</a>'> with click_target ClickTarget.normal, force_event False
[...]
16:40:04 DEBUG    webview    browsertab:_set_load_status:1006 load status for <qutebrowser.browser.webengine.webenginetab.WebEngineTab tab_id=129 url='https://duckduckgo.com/?q=purely+functional+data+structures&ia=shopping&iai=r1-0&page=1&sexp=%7B%22…'>: LoadStatus.success_https
[...]
16:40:04 DEBUG    webview    browsertab:_set_load_status:1006 load status for <qutebrowser.browser.webengine.webenginetab.WebEngineTab tab_id=129 url='https://duckduckgo.com/?q=purely+functional+data+structures&ia=shopping&iai=r1-0&page=1&sexp=%7B%22…'>: LoadStatus.error
[...]
16:40:04 DEBUG    webview    browsertab:_on_navigation_request:1064 navigation request: url http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf, type Type.redirect, is_main_frame True
[...]
16:40:04 DEBUG    webview    browsertab:_set_load_status:1006 load status for <qutebrowser.browser.webengine.webenginetab.WebEngineTab tab_id=129 url='https://duckduckgo.com/?q=purely+functional+data+structures&ia=shopping&iai=r1-0&page=1&sexp=%7B%22…'>: LoadStatus.loading
[...]
16:40:04 DEBUG    downloads  downloads:_on_begin_insert_row:1034 _on_begin_insert_row with idx 0, webengine True
16:40:04 DEBUG    message    message:_log_stack:42 Stack for error message:
  File "/usr/bin/qutebrowser", line 33, in <module>
    sys.exit(load_entry_point('qutebrowser==2.1.0', 'gui_scripts', 'qutebrowser')())
  File "/usr/lib/python3.9/site-packages/qutebrowser/qutebrowser.py", line 218, in main
    return app.run(args)
  File "/usr/lib/python3.9/site-packages/qutebrowser/app.py", line 130, in run
    ret = qt_mainloop()
  File "/usr/lib/python3.9/site-packages/qutebrowser/app.py", line 140, in qt_mainloop
    return objects.qapp.exec()
  File "/usr/lib/python3.9/site-packages/qutebrowser/browser/webengine/webenginedownloads.py", line 285, in handle_download
    download.set_target(downloads.PDFJSDownloadTarget())
  File "/usr/lib/python3.9/site-packages/qutebrowser/browser/downloads.py", line 838, in set_target
    message.error(msg)
  File "/usr/lib/python3.9/site-packages/qutebrowser/utils/message.py", line 54, in error
    stack = ''.join(traceback.format_stack())
16:40:04 ERROR    message    message:error:59 Download error: [Errno 2] No such file or directory: '/tmp/qutebrowser-downloads-l8qsiq87/tmphxg3i2uk_okasaki.pdf'
[...]
16:40:04 DEBUG    downloads  webenginedownloads:_on_state_changed:64 State for <qutebrowser.browser.webengine.webenginedownloads.DownloadItem basename='okasaki.pdf'> changed to DownloadCancelled
16:40:04 DEBUG    downloads  downloads:_on_begin_remove_row:1046 _on_begin_remove_row with idx 0, webengine True
16:40:04 DEBUG    downloads  downloads:_remove_item:965 Removed download 1: okasaki.pdf [100%|?.??B]
16:40:04 DEBUG    downloads  downloads:cancel:577 cancelled
16:40:04 DEBUG    downloads  downloads:delete:596 Not deleting None
The-Compiler commented 3 years ago

So I took a first look at this, and this must be happening in TempDownloadManager.get_tmpfile. I can reproduce by doing the following:

qutebrowser itself only deletes that directory when quitting. @Aster89 do you perhaps have a script or something running cleaning those directories?

Aster89 commented 3 years ago

Emm... not that I know of. As in, I've not intentionally set any periodical cleaning of that dir (after all it is nuked at every rebook, afaik).

However, even if that directory is deleted, I see qutebrowser (after I close and reopen it when the issue occurs) has no problem in creating a new directory when I open another PDF. So why can't it just create one?

The-Compiler commented 3 years ago

So why can't it just create one?

It can, but I'd first like to understand what's happening before blindly committing a fix. It's entirely possible that my analysis is incorrect and something else causes it.

Aster89 commented 3 years ago

I agree that it might be something else causing the behavior I observe, but unfortunately I have no idea how to verify this, because I've never worried about the /tmp directory and I'm pretty sure I've not put a corn job or anything in place myself to clean it. If I have installed a program that for some reasons cleans it every now and then, that's another story, and I'd need some help to find out.

However, if you have verified that the steps you listed also trigger the same behavior, and if those steps are considered legit (especially in view of the fact that qutebrowser deletes that directory on quitting and is able to create it), then I guess handling the "directory doesn't exist upon downloading a file" case is something worth adding regardless of the other possible problem.

The-Compiler commented 3 years ago

I guess handling the "directory doesn't exist upon downloading a file" case is something worth adding regardless of the other possible problem.

Sure. But we're not in a hurry (this is clearly something exotic), and it makes it harder to verify that the bug you're seeing is actually the bug we're talking about here. If it isn't, the fix would instead hide the real bug before we had a chance to figure out what's going on.

How reliably can you reproduce this? I assume it never happens for the first PDF you open?

Could you run inotifywait --outfile downloads.log --monitor --timefmt '%T' --format '%T %w%f %e' --include 'qutebrowser-downloads-.*' /tmp in the background and show the downloads.log once this happened?

Aster89 commented 3 years ago

I assume it never happens for the first PDF you open?

I would say you're right, but I wouldn't be my life on it. I use qutebrowser all the day, but I don't open/download PDFs all the day, so can't exclude that there's been a time that I've opened the first PDF after a long time (hours, even a few days) after opening qutebrowser and the issue manifested.

However, the problem has happened again, but while the command you suggested is running in the background I see that downloads.log stays empty. Even if in the meanwhile I try opening the PDF (which fails as described).

I guess I have to try running inotifywait after starting a fresh session of qutebrowser and then inspect the log file after the issue occurs?

The-Compiler commented 3 years ago

Maybe the file only gets written once you hit ctrl-c to kill inotifywait. Alternatively, use

inotifywait --monitor --timefmt '%T' --format '%T %w%f %e' --include 'qutebrowser-downloads-.*' /tmp | tee downloads.log

so you can watch the output while it gets logged.

I guess I have to try running inotifywait after starting a fresh session of qutebrowser and then inspect the log file after the issue occurs?

Ideally, start it before starting qutebrowser, and keep it running until the issue appears.

Aster89 commented 3 years ago

After around 9 hours opening a PDF seems to be fine. Do you think suspending the system, which I do often might be somehow related? I pretty sure that I have opened a PDF after waking up the system from suspensione, some time...

Is inotifywait robust against suspension?

The-Compiler commented 3 years ago

After around 9 hours opening a PDF seems to be fine. Do you think suspending the system, which I do often might be somehow related? I pretty sure that I have opened a PDF after waking up the system from suspensione, some time...

No idea! Maybe systemd runs something cleaning up temporary directories on suspend or so.

Is inotifywait robust against suspension?

No idea either, but I'd expect it to be - try it and see!

Aster89 commented 3 years ago

Ok, the problem is happening now. The PDFs can't be opened (with the described error) and the content of the log file, at the moment of writing is:

10:05:58 /tmp/qutebrowser-downloads-wvedpqt4 CREATE,ISDIR
16:46:38 /tmp/qutebrowser-downloads-wvedpqt4 OPEN,ISDIR
16:46:38 /tmp/qutebrowser-downloads-wvedpqt4 ACCESS,ISDIR
16:46:38 /tmp/qutebrowser-downloads-wvedpqt4 ACCESS,ISDIR
16:46:38 /tmp/qutebrowser-downloads-wvedpqt4 CLOSE_NOWRITE,CLOSE,ISDIR
08:17:14 /tmp/qutebrowser-downloads-wvedpqt4 OPEN,ISDIR
08:17:14 /tmp/qutebrowser-downloads-wvedpqt4 ACCESS,ISDIR
08:17:14 /tmp/qutebrowser-downloads-wvedpqt4 ACCESS,ISDIR
08:17:14 /tmp/qutebrowser-downloads-wvedpqt4 CLOSE_NOWRITE,CLOSE,ISDIR
12:56:00 /tmp/qutebrowser-downloads-wvedpqt4 OPEN,ISDIR
12:56:00 /tmp/qutebrowser-downloads-wvedpqt4 ACCESS,ISDIR
12:56:00 /tmp/qutebrowser-downloads-wvedpqt4 CLOSE_NOWRITE,CLOSE,ISDIR
12:56:00 /tmp/qutebrowser-downloads-wvedpqt4 DELETE,ISDIR

I know that I triggered the lines 16:46:38 and 08:17:14 because I lsed into /tmp/qutebrowser-downloads-wvedpqt4, but I haven't done it at 12:56:00 (which is a few minutes ago, at the time of writing).

The directory is gone, which I think is expected, given the DELETE in the last line of the log.

Does this help us? (I think I can kill the inotifywait now, right?)

The-Compiler commented 3 years ago

So yes, this confirms indeed that something is deleting the directory. I'll push a fix so that qutebrowser recreates it, but you might want to find out what's randomly deleting things on your system (pretty sure it's not qutebrowser itself). No idea how, though.

The-Compiler commented 3 years ago

I tried writing a fix:

commit 01ca52f97978515ad42e4b348e8ee2a575ef143e
Author: Florian Bruhin <me@the-compiler.org>
Date:   Tue Apr 13 16:48:57 2021 +0200

    Recreate temporary download directory if it vanishes

    Fixes #6366

diff --git a/qutebrowser/browser/downloads.py b/qutebrowser/browser/downloads.py
index fa4ac9da2..ebc1109e8 100644
--- a/qutebrowser/browser/downloads.py
+++ b/qutebrowser/browser/downloads.py
@@ -1319,9 +1319,15 @@ def get_tmpdir(self):
         Return:
             The tempfile.TemporaryDirectory that is used.
         """
+        if self._tmpdir is not None and not os.path.exists(self._tmpdir.name):
+            log.misc.error(
+                f"Temporary directory {self._tmpdir.name} vanished, recreating...")
+            self.cleanup()
+
         if self._tmpdir is None:
             self._tmpdir = tempfile.TemporaryDirectory(
                 prefix='qutebrowser-downloads-')
+
         return self._tmpdir

     def get_tmpfile(self, suggested_name):
diff --git a/tests/unit/browser/test_downloads.py b/tests/unit/browser/test_downloads.py
index 16dee0889..1fd8d2b57 100644
--- a/tests/unit/browser/test_downloads.py
+++ b/tests/unit/browser/test_downloads.py
@@ -17,6 +17,9 @@
 # You should have received a copy of the GNU General Public License
 # along with qutebrowser.  If not, see <https://www.gnu.org/licenses/>.

+import logging
+import pathlib
+
 import pytest

 from qutebrowser.browser import downloads, qtnetworkdownloads
@@ -175,3 +178,41 @@ def patched_cancel(remove_data=True):
         monkeypatch.setattr(item1, '_after_set_filename', lambda: None)
         item1._cancel_conflicting_download()
         assert item2.done
+
+
+class TestTempDownloadManager:
+
+    @pytest.fixture
+    def manager(self):
+        man = downloads.TempDownloadManager()
+        yield man
+        man.cleanup()
+
+    def test_get_tmpdir(self, manager):
+        assert manager._tmpdir is None
+        dir1 = manager.get_tmpdir()
+        dir2 = manager.get_tmpdir()
+        assert dir1 == dir2
+        assert pathlib.Path(dir1.name).exists()
+
+    def test_get_tmpfile(self, manager):
+        tmpfile = pathlib.Path(manager.get_tmpfile('suggested').name)
+        tmpdir = pathlib.Path(manager.get_tmpdir().name)
+        assert tmpfile.parent == tmpdir
+        assert tmpfile.stem.endswith('_suggested')
+
+    def test_download_directory_vanished(self, manager, caplog):
+        """See https://github.com/qutebrowser/qutebrowser/issues/6366."""
+        tmpdir = pathlib.Path(manager.get_tmpdir().name)
+        tmpfile = pathlib.Path(manager.get_tmpfile('suggested1').name)
+
+        tmpfile.unlink()
+        tmpdir.rmdir()
+
+        with caplog.at_level(logging.ERROR):
+            manager.get_tmpfile('suggested2')
+
+        assert len(caplog.messages) == 1
+        message = caplog.messages[0]
+        assert message.startswith('Temporary directory ')
+        assert message.endswith('vanished, recreating...')
commit 7994c89369babdf8b6186de35ce8016c65fb66a1
Author: Florian Bruhin <me@the-compiler.org>
Date:   Tue Apr 13 17:19:38 2021 +0200

    Fix issues with TempDownloadManager tests

diff --git a/qutebrowser/browser/downloads.py b/qutebrowser/browser/downloads.py
index ebc1109e8..0363a191c 100644
--- a/qutebrowser/browser/downloads.py
+++ b/qutebrowser/browser/downloads.py
@@ -1301,14 +1301,17 @@ def __init__(self):
         self.files: MutableSequence[IO[bytes]] = []
         self._tmpdir = None

-    def cleanup(self):
+    def cleanup(self, *, ignore_errors=False):
         """Clean up any temporary files."""
         if self._tmpdir is not None:
             try:
                 self._tmpdir.cleanup()
             except OSError:
-                log.misc.exception("Failed to clean up temporary download "
-                                   "directory")
+                msg = "Failed to clean up temporary download directory"
+                if ignore_errors:
+                    log.misc.debug(msg, exc_info=True)
+                else:
+                    log.misc.exception(msg)
             self._tmpdir = None

     def get_tmpdir(self):
@@ -1322,7 +1325,7 @@ def get_tmpdir(self):
         if self._tmpdir is not None and not os.path.exists(self._tmpdir.name):
             log.misc.error(
                 f"Temporary directory {self._tmpdir.name} vanished, recreating...")
-            self.cleanup()
+            self.cleanup(ignore_errors=True)

         if self._tmpdir is None:
             self._tmpdir = tempfile.TemporaryDirectory(
diff --git a/tests/unit/browser/test_downloads.py b/tests/unit/browser/test_downloads.py
index 1fd8d2b57..e85c75547 100644
--- a/tests/unit/browser/test_downloads.py
+++ b/tests/unit/browser/test_downloads.py
@@ -186,7 +186,7 @@ class TestTempDownloadManager:
     def manager(self):
         man = downloads.TempDownloadManager()
         yield man
-        man.cleanup()
+        man.cleanup(ignore_errors=True)

     def test_get_tmpdir(self, manager):
         assert manager._tmpdir is None

but something is still failing on Windows (not sure why cleanup fails at all there). Will need to take a closer look first, but it might take me a while, given that this is clearly something quite exotic.