freedomofpress / dangerzone

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs
https://dangerzone.rocks/
GNU Affero General Public License v3.0

Crash on Ubuntu 22.04 with a PDF #620

Open j75 opened 10 months ago

j75 commented 10 months ago

Hello,

Running the application on Ubuntu 22.04 does not work, the error messages are

[INFO] > /usr/bin/podman run --network none -u dangerzone --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -v /tmp/user/1602/tmppynjffwh/unsafe/input_file:/tmp/input_file:Z -v /tmp/user/1602/tmppynjffwh/pixels:/tmp/dangerzone:Z -e ENABLE_TIMEOUTS=1 dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels
[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED> time="2023-11-13T10:21:42+01:00" level=error msg="User-selected graph driver \"overlay\" overwritten by graph driver \"vfs\" from database - delete libpod local files to resolve"

[INFO] [doc UyJ3P6] 3% UNTRUSTED> Calculating number of pages
[INFO] [doc UyJ3P6] 3% UNTRUSTED> Converting page 1/50 to pixels
...
[INFO] [doc UyJ3P6] 47% UNTRUSTED> Converting page 50/50 to pixels
[ERROR] [doc UyJ3P6] 47% UNTRUSTED> [Errno 13] Permission denied: '/tmp/dangerzone/page-9.rgb'
[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED> Traceback (most recent call last):

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "<frozen runpy>", line 198, in _run_module_as_main

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "<frozen runpy>", line 88, in _run_code

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 424, in <module>

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>     sys.exit(asyncio.run(main()))

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>              ^^^^^^^^^^^^^^^^^^^

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>     return runner.run(main)

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^^

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>     return self._loop.run_until_complete(task)

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>     return future.result()

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 418, in main

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>     with open("/tmp/dangerzone/captured_output.txt", "wb") as container_log:

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED>          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[ERROR] [doc UyJ3P6] -1% Invalid JSON returned from container:

    UNTRUSTED> PermissionError: [Errno 13] Permission denied: '/tmp/dangerzone/captured_output.txt'

[ERROR] documents-to-pixels failed
[ERROR] An exception occurred while converting document 'UyJ3P6'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 39, in convert
    success = self._convert(document, ocr_lang)
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/container.py", line 257, in _convert
    return self._convert_with_tmpdirs(
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/container.py", line 311, in _convert_with_tmpdirs
    raise exception_from_error_code(ret)  # type: ignore [misc]
  File "/usr/lib/python3/dist-packages/dangerzone/conversion/errors.py", line 126, in exception_from_error_code
    raise ValueError(f"Unknown error code '{error_code}'")
ValueError: Unknown error code '1'
[ERROR] [doc UyJ3P6] 0% Unknown error code '1'
[DEBUG] Marking doc UyJ3P6 as 'failed'

j75 commented 10 months ago

I've tried with other documents too, and with the CLI application - the result is the same, it never works! The folder /tmp/dangerzone/ does not exist, maybe it is created by the application? Anyhow, creating it before the application starts does not change anything.

deeplow commented 10 months ago

Thanks for reporting this. Just to be clear, you're trying to run Dangerzone from the source code, correct? (I see some [DEBUG] messages that seem to indicate that).

If that's the case and you wish to install Dangerzone as a regular user, which should work, then the instructions are available here.

We can't always guarantee that the latest development version will work out of the box, as it doesn't go through the extensive quality assurance (QA) procedures that our releases do.

Regardless, I can help you address said issue in the development version.

j75 commented 10 months ago

No, I have just installed the debian package:

% aptitude show dangerzone 
Package: dangerzone                      
Version: 0.5.0-1
New: yes
State: installed
Automatically installed: no
Priority: optional
Section: python
Maintainer: Freedom of the Press Foundation <info@freedom.press>
Architecture: all
...

deeplow commented 10 months ago

OK. That's a bit more concerning. And thanks for reporting. I'll look into it.

apyrgio commented 10 months ago

@j75 The "Permission denied" errors may hint at SELinux/AppArmor being enforced in your system. In particular, rootless containers may not work properly with AppArmor, if I read this issue correctly: https://github.com/containers/podman/pull/19303

So, can you please run the following commands to find out if AppArmor/SELinux is enabled in your system?

sudo aa-status
getenforce

j75 commented 10 months ago

getenforce shows Disabled but sudo aa-status shows a lot of stuff:

apparmor module is loaded.
85 profiles are loaded.
72 profiles are in enforce mode.
...

j75 commented 10 months ago

OTOH, doesn't podman run --network none -u dangerzone mean that a dangerzone user should exist? In the package's postinst file, no such user is created!

apyrgio commented 10 months ago

Ok, next question would be: if you try to run Dangerzone with AppArmor disabled, does it complete successfully? This way, we can narrow it down to just that.

> OTOH, doesn't podman run --network none -u dangerzone mean that a dangerzone user should exist? In the package's postinst file, no such user is created!

Actually, a dangerzone user exists, but in the container image that the dangerzone package provides.

deeplow commented 10 months ago

> OTOH, doesn't podman run --network none -u dangerzone mean that a dangerzone user should exist? In the package's postinst file, no such user is created!

> Actually, a dangerzone user exists, but in the container image that the dangerzone package provides.

@j75 the same logic applies to /tmp/dangerzone.

> The folder /tmp/dangerzone/ does not exist, maybe it is created by the application? Anyhow, creating it before the application starts does not change anything.

Basically /tmp/dangerzone should exist, but inside the container and not on your host system. That's why you're not finding that directory on your system.
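
To make the host/container split concrete, here is a minimal Python sketch that splits one of the -v values from the logged podman command into its host side and its container side. The helper parse_volume is hypothetical (it is not part of Dangerzone or Podman) and assumes that neither path contains a colon.

```python
from typing import NamedTuple


class Volume(NamedTuple):
    host_path: str       # directory or file on the host system
    container_path: str  # where it appears inside the container
    options: str         # mount options such as "Z" (may be empty)


def parse_volume(spec: str) -> Volume:
    """Split a `-v HOST:CONTAINER[:OPTIONS]` value into its parts.

    Hypothetical helper for illustration; assumes paths contain no colons.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected volume spec: {spec!r}")
    options = parts[2] if len(parts) == 3 else ""
    return Volume(parts[0], parts[1], options)


# The "pixels" mount from the podman command logged at the top of this issue.
vol = parse_volume("/tmp/user/1602/tmppynjffwh/pixels:/tmp/dangerzone:Z")
print(f"host: {vol.host_path}")
print(f"container: {vol.container_path}")
```

So /tmp/dangerzone is only a name inside the container; the files written there land in the host directory on the left-hand side of the mount.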

j75 commented 10 months ago

> if you try to run Dangerzone with AppArmor disabled, does it complete

I ran systemctl stop apparmor.service (as root), then ran dangerzone again, but it crashed again with the same messages as above...

apyrgio commented 10 months ago

Thanks a lot for trying it out. Bummer that we still don't know the root cause.

So, we have a permission denied error on our hands, which manifests even with AppArmor / SELinux in non-enforcing mode. We're missing something here, and we need an Ubuntu environment to properly test it out.

Until we have such an environment, I have one more thing to ask. Can you:

  1. Have a terminal with sudo journalctl -f running in the background.
  2. Run Dangerzone against a file and wait for it to fail.
  3. Grab any new output from the journalctl window that looks like a warning/error and paste it here.

Thanks again for the help :slightly_smiling_face:

j75 commented 10 months ago

Here is the journal (I stopped the apparmor service before starting dangerzone):

nov. 13 20:08:25 pamela podman[108899]: 2023-11-13 20:08:25.900905923 +0100 CET m=+0.063919664 image pull  dangerzone.rocks/dangerzone
nov. 13 20:08:27 pamela podman[108899]: 2023-11-13 20:08:27.766267099 +0100 CET m=+1.929280833 volume create 27e87dca7b67feab43f67d1cc12f839b39b3fdefa9daa535569a3b4b697ff3af
nov. 13 20:08:27 pamela podman[108899]: 2023-11-13 20:08:27.7721333 +0100 CET m=+1.935147033 volume create 6cc9cd3eef9addb1e06287096a68c83f1a739f2d922c372dc7dca1f8588c7fab
nov. 13 20:08:27 pamela podman[108899]: 
nov. 13 20:08:27 pamela podman[108899]: 2023-11-13 20:08:27.780854636 +0100 CET m=+1.943868453 container create 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:27 pamela systemd[5300]: Started libpod-conmon-51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1.scope.
nov. 13 20:08:27 pamela systemd[5300]: Started libcontainer container 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1.
nov. 13 20:08:27 pamela podman[108899]: 2023-11-13 20:08:27.988849591 +0100 CET m=+2.151863317 container init 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:28 pamela podman[108899]: 2023-11-13 20:08:28.002558323 +0100 CET m=+2.165572060 container start 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:28 pamela podman[108899]: 2023-11-13 20:08:28.002644484 +0100 CET m=+2.165658229 container attach 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Calculating number of pages", "percentage": 3}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 1/7 to pixels", "percentage": 9}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 2/7 to pixels", "percentage": 15}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 3/7 to pixels", "percentage": 22}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 4/7 to pixels", "percentage": 28}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 5/7 to pixels", "percentage": 35}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 6/7 to pixels", "percentage": 41}
nov. 13 20:08:28 pamela conmon[108940]: {"error": false, "text": "Converting page 7/7 to pixels", "percentage": 48}
nov. 13 20:08:28 pamela conmon[108940]: {"error": true, "text": "[Errno 13] Permission denied: '/tmp/dangerzone/page-3.rgb'", "percentage": 48}
nov. 13 20:08:28 pamela conmon[108940]: Traceback (most recent call last):
nov. 13 20:08:28 pamela conmon[108940]:   File "<frozen runpy>", line 198, in _run_module_as_main
nov. 13 20:08:28 pamela conmon[108940]:   File "<frozen runpy>", line 88, in _run_code
nov. 13 20:08:28 pamela conmon[108940]:   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 424, in <module>
nov. 13 20:08:28 pamela conmon[108940]:     sys.exit(asyncio.run(main()))
nov. 13 20:08:28 pamela conmon[108940]:              ^^^^^^^^^^^^^^^^^^^
nov. 13 20:08:28 pamela conmon[108940]:   File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
nov. 13 20:08:28 pamela conmon[108940]:     return runner.run(main)
nov. 13 20:08:28 pamela conmon[108940]:            ^^^^^^^^^^^^^^^^
nov. 13 20:08:28 pamela conmon[108940]:   File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
nov. 13 20:08:28 pamela conmon[108940]:     return self._loop.run_until_complete(task)
nov. 13 20:08:28 pamela conmon[108940]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
nov. 13 20:08:28 pamela conmon[108940]:   File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
nov. 13 20:08:28 pamela conmon[108940]:     return future.result()
nov. 13 20:08:28 pamela conmon[108940]:            ^^^^^^^^^^^^^^^
nov. 13 20:08:28 pamela conmon[108940]:   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 418, in main
nov. 13 20:08:28 pamela conmon[108940]:     with open("/tmp/dangerzone/captured_output.txt", "wb") as container_log:
nov. 13 20:08:28 pamela conmon[108940]:          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
nov. 13 20:08:28 pamela conmon[108940]: PermissionError: [Errno 13] Permission denied: '/tmp/dangerzone/captured_output.txt'
nov. 13 20:08:28 pamela podman[108979]: 2023-11-13 20:08:28.481707747 +0100 CET m=+0.035670915 container died 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:29 pamela podman[108979]: 2023-11-13 20:08:29.107805994 +0100 CET m=+0.661769217 container remove 51bb4d809a4d14ca18ca50a7c3b49c67af2ccfb81cc34a76fa3a94a04dcb78c1 (image=dangerzone.rocks/dangerzone:latest, name=thirsty_wing)
nov. 13 20:08:29 pamela podman[108979]: 2023-11-13 20:08:29.114887374 +0100 CET m=+0.668850597 volume remove 27e87dca7b67feab43f67d1cc12f839b39b3fdefa9daa535569a3b4b697ff3af
nov. 13 20:08:29 pamela podman[108979]: 2023-11-13 20:08:29.122045761 +0100 CET m=+0.676008989 volume remove 6cc9cd3eef9addb1e06287096a68c83f1a739f2d922c372dc7dca1f8588c7fab

pamela is the name of my computer... (in the memory of a former cat, RIP)

deeplow commented 10 months ago

Thanks for pasting that in (RIP pamela :cry: ). Unfortunately, no warning shows up there before the failure, so we'll have to dig deeper in a full Ubuntu install of our own to try and reproduce this.

One detail that I noticed, though, is that this time it failed on page 3 instead of 9. So, at some arbitrary point the directory that the isolated environment is trying to write to becomes unwritable. I wonder if this is time-related or space-related.

So if you're up for helping track this down a little more, I have one more request for you: let's check if this is a space limitation on your system. In a terminal, if you type df -h, what does it show? For example, for me it shows:

Filesystem          Size  Used Avail Use% Mounted on
/dev/mapper/dmroot   20G  6.9G   12G  38% /
none                 20G  6.9G   12G  38% /usr/lib/modules
devtmpfs            4.0M     0  4.0M   0% /dev
tmpfs               1.0G     0  1.0G   0% /dev/shm
tmpfs                68M  744K   67M   2% /run
tmpfs               1.0G  4.0K  1.0G   1% /tmp
/dev/xvdb            30G   13G   18G  42% /rw
tmpfs                34M   80K   34M   1% /run/user/1000

For the /tmp line (rightmost column), this means that the /tmp "directory" has 1 GB and is 1% full (usually this fills up as Dangerzone runs), but other ones could be affected as well. Then, if you could, type watch -n 0.1 df -h, which shows a continuously updated version, run Dangerzone in another window, and see if any of those quickly spikes up to almost 100%.
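
If it's easier to script than to watch, here is a minimal Python sketch of the same check using the standard shutil.disk_usage call. The choice of paths and the 90% threshold are assumptions for illustration, and the percentage can differ slightly from df's Use% column, which is computed a bit differently.

```python
import shutil
import tempfile


def usage_percent(path: str) -> float:
    """Return how full the filesystem that contains `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def nearly_full(path: str, threshold: float = 90.0) -> bool:
    """True if the filesystem holding `path` is at or above `threshold` percent."""
    return usage_percent(path) >= threshold


# Check the directory Python uses for temporary files, plus the root filesystem.
for path in (tempfile.gettempdir(), "/"):
    flag = " (nearly full)" if nearly_full(path) else ""
    print(f"{path}: {usage_percent(path):.1f}% used{flag}")
```

Running it in a loop while Dangerzone converts a document would show whether any of these filesystems fills up.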

j75 commented 10 months ago

df -k =>

Filesystem                              1K-blocks      Used Available Use% Mounted on
tmpfs                                     3270548     14928   3255620   1% /run
/dev/sda2                               155772240  77128744  70657936  53% /
tmpfs                                    16352724     84944  16267780   1% /dev/shm
tmpfs                                        5120         4      5116   1% /run/lock
tmpfs                                    16352724         0  16352724   0% /run/qemu
tmpfs                                     8388608     12840   8375768   1% /tmp
/dev/sdb1                               442369096 339364556  80460160  81% /mnt/vm
/dev/sda6                               256921924 210286140  33512200  87% /usr/local
/dev/sda5                               296136384 253681808  27338700  91% /home
/dev/sda7                               247139228 204590860  29921632  88% /opt
/dev/sda1                                  523248      6220    517028   2% /boot/efi
/dev/sda3                                  202752    157284     45468  78% /var/log
/dev/sda4                                  305152    127892    177260  42% /var/log/audit
tmpfs                                     3270544       224   3270320   1% /run/user/1602

Running dangerzone and watch shows only a modest increase in /dev/shm and /tmp (55 M and 13 M).

deeplow commented 10 months ago

Thanks for checking. Another dead-end. It's not a space issue either.

j75 commented 10 months ago

I tested the CLI version (dangerzone-cli --output-filename /tmp/test-safe.pdf /tmp/test.pdf) where the test document is https://s1.q4cdn.com/806093406/files/doc_downloads/test.pdf; in the journal I have

...
Nov 14 08:54:28 pamela podman[157531]: 2023-11-14 08:54:28.708554958 +0100 CET m=+2.452158506 container attach 1a16e5d2f5b5a16312df72dbe058ac70c6d5a2ab19d489d5729c65a316034a57 (image=dangerzone.rocks/dangerzone:latest, name=ecstatic_lamport)
Nov 14 08:54:28 pamela conmon[157808]: {"error": false, "text": "Calculating number of pages", "percentage": 3}
Nov 14 08:54:29 pamela conmon[157808]: {"error": false, "text": "Converting page 1/1 to pixels", "percentage": 48}
Nov 14 08:54:29 pamela conmon[157808]: {"error": true, "text": "[Errno 13] Permission denied: '/tmp/dangerzone/page-1.rgb'", "percentage": 48}
Nov 14 08:54:29 pamela conmon[157808]: Traceback (most recent call last):
Nov 14 08:54:29 pamela conmon[157808]:   File "<frozen runpy>", line 198, in _run_module_as_main
Nov 14 08:54:29 pamela conmon[157808]:   File "<frozen runpy>", line 88, in _run_code
Nov 14 08:54:29 pamela conmon[157808]:   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 424, in <module>
Nov 14 08:54:29 pamela conmon[157808]:     sys.exit(asyncio.run(main()))
Nov 14 08:54:29 pamela conmon[157808]:              ^^^^^^^^^^^^^^^^^^^
Nov 14 08:54:29 pamela conmon[157808]:   File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
Nov 14 08:54:29 pamela conmon[157808]:     return runner.run(main)
Nov 14 08:54:29 pamela conmon[157808]:            ^^^^^^^^^^^^^^^^
Nov 14 08:54:29 pamela conmon[157808]:   File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
Nov 14 08:54:29 pamela conmon[157808]:     return self._loop.run_until_complete(task)
Nov 14 08:54:29 pamela conmon[157808]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Nov 14 08:54:29 pamela conmon[157808]:   File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
Nov 14 08:54:29 pamela conmon[157808]:     return future.result()
Nov 14 08:54:29 pamela conmon[157808]:            ^^^^^^^^^^^^^^^
Nov 14 08:54:29 pamela conmon[157808]:   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 418, in main
Nov 14 08:54:29 pamela conmon[157808]:     with open("/tmp/dangerzone/captured_output.txt", "wb") as container_log:
Nov 14 08:54:29 pamela conmon[157808]:          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Nov 14 08:54:29 pamela conmon[157808]: PermissionError: [Errno 13] Permission denied: '/tmp/dangerzone/captured_output.txt'
Nov 14 08:54:29 pamela podman[157531]: 2023-11-14 08:54:29.149117141 +0100 CET m=+2.892720648 container died 1a16e5d2f5b5a16312df72dbe058ac70c6d5a2ab19d489d5729c65a316034a57 (image=dangerzone.rocks/dangerzone:latest, name=ecstatic_lamport)

j75 commented 10 months ago

Now I'm trying dangerzone on another computer (a laptop) and the messages are different:

Copying blob 9bb71642aacd done  
Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
[ERROR] Failed to install the container image

However, the graphical application still asks me to supply a document. Normally it shouldn't, because the podman image is not installed... but that's another issue...

deeplow commented 10 months ago

> Now I'm trying dangerzone on another computer (a laptop) and the messages are different:
>
> Copying blob 9bb71642aacd done
> Error: payload does not match any of the supported image formats (oci, oci-archive, dir, docker-archive)
> [ERROR] Failed to install the container image
>
> However, the graphical application still asks me to supply a document. Normally it shouldn't, because the podman image is not installed... but that's another issue...

I'd say this is a different issue. Likely caused by an almost full disk. We can discuss that further here: https://github.com/freedomofpress/dangerzone/issues/193#issuecomment-1239052750

j75 commented 10 months ago

Strange... I have over 100G free in my /home partition, and normally podman creates the image in the ~/.local/share/containers/storage/

deeplow commented 10 months ago

Just installed Ubuntu 22.04 and ran Dangerzone successfully on a 66 page document and a 4 page one without running into issues. So unfortunately I was unable to reproduce this issue. Without it I don't know what else I can use to find out what the root cause could be.

apyrgio commented 10 months ago

Sigh, how did I miss it:

[INFO] > /usr/bin/podman run [...] -v /tmp/user/1602/tmppynjffwh/pixels:/tmp/dangerzone:Z [...]

This command tells us two things:

  1. When Dangerzone requested a temporary directory, it got back a per-user temporary directory (/tmp/user/...).
  2. The UID is a bit unexpected (1602 vs 100{0,1}).

I'd check the following next:

  1. Does /etc/fstab have an entry for /tmp? Do you have pam-tmpdir installed? Is this intentional?

  2. What are the permissions of the /tmp/user/1602 directory? What is your current user?

     stat /tmp/user/1602
     id

  3. What's the value of the TMPDIR envvar?

     echo $TMPDIR

  4. Finally, does Dangerzone work if you invoke it with TMPDIR= dangerzone ?
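
For convenience, the checks above can be gathered in one go. This is a minimal Python sketch, assuming Dangerzone picks its temporary directory the way Python's tempfile module does (an assumption, not something verified in this thread); describe_tmpdir is a hypothetical helper name.

```python
import os
import stat
import tempfile


def describe_tmpdir() -> dict:
    """Collect the temp directory in use, its permissions, and the current user."""
    tmpdir = tempfile.gettempdir()  # honors TMPDIR, TEMP and TMP, in that order
    info = os.stat(tmpdir)
    return {
        "TMPDIR": os.environ.get("TMPDIR"),
        "tmpdir": tmpdir,
        "mode": stat.filemode(info.st_mode),  # e.g. a string like "drwx------"
        "owner_uid": info.st_uid,
        "owner_gid": info.st_gid,
        "current_uid": os.getuid(),
    }


for key, value in describe_tmpdir().items():
    print(f"{key}: {value}")
```

If owner_uid and current_uid differ, or the mode string shows no write permission for the owner, that alone would explain a permission error.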

@deeplow Can you check it as well in your Ubuntu 22.04 environment?

j75 commented 10 months ago

Yes, my user id is 1602 hence that value. And my "temp" variables are

TEMPDIR=/tmp/user/1602
TMPDIR=/tmp/user/1602
TEMP=/tmp/user/1602
TMP=/tmp/user/1602

I am the owner of the /tmp/user/1602 folder but the group is not mine but root. /tmp has an entry in /etc/fstab and the libpam-tmpdir package is installed on my system.

j75 commented 10 months ago

I tried dangerzone on another computer (my laptop, also on Ubuntu 22.04.3), with TMPDIR pointing to a folder with enough space; the messages are similar to those on my desktop computer:

Loaded image(s): dangerzone.rocks/dangerzone:latest
[INFO] Container image installed
[INFO] Assigning ID '9QB7vA' to doc '/tmp/test.pdf'
[DEBUG] Removing all documents
[DEBUG] Marking doc 9QB7vA as 'converting'
[INFO] > /usr/bin/podman run --network none -u dangerzone --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -v /home/marian/tmp/dangerzone/tmp9od0r_s_/unsafe/input_file:/tmp/input_file:Z -v /home/marian/tmp/dangerzone/tmp9od0r_s_/pixels:/tmp/dangerzone:Z -e ENABLE_TIMEOUTS=1 dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels
[INFO] [doc 9QB7vA] 3% UNTRUSTED> Calculating number of pages
[INFO] [doc 9QB7vA] 48% UNTRUSTED> Converting page 1/1 to pixels
[ERROR] [doc 9QB7vA] 48% UNTRUSTED> [Errno 13] Permission denied: '/tmp/dangerzone/page-1.rgb'
[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED> Traceback (most recent call last):

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "<frozen runpy>", line 198, in _run_module_as_main

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "<frozen runpy>", line 88, in _run_code

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 424, in <module>

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>     sys.exit(asyncio.run(main()))

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>              ^^^^^^^^^^^^^^^^^^^

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>     return runner.run(main)

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^^

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>     return self._loop.run_until_complete(task)

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>     return future.result()

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>            ^^^^^^^^^^^^^^^

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>   File "/opt/dangerzone/dangerzone/conversion/doc_to_pixels.py", line 418, in main

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>     with open("/tmp/dangerzone/captured_output.txt", "wb") as container_log:

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED>          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[ERROR] [doc 9QB7vA] -1% Invalid JSON returned from container:

    UNTRUSTED> PermissionError: [Errno 13] Permission denied: '/tmp/dangerzone/captured_output.txt'

[ERROR] documents-to-pixels failed
[ERROR] An exception occurred while converting document '9QB7vA'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 39, in convert
    success = self._convert(document, ocr_lang)
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/container.py", line 257, in _convert
    return self._convert_with_tmpdirs(
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/container.py", line 311, in _convert_with_tmpdirs
    raise exception_from_error_code(ret)  # type: ignore [misc]
  File "/usr/lib/python3/dist-packages/dangerzone/conversion/errors.py", line 126, in exception_from_error_code
    raise ValueError(f"Unknown error code '{error_code}'")
ValueError: Unknown error code '1'
[ERROR] [doc 9QB7vA] 0% Unknown error code '1'
[DEBUG] Marking doc 9QB7vA as 'failed'

I presume that /tmp/dangerzone is in the podman's container.

j75 commented 10 months ago

However, as far as I understand the error messages are in the podman's container, not on my system! So it should not matter what is as long as podman works, no? Because when I simply test podman it seems to work on my computer:

% podman run quay.io/podman/hello
!... Hello Podman World ...!

         .--"--.           
       / -     - \         
      / (O)   (O) \        
   ~~~| -=(,Y,)=- |         
    .---. /`  \   |~~      
 ~/  o  o \~~~~.----. ~~   
  | =(X)= |~  / (O (O) \   
   ~~~~~~~  ~| =(Y_)=-  |   
  ~~~~    ~~~|   U      |~~ 

Project:   https://github.com/containers/podman
Website:   https://podman.io
Documents: https://docs.podman.io
Twitter:   @Podman_io

j75 commented 10 months ago

My podman version is 3.4.4

deeplow commented 10 months ago

> @deeplow Can you check it as well in your Ubuntu 22.04 environment?

I did so and it worked. By default it used /tmp/tmp<random_hex>/ and after installing libpam-tmpdir it ran under /tmp/user/1000/tmp<random_hex> and in a second user under /tmp/user/1001/tmp<random_hex>. In both cases the conversion was successful.

j75 commented 10 months ago

What about using another user ID (different from 1000)? As far as I understand, podman mounts a volume /tmp/user/1602/<random>/unsafe/input_file:/tmp/input_file:Z, the user being dangerzone. (Could we mount a file instead of a folder? What does the Z parameter mean?)

j75 commented 10 months ago

I managed to reproduce the problem by running a simple script that executes

podman run --network none -u dangerzone --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -v ./mytmp/input_file:/tmp/input_file:Z dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels

with the same errors. But the errors disappear if I create the mytmp/pixels folder with drwxrwxrwx mode! So I think the problem is only due to the ownership of the output folder (in which I find the following files: captured_output.txt, page-1.height, page-1.rgb, page-1.width, all belonging to user/group 166536:166536).

apyrgio commented 10 months ago

Hey @j75, sorry for the delay. I see you've made a lot of progress in this issue, nice! So, we have a workaround, but it has questionable security implications.

First of all, I'll reply to your comments to clarify some things, and then I'll suggest some next steps:

> Yes, my user id is 1602 [...] I am the owner of the /tmp/user/1602 folder but the group is not mine but root. /tmp has an entry in /etc/fstab and the libpam-tmpdir package is installed on my system.

Great, thanks for the info. In theory, the fact that this directory is owned by you should still make Dangerzone work.

> I tried dangerzone on another computer (my laptop, also on Ubuntu 22.04.3), with TMPDIR pointing to a folder with enough space; the messages are similar to those on my desktop computer:

Ok, let's shelve the "out of space" theory then.

> However, as far as I understand the error messages are in the podman's container, not on my system! So it should not matter what is as long as podman works, no?

Kind of. The error messages do come from Podman, but the problem is the mounted directory in the container, which is part of the host. But we'll see it later on.

> What about using another user ID (different from 1000)?

Bingo, that's the problem. Here's the thing: if you check the Podman invocation, there's a --userns keep-id argument. This argument maps your user ID on the host (here 1602) to the exact same user ID in the container (also 1602). Why do we use this argument? Because we want the user in the container to be able to write to the mounted temp dir (the /tmp/dangerzone dir that you're seeing in the error logs).

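So what's the issue here? The dangerzone user in the container image has user ID 1000, not 1602. So in your case the process in the container runs as a user that --userns keep-id does not map to your host user, and it cannot write to the directory you own.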
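
To illustrate, here is a rough Python sketch of how a container UID ends up as a host UID under rootless --userns keep-id. It rests on two assumptions that are not stated in this thread: that the host user's /etc/subuid range starts at 165536, and that Podman lays out the mapping as described in the comments. The helper host_uid_for is hypothetical.

```python
def host_uid_for(container_uid: int, host_uid: int, subuid_start: int,
                 subuid_count: int = 65536) -> int:
    """Return the host UID that owns files written by `container_uid`.

    Assumed layout for rootless keep-id: the invoking user keeps their own
    UID inside the container, and every other container UID is backed by an
    entry of the user's subordinate UID range.
    """
    if container_uid == host_uid:
        return host_uid  # keep-id: same UID inside and outside
    # Container UIDs below the host UID are shifted up by one to make room
    # for the slot reserved for the invoking user.
    slot = container_uid + 1 if container_uid < host_uid else container_uid
    if not 1 <= slot <= subuid_count:
        raise ValueError("container UID falls outside the mapped range")
    return subuid_start + slot - 1


# Host user 1602, container process running as dangerzone (UID 1000).
print(host_uid_for(1000, 1602, subuid_start=165536))
# Host user 1000: the container user and the host user coincide.
print(host_uid_for(1000, 1000, subuid_start=165536))
```

Under these assumptions the first call gives 166536, which matches the 166536:166536 owner reported for the output files, and the second gives 1000, which would explain why the conversion works for users with UID 1000.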


So, where do we go from here? Unfortunately, this is something that needs to be solved at the Dangerzone level. It can be solved as a side effect of fixing #443, which will require some redesign in the application, so it may take a while to land.

In the meantime you can use the drwxrwxrwx workaround, ideally with a throwaway temp dir just for Dangerzone. I know that it's not ideal, but I guess it's the best option for now.

j75 commented 10 months ago

you can use the drwxrwxrwx workaround - what the heck is this workaround actually? If you are talking about my message about running podman manually, how could I then recover the PDF? (even converted to a series of images). According to file command, page-1.rgb is ISO-8859 text, with very long lines (65536), with no line terminators.

apyrgio commented 10 months ago

Oh I'm sorry, I should probably elaborate. The workaround is to let Dangerzone create world-writeable directories (drwxrwxrwx), which it can then mount to the container. As I mentioned before, I can't attest to the security of this approach, especially if unprivileged malicious processes run in the same environment.

For proof-of-concept reasons, you can use the following script:

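#!/bin/bash

# Remember the caller's umask and TMPDIR so they can be restored afterwards.
ORIG_UMASK=$(umask)
ORIG_TMPDIR=$TMPDIR

# Create a private (drwx------) parent directory for Dangerzone's temp files.
umask 0077
TMPDIR=$(mktemp -d)
export TMPDIR
# Let Dangerzone create world-writable (drwxrwxrwx) directories inside it.
umask 0000
echo "Dangerzone will create world-writable dirs in $TMPDIR"
dangerzone
# Quote the path so that spaces or an empty value cannot widen the delete.
rm -rf "$TMPDIR"

umask "$ORIG_UMASK"
export TMPDIR="$ORIG_TMPDIR"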

It will basically create a directory that is only accessible by your user (umask 0077 => drwx------) and then let Dangerzone create world-writeable directories in it (umask 0000 => drwxrwxrwx). It then invokes Dangerzone, and cleans up the temporary directory afterwards.
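The effect of the two umask values can be checked with a small Python sketch. This is only an illustration under stated assumptions: `demo_umask_modes` is a hypothetical helper, and `os.mkdir` stands in for however Dangerzone actually creates its subdirectories. The `pixels` name is borrowed from the Podman command in the original report.

```python
import os
import shutil
import tempfile


def demo_umask_modes():
    """Return (parent_mode, child_mode) for dirs created under each umask."""
    original = os.umask(0o077)  # same as `umask 0077` in the script
    parent = None
    try:
        parent = tempfile.mkdtemp()             # private parent directory
        os.umask(0o000)                         # same as `umask 0000`
        child = os.path.join(parent, "pixels")
        os.mkdir(child)                         # default mode 0o777 minus umask
        parent_mode = os.stat(parent).st_mode & 0o777
        child_mode = os.stat(child).st_mode & 0o777
        return parent_mode, child_mode
    finally:
        os.umask(original)                      # always restore the umask
        if parent is not None:
            shutil.rmtree(parent, ignore_errors=True)


parent_mode, child_mode = demo_umask_modes()
print(oct(parent_mode), oct(child_mode))  # 0o700 0o777 on a POSIX system
```

The parent stays private to your user, and only the directories created after the second umask call end up world-writable.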

I should stress again that this is a hacky script, so please don't use it for anything other than evaluating Dangerzone :slightly_smiling_face:

j75 commented 10 months ago

Yes, your script solves my problem (and I used sh not bash as the first is POSIX compatible, if I'm not wrong).

A small remark (should it be another issue?): the original document (https://s1.q4cdn.com/806093406/files/doc_downloads/test.pdf) has Page size: 612 x 792 pts (letter) the test-safe.pdf document has Page size: 1275 x 1650 pts

deeplow commented 10 months ago

> Yes, your script solves my problem (and I used sh not bash as the first is POSIX compatible, if I'm not wrong).

Good to hear!

> A small remark (should it be another issue?): the original document (https://s1.q4cdn.com/806093406/files/doc_downloads/test.pdf) has Page size: 612 x 792 pts (letter) the test-safe.pdf document has Page size: 1275 x 1650 pts

I had just come across that myself independently not even 20 minutes ago (what a coincidence). I opened issue #626 to address it.
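One observation on these numbers, offered as an inference rather than a confirmed diagnosis: they match what you get if a letter page is rasterized at 150 DPI and the resulting pixel dimensions are then used as the page size in points. The 150 DPI value is an assumption chosen because it fits the numbers; it is not stated in this thread.

```python
POINTS_PER_INCH = 72  # PDF points per inch
ASSUMED_DPI = 150     # assumption: inferred from the numbers, not confirmed


def points_to_pixels(points, dpi=ASSUMED_DPI):
    """Pixel length of a dimension given in points, rendered at `dpi`."""
    return round(points * dpi / POINTS_PER_INCH)


# Letter page from the original document: 612 x 792 pts.
print(points_to_pixels(612), points_to_pixels(792))  # 1275 1650
```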

apyrgio commented 10 months ago

Good to know that we have a workaround for your case @j75. I say we put this issue on hold and actually resolve it once #443 is resolved.

As for the page size issue you mentioned, thanks for the report, we'll tackle it independently :slightly_smiling_face:

nealrauhauser commented 6 months ago

I have just started trying to use Dangerzone and I think I'm running into this problem as well.

My system is Ubuntu Budgie 22.04.4, podman 3.4.4 is installed, and all my storage is ZFS. When running as an unprivileged user it reports:

Error: 'overlay' is not supported over zfs, a mount_program is required: backing file system is unsupported for this graph driver

Full messages here:

https://gist.github.com/nealrauhauser/142cee89d9615152820f78dbc73c5e8b

If I run dangerzone-cli as root the error report offers much less context.

ERROR pixels-to-pdf failed
ERROR [doc NM8lrC] 0% Unknown error code '1'

https://gist.github.com/nealrauhauser/e8a3648459393766ee42007c1463cd9d

I have a laptop that's using ext4 and I get similar failures.

I'm willing to put some time into resolving this, but could really use some hints on where to start.

apyrgio commented 6 months ago

The issue you're encountering seems to be that rootless Podman (which is the underlying container runtime that Dangerzone uses on Linux) cannot run on ZFS. I don't have a ZFS system, unfortunately, but there are several articles on the internet that offer some solutions for the issue (e.g., https://blog.dest-unreach.be/2024/01/03/podman-on-zfs/).
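If it helps to confirm which filesystem backs Podman's storage, a rough sketch along these lines could be used. Everything here is illustrative: `fs_type_for` is a hypothetical helper, the pool names in the sample are made up, and the path is the usual default location for rootless Podman storage, which may differ on your system. Mount points containing spaces are not handled.

```python
def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` uses the /proc/self/mounts layout: device, mount point,
    filesystem type, options, and two numeric fields per line.
    """
    target = path.rstrip("/") + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Prefer the longest matching mount point; later entries win ties.
        if target.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Made-up sample in /proc/self/mounts format.
SAMPLE = """\
rpool/ROOT/ubuntu / zfs rw 0 0
rpool/USERDATA/neal /home/neal zfs rw 0 0
tmpfs /tmp tmpfs rw 0 0
"""

print(fs_type_for("/home/neal/.local/share/containers/storage", SAMPLE))  # zfs
print(fs_type_for("/tmp/dangerzone", SAMPLE))                             # tmpfs
```

On a real Linux system, the second argument would be the contents of /proc/self/mounts.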

I was about to move the above comment to a separate issue ("Cannot run Dangerzone on ZFS systems"), but then this line threw me off:

> I have a laptop that's using ext4 and I get similar failures.

If you can give us some logs from that system, it would help.