freedomofpress / dangerzone

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs
https://dangerzone.rocks/
GNU Affero General Public License v3.0
3.58k stars 168 forks source link

Converting a file from symlink while archiving the unsafe version results in the symlink (not original document) being archived #765

Open naglis opened 6 months ago

naglis commented 6 months ago

Tested on dangerzone 0.6.0 on Arch Linux. Steps to reproduce via CLI (AFAICT the outcome is the same if converting via GUI):

mkdir -p test_symlink/subdir
cd test_symlink
curl -sSL 'https://raw.githubusercontent.com/freedomofpress/dangerzone/main/tests/test_docs/sample-pdf.pdf' -o subdir/sample.pdf
ln -s subdir/sample.pdf
dangerzone-cli --archive sample.pdf
file unsafe/sample.pdf

The output from the file command on the last line:

unsafe/sample.pdf: broken symbolic link to subdir/sample.pdf

The symlink that was passed to dangerzone is archived, the original "unsafe" file at test_symlink/subdir/sample.pdf still exists. Since the symlink is renamed (moved), if the symlink was relative (i.e. it did not point to an absolute path) after archiving (moving) the symlink becomes broken. Since the intention of archival is to move the original potentially unsafe document to the unsafe directory and prevent the user from mistakenly opening it, the goal is not achieved.

I am not sure how probable it is that the user would make links to documents and then pass them to dangerzone, but I think the case should at least be known/considered. I also am not sure what the correct behavior in such case should be:

  1. The original file (the one that the symlink points to) should be archived. If so, what should happen to the symlink that was passed in as the path, since it would still exist and point to a non-existent file? Perhaps it is ok to leave it dangling?
  2. Do not allow to pass symlinks.
  3. Do not archive symlinks.
  4. Instead of renaming (moving) it, a correct symlink should be archived.
  5. Other?

3 and 4 do not achieve the goal of "marking" the original document as unsafe since it would still exist at the original path, so I do not think they are good solutions.

apyrgio commented 5 months ago

That's an interesting case. We can't move the file the symlink points to, since another symlink may point to that as well. On the other hand, it doesn't make sense to move just the symlink, as this is brittle (as you correctly pointed out).

What we should probably do here is:

  1. Resolve the symlink, and find the original file.
  2. Copy (but not move) the original file under unsafe/
  3. Remove the symlink.

This way, it's less easy for the user to accidentally click on the original file (that's the spirit of the archiving feature), and we are sure that they can recover the original file from the unsafe/ directory, even if they move stuff around (also very important).