goldmann / docker-squash

Docker image squashing tool
MIT License
848 stars 109 forks source link

Do not delete temporary directory #169

Closed firesharker closed 9 months ago

firesharker commented 6 years ago

I was short on space on my system disc, so I specified a temporary directory on an other drive.

I thought docker-squash will do it's business inside and remove every file it created. I learned the hard way that this is not the case and successfully managed to recover most of the contents of my main directory with 'testdisk' after 'docker-squash' deleted the whole thing with the following logs:

docker-squash image:tag -f hash -t image:tag --tmp-dir /path/to/an/existing/directory/
2018-04-19 15:30:23,505 root         INFO     docker-squash version 1.0.6, Docker 0520e24, API 1.37...
2018-04-19 15:30:23,505 root         INFO     Using v2 image format
2018-04-19 15:30:23,505 root         ERROR    Preparing temporary directory failed
2018-04-19 15:30:23,505 root         ERROR    Execution failed, consult logs above. If you think this is our fault, please file an issue: https://github.com/goldmann/docker-squash/issues, thanks!

I think if a temporary directory specified by the user exists, docker-squash should create a temporary directory inside or fail without deleting it.

I am willing to submit a pull request if you decide which path to take.

goldmann commented 6 years ago

OK, let's go through it.

If you do not specify the --tmp-dir, default a temporary directory will be created on the host, most probably as /tmp/docker-squash-RANDOM_STRING. This directory will be removed after squashing is finished.

Same applies to directory provided with the --tmp-dir parameter - this directory will be removed once squashing is finished.

In case you want to debug failure, you can leave the temporary directory for investigation, if you use the -d flag.

From what I understand you specified the --tmp-dir to an existing directory in which you had some files you cared about. Please note that name of this flag is "temporary directory" which implies that this is a short living directory. I would say that the only thing we should do is to extend the CLI --help page to mention that it will be removed after squashing is done.

firesharker commented 6 years ago

I see your point. If we leave everything unchanged, I would definitely mention this behavior in the --help page. However usually (correct me if I am wrong), if a program asks for a temporary directory and you give '/var/tmp' they won't delete it. More likely they will create a subdirectory to work in, or put files directly into the specified directory, then remove everything cerated by them upon exit. I think this is the expected behavior unless clearly stated otherwise.

goldmann commented 3 years ago

It would be good to update the --help output. I would not want to change the behavior and the reason is that temporary directory is internal to how the program works. In fact the output that is there is not that meaningful for successful builds, it would just waste (sometimes a lot of!) space.

drjasonharrison commented 2 years ago

I definitely expected that if I specified --tmp-dir /tmp that /tmp would not be deleted when the process finished. Perhaps a check to see if the --tmp-dir is empty before starting and giving a warning or exiting.

In my case /tmp was too small (or full) and on the second run I specified my /data partition. Which had data that was "just" a local cache of stuff I could copy again. But definitely not the expected behavior.

pip can use TMP_DIR but it doesn't delete it afterwards.

matthewgream commented 1 year ago

This is a critical issue, and needs to be fixed. I see it's still here 4-5 years later !

I also had the problem that I started with reference to an existing tmp dir ("/disks/dev/tmp") which I'm using for other purposes -- and after it complained about "Preparing temporary directory failed", I found my tmp dir to be wiped, including everything that was already in it.

You could blame user error (since the help does say directory to be "created and used"), but I think this is fundamentally a bad design: don't ever wipe out something you didn't create!

goldmann commented 9 months ago

I'm changing my mind! Let's NOT delete the directory and only create it if it doesn't exist.

goldmann commented 9 months ago

And one more thing: sorry everyone for the inconvenience this could cause!

rnc commented 9 months ago

I've traced this through and what is happening is while the tmp_dir flag does have sufficient checks ( https://github.com/goldmann/docker-squash/blob/main/docker_squash/image.py#L303 ) it conflicts with the development flag which, if not set will cleanup the directory regardless ( https://github.com/goldmann/docker-squash/blob/main/docker_squash/squash.py#L98 ).

I think an approach would be to merge these two flags - if tmp_dir is set, treat it as development, and don't perform any cleanup. This would simplify the UI and also avoid this issue.