qarmin / czkawka

Multi functional app to find duplicates, empty folders, similar images etc.

Czkawka deletes duplicate files without warning or asking for permissions. [EXTREMELY DANGEROUS] #46

Closed she3o closed 8 months ago

she3o commented 4 years ago

When I run $ czkawka_cli dup -d $PWD, czkawka finds duplicate files and deletes them without asking for permission, giving a warning, or showing in its output that files were deleted:

$ dd if=/dev/urandom of=file-1 bs=64M count=16
$ cp file-1 file-2
$ czkawka_cli dup -d $PWD
    Found 2 duplicated files in 1 groups with same content which took 512.00 MiB:
    Size - 512.00 MiB (536870896) - 2 files
    /home/she3sha3y/dups/file-1
    /home/she3sha3y/dups/file-2
----

czkawka deleted file-2 without saying so in the output. The help text does not say that this command deletes files:

$ czkawka_cli dup -h
   czkawka_cli-dup 1.0.0
   Finds duplicate files

This is very, VERY dangerous. If I run this from my home directory, or worse, as root from the root directory, it could break the system and I would not even know. Many language libraries contain duplicate files, e.g. when they use the same package manager, or Python with __init__.py. I could also have two binaries with the same content but different filenames. Breaking a system is as easy as (DON'T TRY THIS!) sudo czkawka_cli dup -d /.

IMO the command should just print out the dups, and deletion should require an explicitly typed option (and happen only after asking for permission). Preferably it should also refuse to run at all as the root user or in directories like /usr/, /etc/ and /bin/, and exit gracefully with a message such as "Can't run this command as root" or "Can't run this command in /bin/ as it could potentially break your system".

This should be fixed urgently! Thank you in advance. 😀
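
The refusal rule proposed above can be sketched in a few lines. This is only an illustration of the idea, not czkawka's code (czkawka is written in Rust): the function name refusal_reason and the PROTECTED_DIRS list are hypothetical, and the check is purely lexical, so it assumes an absolute path and does not follow symlinks.

```python
import posixpath

# Hypothetical deny-list: the directories named in the comment above.
PROTECTED_DIRS = ("/usr", "/etc", "/bin")


def refusal_reason(target, euid):
    """Return a refusal message, or None if the run may proceed.

    `target` is assumed to be an absolute POSIX path and `euid` the
    effective user id of the caller (0 means root).
    """
    if euid == 0:
        return "Can't run this command as root"
    # Lexical normalization only: collapses "..", "." and trailing slashes,
    # but does NOT resolve symlinks. A real tool would likely need more.
    path = posixpath.normpath(target)
    for protected in PROTECTED_DIRS:
        if path == protected or path.startswith(protected + "/"):
            return (f"Can't run this command in {protected}/ "
                    "as it could potentially break your system")
    return None
```

On a Unix system a caller might pass os.getcwd() and os.geteuid(), print the returned message and exit with a non-zero status whenever the result is not None.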

qarmin commented 4 years ago

I think that preventing users from running this tool with superuser privileges is a bad idea, since they may want to search for duplicates in places such as /home, which a normal user doesn't have enough privileges to read.

Since Czkawka provides a GUI frontend, the CLI will mostly be used by more advanced users, who will know that using delete flags removes files.

I think an interactive mode should be added and used by default, and the -d flag should be usable only together with the -q (quiet) flag.

she3o commented 4 years ago

I was thinking of yay, which refuses to run as a superuser, but interactive mode sounds better.

Some advanced users might want czkawka to only print out dups by default (with -q and without -D). That would be the most unixy behavior: they could then pipe (|) or redirect (>) stdout.
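
The behavior discussed in this exchange (print by default, delete only on explicit request, ask before each deletion) might look roughly like the sketch below. It is an assumption-laden illustration, not how czkawka is implemented: find_duplicates, handle_duplicates and ask_on_terminal are hypothetical names, and files are compared by a SHA-256 hash of their full content.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(directory):
    """Group regular files under `directory` by the SHA-256 of their content."""
    by_hash = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and not path.is_symlink():
            # Reads each file fully into memory; fine for a sketch,
            # a real tool would hash in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [group for group in by_hash.values() if len(group) > 1]


def ask_on_terminal(path):
    """Interactive confirmation; anything other than 'y' means no."""
    return input(f"Delete {path}? [y/N] ").strip().lower() == "y"


def handle_duplicates(groups, delete=False, confirm=ask_on_terminal):
    """Print every duplicate path to stdout; delete only when delete=True.

    The first file of each group is always kept. Returns the deleted paths.
    """
    deleted = []
    for group in groups:
        for path in group:
            print(path)
        if not delete:
            continue  # default: report only, touch nothing
        for path in group[1:]:
            if confirm(path):
                path.unlink()
                deleted.append(path)
    return deleted
```

With the defaults, handle_duplicates(find_duplicates(".")) only prints paths, so the output can be piped or redirected. A non-interactive "quiet" deletion would have to be requested twice over, with delete=True and confirm=lambda p: True.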

blob79 commented 3 years ago

The situation is a bit better now. With https://github.com/qarmin/czkawka/pull/277 we have dryrun.

I didn't see the unix feature request in this ticket when I cut https://github.com/qarmin/czkawka/issues/258.

wwalker commented 1 year ago

@she3o is still correct. The help says:

$ czkawka_cli dup -h
Finds duplicate files
^^^^^

"Finds" is VERY different from "Deletes by default".

The help message really should be changed. It should at least say: Finds and deletes duplicate files

And in the main help, the word delete is not used even once, although that is its default behavior. Essentially czkawka is a very powerful rm command. The primary description of rm is:

       rm - remove files or directories

There is no mention of remove or delete in the czkawka_cli help messages.

USAGE:
    czkawka_cli <COMMAND> [SCFLAGS] [SCOPTIONS]

OPTIONS:
  -h, --help     Print help
  -V, --version  Print version

SUBCOMMANDS:
  dup            Finds duplicate files
  empty-folders  Finds empty folders
  big            Finds big files
  empty-files    Finds empty files
  temp           Finds temporary files
  image          Finds similar images
  music          Finds same music by tags
  symlinks       Finds invalid symlinks
  broken         Finds broken files
  video          Finds similar video files
  ext            Finds files with invalid extensions
  tester         Small utility to test supported speed of
  help           Print this message or the help of the given subcommand(s)

SkyWriter commented 11 months ago

I am not sure that's still the case. Here's a reproduction of the original issue:

➜  TestTmp czkawka -h
czkawka 6.1.0
...
➜  TestTmp czkawka dup -h 
Finds duplicate files

Usage: czkawka dup [OPTIONS] --directories <DIRECTORIES>
...
  -D, --delete-method <DELETE_METHOD>
          Delete method (AEN, AEO, ON, OO, HARD) [default: NONE]
...
➜  TestTmp  dd if=/dev/urandom of=file-1 bs=64M count=16
16+0 records in                                                                                                                         
16+0 records out                 
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 3.40453 s, 315 MB/s
➜  TestTmp cp file-1 file-2                        
➜  TestTmp czkawka dup -d $PWD  
Results of searching ["/tank/enclave/TestTmp"] with excluded directories [] and excluded items []
-------------------------------------------------Files with same hashes-------------------------------------------------
Found 1 duplicated files which in 1 groups which takes 1 GiB.

---- Size 1 GiB (1073741824) - 2 files       
/tank/enclave/TestTmp/file-2                                   
/tank/enclave/TestTmp/file-1                               
-------------------------------MESSAGES--------------------------------
Properly saved to file 2 cache entries.        
Properly saved to file 2 cache entries.
---------------------------END OF MESSAGES-----------------------------

➜  TestTmp ls -la                    
total 2099106                                                   
drwxrwxr-x  2 sky sky            4 Nov 30 19:10 .
drwxrwxr-x 14 sky users         18 Nov 30 19:01 ..         
-rw-rw-r--  1 sky sky   1073741824 Nov 30 19:10 file-1
-rw-rw-r--  1 sky sky   1073741824 Nov 30 19:10 file-2 

Not deleting seems to be a clear default, and my experiment confirmed the behavior.
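
The manual check above (create two identical files, run the tool, list the directory) could be automated along these lines. This is a minimal sketch: surviving_files is a hypothetical helper, it assumes the command accepts the directory as its last argument, and the 1 MiB default size is a guess chosen in case the tool ignores very small files.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def surviving_files(command, size=1024 * 1024):
    """Run `command` on a fresh directory holding two identical files.

    Creates file-1 and file-2 with the same random content, appends the
    directory path to `command`, runs it, and returns the sorted names of
    the files that are still present afterwards.
    """
    with tempfile.TemporaryDirectory() as workdir:
        data = os.urandom(size)
        for name in ("file-1", "file-2"):
            Path(workdir, name).write_bytes(data)
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run([*command, workdir], check=True,
                       stdout=subprocess.DEVNULL)
        return sorted(p.name for p in Path(workdir).iterdir())
```

For the invocation used in this thread, surviving_files(["czkawka_cli", "dup", "-d"]) returning both file-1 and file-2 would match the result reported above. Any other outcome would mean the command deleted something or added files of its own.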