pyinfra-dev / pyinfra

pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
https://pyinfra.com
MIT License

Make get_fact(FindFiles) more versatile #1174

Open JakkuSakura opened 2 months ago

JakkuSakura commented 2 months ago

Is your feature request related to a problem? Please describe

Many times, I want to filter for certain kinds of files, yet find can only show whether a file exists. This could also become a performance issue when the number of files is large.

Describe the solution you'd like

I would like to add a few arguments to Find.

Below are some arguments that may be useful, copied from man find. It should also be possible to pass custom arguments to get_fact(Find).

Pay special attention to -size n[ckMGTP]. The scaled forms such as 1k/M/G/T/P have a pitfall and can behave like "all non-empty files". To work around this, one should always give the size in bytes, i.e. n followed by c (chars).
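
The byte-count workaround could be sketched as a small helper like the one below. This is only an illustration, not part of pyinfra: the name `size_in_bytes` is hypothetical, and it assumes the 1024-based multipliers listed in the man page excerpt in this post. It rewrites a size such as `+10M` into an explicit `c` (bytes) form and rejects anything without a unit suffix.

```python
import re

# Multipliers for the scale suffixes listed in the find man page excerpt.
_UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}

# Optional +/- prefix, a number, and a mandatory unit suffix.
_SIZE_RE = re.compile(r"([+-]?)(\d+)([ckMGTP])")


def size_in_bytes(spec: str) -> str:
    """Rewrite a find size such as '+10M' as an explicit byte count ('+10485760c').

    Hypothetical helper: a bare number (512-byte blocks) is rejected so that
    the caller always states the unit explicitly.
    """
    match = _SIZE_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"unsupported size spec: {spec!r}")
    sign, number, unit = match.groups()
    multiplier = 1 if unit == "c" else _UNITS[unit]
    return f"{sign}{int(number) * multiplier}c"
```

For example, `size_in_bytes("+10k")` gives `"+10240c"`, which can then be passed to `-size` unchanged.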

     -depth  Always true; same as the non-portable -d option.  Cause find to perform a
             depth-first traversal, i.e., directories are visited in post-order and all
             entries in a directory will be acted on before the directory itself.  By
             default, find visits directories in pre-order, i.e., before their contents.
             Note, the default is not a breadth-first traversal.

             The -depth primary can be useful when find is used with cpio(1) to process
             files that are contained in directories with unusual permissions.  It
             ensures that you have write permission while you are placing files in a
             directory, then sets the directory's permissions as the last thing.

     -depth n
             True if the depth of the file relative to the starting point of the
             traversal is n.

     -empty  True if the current file or directory is empty.

     -fstype type
             True if the file is contained in a file system of type type.  The lsvfs(1)
             command can be used to find out the types of file systems that are
             available on the system.  In addition, there are two pseudo-types, “local”
             and “rdonly”.  The former matches any file system physically mounted on the
             system where the find is being executed and the latter matches any file
             system which is mounted read-only.

     -gid gname
             The same thing as -group gname for compatibility with GNU find.  GNU find
             imposes a restriction that gname is numeric, while find does not.

     -group gname
             True if the file belongs to the group gname.  If gname is numeric and there
             is no such group name, then gname is treated as a group ID.

     -ignore_readdir_race
             Ignore errors because a file or a directory is deleted after reading the
             name from a directory.  This option does not affect errors occurring on
             starting points.

     -ilname pattern
             Like -lname, but the match is case insensitive.  This is a GNU find
             extension.

     -iname pattern
             Like -name, but the match is case insensitive.

     -name pattern
             True if the last component of the pathname being examined matches pattern.
             Special shell pattern matching characters (“[”, “]”, “*”, and “?”) may be
             used as part of pattern.  These characters may be matched explicitly by
             escaping them with a backslash (“\”).

     -regex pattern
             True if the whole path of the file matches pattern using regular
             expression.  To match a file named “./foo/xyzzy”, you can use the regular
             expression “.*/[xyz]*” or “.*/foo/.*”, but not “xyzzy” or “/foo/”.

     -samefile name
             True if the file is a hard link to name.  If the command option -L is
             specified, it is also true if the file is a symbolic link and points to
             name.

     -size n[ckMGTP]
             True if the file's size, rounded up, in 512-byte blocks is n.  If n is
             followed by a c, then the primary is true if the file's size is n bytes
             (characters).  Similarly if n is followed by a scale indicator then the
             file's size is compared to n scaled as:

             k       kilobytes (1024 bytes)
             M       megabytes (1024 kilobytes)
             G       gigabytes (1024 megabytes)
             T       terabytes (1024 gigabytes)
             P       petabytes (1024 terabytes)

Fizzadar commented 1 month ago

Hi @JakkuSakura - I assume you mean the FindFiles or similar facts? Extending the arguments they take to allow specifying any find flags should be pretty simple. Source: https://github.com/pyinfra-dev/pyinfra/blob/aad6c3b84afc56f40c36b9e53dd277ce3c51b916/pyinfra/facts/files.py#L325-L330

JakkuSakura commented 1 month ago

Yeah, I mean the FindFiles facts. If my proposal sounds plausible, I'd be happy to implement it, supporting arbitrary parameters plus footgun-prevention mechanisms.
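
As a rough sketch of what "any parameters plus footgun prevention" might look like, the function below builds a find command line from optional filters. It is not pyinfra's implementation: `build_find_command` and all of its parameter names are hypothetical, and it uses only the standard library. The size guard assumes the byte-only policy suggested earlier in this thread.

```python
import re
import shlex
from typing import Optional, Sequence


def build_find_command(
    path: str,
    file_type: str = "f",
    name: Optional[str] = None,
    iname: Optional[str] = None,
    size: Optional[str] = None,
    extra_args: Sequence[str] = (),
) -> str:
    """Build a shell-quoted find command (hypothetical helper, not pyinfra API)."""
    args = ["find", path, "-type", file_type]
    if name is not None:
        args += ["-name", name]
    if iname is not None:
        args += ["-iname", iname]
    if size is not None:
        # Footgun guard: accept only explicit byte counts such as "+1048576c".
        if not re.fullmatch(r"[+-]?\d+c", size):
            raise ValueError(f"size must be given in bytes (e.g. '+1024c'): {size!r}")
        args += ["-size", size]
    # Escape hatch for any other find primaries, passed through as-is.
    args += list(extra_args)
    # Quote every token so patterns like *.log are not expanded by the shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_find_command("/var/log", name="*.log", size="+1048576c"))
# prints: find /var/log -type f -name '*.log' -size +1048576c
```

Keeping a few named keyword arguments alongside a free-form `extra_args` list is one possible design: common filters get validation, while less common flags remain available without further changes.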