mar10 / pyftpsync

Synchronize directories using FTP(S), SFTP, or file system access.
https://pyftpsync.readthedocs.io
MIT License

glob #53

Closed mar10 closed 3 years ago

mar10 commented 3 years ago

Implementation

Current implementation (pyftpsync <= 3)

--match patterns are applied to file names only and use fnmatch syntax. --exclude patterns are applied after --match and work on both file and directory names.

New implementation (pyftpsync, 'glob' branch)

--match uses the glob syntax from wcmatch.globmatch. Glob patterns generally match entry names, i.e. directories or files. The special ** pattern matches zero or more directories, a ! prefix negates a pattern, etc. --exclude patterns are still available (applied after --match).

Open Questions

Assuming this structure:

file1.txt
file2.yaml
folder3/
    files_3_1.txt
folder4/
    files_4_1.yaml

In the old approach, the pattern --match "*.txt" would match:

file1.txt
folder3/    <= directories not affected by '--match'
    files_3_1.txt
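The old behaviour can be approximated with the standard library alone. This is a minimal sketch, not pyftpsync's actual code: `ENTRIES` and `match_by_name` are hypothetical names, and `fnmatchcase` is used in place of `fnmatch` only to keep the result independent of the platform's case rules.

```python
from fnmatch import fnmatchcase

# The example tree from above as (relative path, is_directory) pairs.
ENTRIES = [
    ("file1.txt", False),
    ("file2.yaml", False),
    ("folder3", True),
    ("folder3/files_3_1.txt", False),
    ("folder4", True),
    ("folder4/files_4_1.yaml", False),
]


def match_by_name(entries, pattern):
    """Keep every directory; keep files whose base name matches `pattern`."""
    kept = []
    for path, is_dir in entries:
        name = path.rsplit("/", 1)[-1]  # base name only, the level is ignored
        if is_dir or fnmatchcase(name, pattern):
            kept.append(path)
    return kept


old_result = match_by_name(ENTRIES, "*.txt")
print(old_result)
```

Note that in this sketch folder4 is also kept and traversed; it just contributes no matching files.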

In the new approach, the same glob pattern --match "*.txt" would match only one entry:

file1.txt

A hierarchical glob pattern like --match "**/*.txt" would match two entries:

file1.txt
folder3/    <= not matched!
    files_3_1.txt
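To make the difference between the two glob patterns concrete, here is a simplified stand-in for path-based glob matching. It is not wcmatch's implementation and ignores negation and other glob features; `glob_match` and `PATHS` are hypothetical names, and `**` is only handled when it forms a whole path segment.

```python
from fnmatch import fnmatchcase

# Relative paths of all entries in the example tree.
PATHS = [
    "file1.txt",
    "file2.yaml",
    "folder3",
    "folder3/files_3_1.txt",
    "folder4",
    "folder4/files_4_1.yaml",
]


def glob_match(path, pattern):
    """Match a '/'-separated relative path against a glob pattern.

    A '**' segment matches zero or more directories; any other segment
    must match exactly one path segment (fnmatch rules).
    """
    return _match_segments(path.split("/"), pattern.split("/"))


def _match_segments(parts, pats):
    if not pats:
        return not parts  # pattern used up: match only if the path is too
    head, rest = pats[0], pats[1:]
    if head == "**":
        # Try letting '**' consume 0, 1, 2, ... leading path segments.
        return any(_match_segments(parts[i:], rest) for i in range(len(parts) + 1))
    return (
        bool(parts)
        and fnmatchcase(parts[0], head)
        and _match_segments(parts[1:], rest)
    )


flat = [p for p in PATHS if glob_match(p, "*.txt")]
deep = [p for p in PATHS if glob_match(p, "**/*.txt")]
print(flat)
print(deep)
```

Under these assumptions, "*.txt" only sees top-level entries, while "**/*.txt" reaches into folder3/ without matching the directory entry itself.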

Question 1

When combined with the --delete-unmatched option, what should happen?

One could assume that folder4/ gets deleted but folder3/ remains, because it still has content. However, this requires traversing folders depth-first in order to decide whether they have any remaining content.
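The depth-first decision could be sketched as follows. This illustrates only one possible answer to the question, not existing pyftpsync behaviour: `TREE` and `prune_unmatched` are hypothetical names, files are matched by base name, and a directory is deleted as soon as nothing inside it survives (which, in this sketch, also removes directories that were empty to begin with).

```python
from fnmatch import fnmatchcase

# The example tree: a dict is a directory, None marks a file.
TREE = {
    "file1.txt": None,
    "file2.yaml": None,
    "folder3": {"files_3_1.txt": None},
    "folder4": {"files_4_1.yaml": None},
}


def prune_unmatched(tree, pattern, prefix=""):
    """Return (kept_tree, deleted_paths) for a post-order traversal."""
    kept = {}
    deleted = []
    for name, child in tree.items():
        path = prefix + name
        if child is None:
            # File: keep it only if its name matches the pattern.
            if fnmatchcase(name, pattern):
                kept[name] = None
            else:
                deleted.append(path)
        else:
            # Directory: its fate is known only after its children are done.
            sub_kept, sub_deleted = prune_unmatched(child, pattern, path + "/")
            if sub_kept:
                kept[name] = sub_kept
                deleted.extend(sub_deleted)
            else:
                # Nothing survives inside, so the whole directory goes.
                deleted.append(path + "/")
    return kept, deleted


kept, deleted = prune_unmatched(TREE, "*.txt")
print(kept)
print(deleted)
```

The cost is visible in the recursion: no directory can be judged before all of its descendants have been visited.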

The previous approach never discarded directories with a --match pattern. Discarding directories was done using --exclude folder4, for example.

Question 2

The common use case 'only text files' must be passed as --match "**/*.txt". Even if we decide to traverse all folders (matching or not) in order to apply the pattern to descendants, we would still have to prepend **/ to the patterns. This seems redundant, especially compared to the current approach, where --match "*.txt" works on file names at all levels.

Conclusion

I am currently leaning towards dropping this new feature, since it seems more confusing than the existing approach.

github-actions[bot] commented 3 years ago

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.