minio / mc

Simple | Fast tool to manage MinIO clusters :cloud:
https://min.io/download
GNU Affero General Public License v3.0

Support copying with pattern matching while preserving directory structure (--include, not just --exclude) #4871

Open jackgray opened 6 months ago

jackgray commented 6 months ago

Is your feature request related to a problem? Please describe.

Currently it is not possible to sync two locations while filtering with pattern matching.

There are half a dozen ways cp and rsync accomplish this, and NONE of them are offered in mc mirror or mc cp

mc mirror has the --exclude option, but this is often useless if you want to copy only a subset of a wide and complex dataset. MinIO claims to be useful for complex data management.

mc find has the --regex and --exec options, but no option to preserve the directory structure, so all matched files are sent to the same flat, single-level destination directory.
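To make the flattening problem concrete, here is a minimal Python sketch. It does not call mc or reproduce its internals; the helper names `flat_destination` and `relative_destination` and the example keys are hypothetical, chosen only to show why a flat destination loses information.

```python
import posixpath

def flat_destination(key, dest_prefix):
    """Map a source key to a flat destination: only the file name survives."""
    return posixpath.join(dest_prefix, posixpath.basename(key))

def relative_destination(key, source_prefix, dest_prefix):
    """Map a source key to a destination that keeps the path below source_prefix."""
    rel = posixpath.relpath(key, source_prefix)
    return posixpath.join(dest_prefix, rel)

# Hypothetical object keys that share a file name in different directories.
keys = ["data/sub-01/anat/scan.nii", "data/sub-02/anat/scan.nii"]

flat = [flat_destination(k, "backup") for k in keys]
kept = [relative_destination(k, "data", "backup") for k in keys]

print(flat)  # both keys map to the same destination, so one overwrites the other
print(kept)  # each key keeps its own sub-directory under the destination
```

With a flat destination, two files that share a name in different directories collide, which is the behavior described above for matched files.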

rclone offers these functions syntactically, but in practice there are issues with MinIO, whether in erasure coding, chunk size, or something else that is not mentioned in https://github.com/astaxie/cookbook/blob/master/docs/rclone-with-minio.md (which seems to be an unofficial fork of the mc docs, made because the request to add rclone to the MinIO documentation went unanswered).

Describe the solution you'd like

MinIO should offer SOME WAY of syncing files between buckets more precisely than dumping all matched files into a single directory, or copying all of the data and then removing the files you didn't want.

mc mirror should have --include flag

mc cp should have --relative flag
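As a rough illustration of the requested behavior (not an existing mc feature, and not how mc would necessarily implement it), the sketch below filters a list of object keys by include patterns and keeps each match's relative path under the destination. The function `plan_copy`, the glob-style pattern syntax, and the example keys are all assumptions made for this example.

```python
from fnmatch import fnmatchcase
import posixpath

def plan_copy(keys, source_prefix, dest_prefix, include):
    """Return (source, destination) pairs for keys matching any include pattern.

    Patterns are matched against the path relative to source_prefix, and that
    relative path is reused under dest_prefix so the layout is preserved.
    """
    plan = []
    prefix = source_prefix.rstrip("/") + "/"
    for key in keys:
        if not key.startswith(prefix):
            continue  # key lies outside the source location
        rel = key[len(prefix):]
        if any(fnmatchcase(rel, pat) for pat in include):
            plan.append((key, posixpath.join(dest_prefix, rel)))
    return plan

# Hypothetical bucket listing.
keys = [
    "raw/sub-01/anat/T1w.nii.gz",
    "raw/sub-01/func/bold.nii.gz",
    "raw/sub-02/anat/T1w.nii.gz",
    "raw/sub-02/anat/notes.txt",
]

plan = plan_copy(keys, "raw", "subset", include=["*/anat/*.nii.gz"])
for src, dst in plan:
    print(src, "->", dst)
```

Only the two anat/*.nii.gz objects are selected, and each lands under subset/ with its sub-directory intact, so nothing has to be copied first and deleted afterwards.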

Describe alternatives you've considered

rclone almost achieves the desired effect, but for large files it fails with: Failed to copy: InvalidArgument: Range specified is not valid for source object

harshavardhana commented 6 months ago

Moving to correct repo