containerbuildsystem / cachi2

Cachi2 is a CLI tool that pre-fetches your project's dependencies to aid in making your build process network-isolated.
GNU General Public License v3.0

Drop the reflink dependency #578

Closed eskultety closed 3 weeks ago

eskultety commented 1 month ago

The reason for this effort is that the reflink library [1] was created as an attempt to make use of the reflink optimization before Python gained support for the copy_file_range syscall through os.copy_file_range. The library was never really anything more than a band-aid, and now that the syscall can be used directly, the library's own project page even mentions that Python implements the functionality natively.

The implementation was taken (with some 3.9+ tweaks applied) from an existing code proposal [2] (marked as "awaiting merge") that adds the same functionality to the 'shutil' library copying primitives and makes it completely transparent to end users. We'd have to wait a long time to be able to use it, though. The reflink library used a dirty trick: it copied a small file first (in a somewhat error-prone way) to see whether the operation raised an exception. The os.copy_file_range based solution, by comparison, succeeds in the vast majority of cases: if the underlying file system does not support reflinks (which are nothing more than data block sharing between files), the copy still proceeds normally inside the kernel, without the userspace <-> kernel overhead. That reserves the 'shutil.copy2' fallback for really obscure cases (cross-device copying, EXDEV, or old systems without the syscall, ENOSYS) or cases where the copying failed for some other reason, which we may never even encounter.

[1] https://gitlab.com/rubdos/pyreflink
[2] https://github.com/python/cpython/pull/93152/files
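
For illustration only, the approach described above could look roughly like the sketch below. It is not the code from this PR or from the CPython proposal; the helper name `copy_with_fallback` and its return value are hypothetical, and it assumes a Linux system with Python 3.8+ where os.copy_file_range is available (elsewhere it simply takes the fallback path).

```python
import os
import shutil


def copy_with_fallback(src, dst):
    """Copy src to dst, preferring an in-kernel copy (hypothetical helper).

    Returns the name of the method that ended up being used, purely so the
    behaviour is easy to observe in this sketch.
    """
    # os.copy_file_range only exists on platforms that provide the syscall.
    if hasattr(os, "copy_file_range"):
        try:
            with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
                remaining = os.fstat(fsrc.fileno()).st_size
                while remaining > 0:
                    # The kernel may copy fewer bytes than requested, so loop.
                    copied = os.copy_file_range(
                        fsrc.fileno(), fdst.fileno(), remaining
                    )
                    if copied == 0:
                        # Source ended early (e.g. it was truncated meanwhile).
                        break
                    remaining -= copied
            # Mirror shutil.copy2 by carrying over timestamps and mode bits.
            shutil.copystat(src, dst)
            return "copy_file_range"
        except OSError:
            # EXDEV, ENOSYS, or any other failure: fall through to copy2.
            pass
    # shutil.copy2 reopens dst for writing, so a partial copy is overwritten.
    shutil.copy2(src, dst)
    return "copy2"
```

Calling `copy_with_fallback("a.bin", "b.bin")` returns "copy_file_range" when the in-kernel copy worked and "copy2" otherwise. Note that no upfront probing is needed: whether the file system reflinks the data or copies it is decided by the kernel.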

Maintainers will complete the following section

Note: if the contribution is external (not from an organization member), the CI pipeline will not run automatically. After verifying that the CI is safe to run: