Open giampaolo opened 5 years ago
This is a follow-up to bpo-33639 (zero-copy via sendfile()) and bpo-26828 (os.copy_file_range()). On Linux 4.5+ / glibc 2.27+, shutil.copyfile() will use os.copy_file_range() instead of os.sendfile(). According to my benchmarks, performance is the same, but when dealing with NFS, copy_file_range() is supposed to attempt a server-side copy, meaning no data is exchanged between client and server, which makes the copy operation an order of magnitude faster.
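To make the proposal concrete, here is a minimal sketch of a copy loop that prefers os.copy_file_range() and falls back to a plain read/write loop. It is not the patch attached to this issue and not how shutil.copyfile() is implemented; the function name `copy_file` and the `chunk_size` default are hypothetical, chosen only for illustration.

```python
import os


def copy_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path and return the number of bytes copied.

    Hypothetical helper: tries os.copy_file_range() first (Linux only,
    Python 3.8+) and falls back to os.read()/os.write() if the call is
    missing or fails.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            total = 0
            use_fast_path = hasattr(os, "copy_file_range")
            while use_fast_path:
                try:
                    # With no explicit offsets, the kernel uses and advances
                    # the file offsets of both descriptors.
                    copied = os.copy_file_range(src_fd, dst_fd, chunk_size)
                except OSError:
                    # Unsupported here (e.g. old kernel, cross-filesystem copy):
                    # continue with the fallback loop from the current offsets.
                    break
                if copied == 0:
                    return total  # end of file reached
                total += copied
            # Fallback: ordinary userspace copy.
            while True:
                buf = os.read(src_fd, chunk_size)
                if not buf:
                    return total
                offset = 0
                while offset < len(buf):
                    # os.write() may write fewer bytes than requested.
                    offset += os.write(dst_fd, buf[offset:])
                total += len(buf)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Raw file descriptors are used so that the fallback loop can continue from wherever the kernel left the offsets if copy_file_range() fails partway through.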
Before proceeding, unit tests for big-file support should be added (bpo-37096). We didn't hit the 3.8 deadline, but I actually prefer to land this in 3.9 as I want to experiment with it a bit (copy_file_range() is quite new, and bpo-26828 is still a WIP).
Oh, I already created https://bugs.python.org/issue37157
Can we move the discussion there?
bpo-37157 is for reflink / CoW copy, this one is not.
Oh sorry, it seems like I misunderstood copy_file_range(). So it doesn't use/support CoW?
Nope, it doesn't (see man page). We can simply use FICLONE (cp does the same).
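For reference, a reflink attempt via FICLONE could look roughly like the sketch below. The constant value is an assumption hard-coded from `<linux/fs.h>` (`_IOW(0x94, 9, int)`) rather than taken from the fcntl module, and the helper name `try_reflink` is hypothetical.

```python
import fcntl

# Assumption: value of FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int).
FICLONE = 0x40049409


def try_reflink(src_path, dst_path):
    """Try to make dst_path a CoW clone of src_path.

    Returns True on success and False if the ioctl fails, for example
    because the filesystem does not support reflinks or the two files
    live on different filesystems.
    """
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        try:
            # The ioctl is issued on the destination; the argument is the
            # source file descriptor.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
        except OSError:
            return False
    return True
```

On failure the destination is left as an empty file, so a caller would still need to fall back to a regular copy.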
According to the man page of copy_file_range (https://man7.org/linux/man-pages/man2/copy_file_range.2.html), copy_file_range also should support copy-on-write:
copy_file_range() gives filesystems an opportunity to implement "copy acceleration" techniques, such as the use of reflinks (i.e., two or more inodes that share pointers to the same copy-on-write disk blocks) or server-side-copy (in the case of NFS).
Is this wrong?
However, while researching more about FICLONE vs copy_file_range, I found e.g. this: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24399
Which suggests that there are other problems with copy_file_range?
FYI, GNU Coreutils 9.0 (released in September 2021) changed cp to use copy_file_range where available; see https://lists.gnu.org/archive/html/info-gnu/2021-09/msg00010.html
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['library', '3.9', 'performance']
title = 'Use copy_file_range() in shutil.copyfile() (server-side copy)'
updated_at =
user = 'https://github.com/giampaolo'
```
bugs.python.org fields:
```python
activity =
actor = 'Albert.Zeyer'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'giampaolo.rodola'
dependencies = []
files = ['48392']
hgrepos = []
issue_num = 37159
keywords = ['patch']
message_count = 6.0
messages = ['344671', '344679', '344680', '344691', '344693', '383996']
nosy_count = 11.0
nosy_names = ['facundobatista', 'ncoghlan', 'vstinner', 'giampaolo.rodola', 'StyXman', 'petr.viktorin', 'neologix', 'Albert.Zeyer', 'martin.panter', 'desbma', 'pablogsal']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue37159'
versions = ['Python 3.9']
```