
Svetlitski / fcp

A significantly faster alternative to the classic Unix cp(1) command, copying large files and directories in a fraction of the time.

BSD 3-Clause "New" or "Revised" License

767 stars, 19 forks

Explanation #2

Closed by AgainPsychoX 3 years ago

AgainPsychoX commented 3 years ago

Can you explain why it is so much faster?

Why is this method not in the kernel, base Linux, or other systems?

baverman commented 3 years ago

It's very shady that the author did not mention copy-on-write anywhere in the README.

stilgarpl commented 3 years ago

I'd like to know that too. Copy is a simple operation, just read and write, so what does fcp do to make it faster?

vibecoder commented 3 years ago

Personally, I'd like to know the details of why it is faster, beyond the performance benchmarks in the README. Otherwise one can always study the code, I guess. Also, does anyone know which filesystem was used for the benchmarks?

vibecoder commented 3 years ago

There was some discussion about this on Hacker News: https://news.ycombinator.com/item?id=27523014

Svetlitski commented 3 years ago

@AgainPsychoX The primary reason fcp is faster than cp is that it uses multiple threads to walk directories and issue IO requests in parallel, which is advantageous for performance on systems with SSDs. My guess as to why cp doesn't do this is that cp was written back when magnetic hard drives were the norm. Issuing a large number of IO requests for disparate parts of a hard disk is quite bad for performance, as it causes the drive head to seek large distances across the disk.
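To make the idea of issuing copies in parallel concrete, here is a minimal sketch using only the Rust standard library. It is not fcp's actual implementation: the names `plan_copy` and `parallel_copy`, the single up-front directory walk, and the shared job queue are all assumptions chosen for brevity (the comment above says fcp also walks directories in parallel, which this sketch does not do). It also does not handle symlinks or preserve metadata beyond what `std::fs::copy` does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Mutex;
use std::thread;

/// Recursively collect (source, destination) pairs for every non-directory
/// entry under `src`, creating the matching directories under `dst` as we go.
/// Hypothetical helper for illustration only.
fn plan_copy(src: &Path, dst: &Path, jobs: &mut Vec<(PathBuf, PathBuf)>) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let from = entry.path();
        let to = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            plan_copy(&from, &to, jobs)?;
        } else {
            jobs.push((from, to));
        }
    }
    Ok(())
}

/// Copy a directory tree with `workers` threads pulling jobs from a shared
/// queue, so several file copies are in flight at once.
/// Returns the total number of bytes copied.
fn parallel_copy(src: &Path, dst: &Path, workers: usize) -> io::Result<u64> {
    let mut jobs = Vec::new();
    plan_copy(src, dst, &mut jobs)?;
    let queue = Mutex::new(jobs);

    thread::scope(|scope| -> io::Result<u64> {
        let handles: Vec<_> = (0..workers.max(1))
            .map(|_| {
                scope.spawn(|| -> io::Result<u64> {
                    let mut bytes = 0;
                    loop {
                        // Hold the lock only long enough to pop one job.
                        let job = queue.lock().unwrap().pop();
                        match job {
                            Some((from, to)) => bytes += fs::copy(&from, &to)?,
                            None => return Ok(bytes),
                        }
                    }
                })
            })
            .collect();

        // Wait for every worker and surface the first IO error, if any.
        let mut total = 0;
        for handle in handles {
            total += handle.join().expect("worker thread panicked")?;
        }
        Ok(total)
    })
}
```

Calling `parallel_copy(Path::new("src_dir"), Path::new("dst_dir"), 8)` would copy the tree with eight workers. On an SSD the concurrent requests can overlap, whereas on a spinning disk the same access pattern tends to cause the seeking described above.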

Svetlitski commented 3 years ago

> It's very shady that the author did not mention copy-on-write anywhere in the README.

Hello @baverman,

Copy-on-write is mentioned implicitly in the footnote explaining the large performance difference on the "Large Files" benchmark run on macOS, which mentions the fclonefileat and fcopyfile syscalls. I can totally see how this is easy to miss, though, as it appears only in the footnote and requires the reader to understand that those syscalls perform copy-on-write.
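For readers unfamiliar with what copy-on-write means for a file copy, the sketch below shows the behaviour a caller can observe. It relies on `std::fs::copy`, which the Rust standard library documentation describes as corresponding to fclonefileat and fcopyfile on macOS. Whether fcp goes through this function or calls the system interfaces some other way is not stated in this thread, so treat the helper name `clone_then_append` and the whole approach as illustrative assumptions.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Copy `from` to `to`, then append `extra` to the copy only.
/// Hypothetical helper: on a filesystem that supports cloning, the copy step
/// may share data blocks with the original until one of the files is modified.
fn clone_then_append(from: &Path, to: &Path, extra: &[u8]) -> io::Result<()> {
    // The standard library picks the platform-specific fast path, if any.
    fs::copy(from, to)?;

    // Writing to the copy must never be visible through the original,
    // whether or not the two files started out sharing blocks.
    let mut copy = OpenOptions::new().append(true).open(to)?;
    copy.write_all(extra)?;
    Ok(())
}
```

Whether or not a clone was made, the original file keeps its contents after the append. What differs is the cost: a clone avoids duplicating the data up front, which would be consistent with the large gap on the "Large Files" benchmark mentioned above.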

Svetlitski commented 3 years ago

> Personally, I'd like to know the details of why it is faster, beyond the performance benchmarks in the README. Otherwise one can always study the code, I guess. Also, does anyone know which filesystem was used for the benchmarks?

See my response above and some of the discussion on Hacker News that you linked for an explanation of the performance. The filesystem used for the Linux benchmarks was XFS.

Svetlitski commented 3 years ago

An explanation of fcp's high performance has been added to the README under the "Methodology" section.