danielpclark / faster_path

Faster Pathname handling for Ruby written in Rust
MIT License
782 stars 33 forks source link

Improve `basename` and `dirname` #151

Closed glebm closed 6 years ago

glebm commented 6 years ago

A faster and more readable implementation of basename and dirname.

basename_benchmark dirname_benchmark

danielpclark commented 6 years ago

Could you also update the README with the PinchBench performance values and remove the REGRESSION section. Depending, this might call for removing the WITH_REGRESSION flags altogether (Travis CI).

glebm commented 6 years ago

@danielpclark Is it OK if I update the Readme after this is merged? Because #153 is based on top of this.

danielpclark commented 6 years ago

Okay. Please also keep in mind for future updates we plan to fully utilize config flags for anything Windows OS specific.

glebm commented 6 years ago

Yes, that makes sense, which is why I didn't attempt any Windows compatibility here. I looked at the Ruby source, and the Windows version is very different, we'll need a separate implementation for dirname and possibly basename.

danielpclark commented 6 years ago

My system is saying basename is slower

Performance change for basename is -19.7%
Performance change for basename is -14.9%
Performance change for basename is -18.2%

A full grab:

Pinch-bench (Pbench) by Daniel P. Clark
--------------------------------------------------------------------------------
64-bit ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]
64-bit rustc 1.24.1 (d3ae9a9e0 2018-02-27)
architecture: x86_64-unknown-linux-gnu
--------------------------------------------------------------------------------
Performance change for allocate, instead of new, is 83.1%
Performance change for absolute? is 95.2%
Performance change for add_trailing_separator is 28.3%
Performance change for basename is -18.2%
Performance change for children is 9.6%
Performance change for children_compat is -10.3%
Performance change for chop_basename is 77.2%
Performance change for cleanpath_aggressive is 74.1%
Performance change for cleanpath_conservative is 74.1%
Performance change for del_trailing_separator is 84.1%
Performance change for directory? is 8.6%
Performance change for dirname is 20.9%
Performance change for entries is 8.8%
Performance change for entries_compat is -43.2%
Performance change for extname is 50.4%
Performance change for has_trailing_separator? is 78.6%
Performance change for join is 66.5%
Performance change for plus is 80.8%
Performance change for relative? is 83.1%
Performance change for relative_path_from is 71.1%
Started with run options --seed 39406

And a regular benchmark:

BasenameBenchmark
bench_rust_basename  0.050958    0.098202    0.147081    0.194133    0.249651
 PASS (0.77s) :: bench_rust_basename
bench_ruby_basename  0.042795    0.084585    0.127133    0.172803    0.213270
 PASS (0.67s) :: bench_ruby_basename

Where higher is worse.

glebm commented 6 years ago

That's weird, does it also happen in a longer benchmark?

danielpclark commented 6 years ago

Yes. It's pretty consistent on my machine at about 15%-ish slower.

I think basename can eventually be refactored to look like extname. I have a pattern in my mind for it.

Since the performance seems more majoritively faster on machines other than mine I'm leaning towards allowing it in the next gem release.

glebm commented 6 years ago

A wild guess, could it be that your version of libc has an inferior memrchr? What distribution and version of Linux do you use?

danielpclark commented 6 years ago

lsb_release --all

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.10
Release:    16.10
Codename:   yakkety

apt show libc6

Package: libc6
Version: 2.24-3ubuntu2.2
Status: install ok installed
Priority: required
Section: libs
Source: glibc
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: GNU Libc Maintainers <debian-glibc@lists.debian.org>
Installed-Size: 11.2 MB
Depends: libgcc1
danielpclark commented 6 years ago

What version of libc are you rocking? :rocket: :wink:

glebm commented 6 years ago

I'm on Ubuntu 16.04 and an older version of libc!

Package: libc6
Version: 2.23-0ubuntu10
Priority: required
Section: libs
Source: glibc
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: GNU Libc Maintainers <debian-glibc@lists.debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 11.2 MB
Depends: libgcc1
Suggests: glibc-doc, debconf | debconf-2.0, locales
Breaks: hurd (<< 1:0.5.git20140203-1), libtirpc1 (<< 0.2.3), locales (<< 2.23), locales-all (<< 2.23), lsb-core (<= 3.2-27), nscd (<< 2.23)
Replaces: libc6-amd64
Homepage: http://www.gnu.org/software/libc/libc.html
Task: minimal
Supported: 5y
Download-Size: 2,580 kB
APT-Manual-Installed: yes
APT-Sources: http://gb.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
Description: GNU C Library: Shared libraries
 Contains the standard libraries that are used by nearly all programs on
 the system. This package includes shared versions of the standard C
 library and the standard math library, as well as many others.

N: There is 1 additional record. Please use the '-a' switch to see it
danielpclark commented 6 years ago

I wonder if we can add the libc version to the rake system info command we have. Perhaps a Rust method to call libc's version directly? I don't know if they have that written in the library, but I would expect so.