danielpclark / faster_path

Faster Pathname handling for Ruby written in Rust
MIT License
782 stars 33 forks source link

Naive entries implementation #38

Closed julienXX closed 7 years ago

julienXX commented 8 years ago

References #11.

Slower than the ruby one :( @danielpclark if you have any optimisation in mind I'm all ears :)

gogainda commented 8 years ago

@julienXX What the numbers?

julienXX commented 8 years ago
bench_rust_entries   0.842141    0.849323    0.831712    0.844437    0.857011
 PASS (4.26s) :: bench_rust_entries
bench_ruby_entries   0.558785    0.549958    0.551315    0.567534    0.536545
 PASS (2.80s) :: bench_ruby_entries
julienXX commented 8 years ago

If I don't convert the strings to Pathnames in faster_path.rb I get:

bench_rust_entries   0.584481    0.576329    0.585669    0.560661    0.581034
 PASS (2.93s) :: bench_rust_entries
bench_ruby_entries   0.555329    0.552907    0.539050    0.548887    0.542997
 PASS (2.77s) :: bench_ruby_entries
gogainda commented 8 years ago

@julienXX the latest numbers very promising, btw

julienXX commented 8 years ago

@danielpclark @gogainda regarding #62 this one seems slower. Any idea on what I might be doing wrong here?

gogainda commented 8 years ago

Seems that any fancy methods such as

fs::read_dir(Path::new(r_str))

are kind of slow, I had to reinvent the wheel and operate only o array of chars in order to make implementation faster. Try different approaches and measure the performance, no other way around

julienXX commented 8 years ago

@gogainda thanks :) Did you get much better results than the original C impl?

danielpclark commented 8 years ago

This will need to be cleaned. These git pushes were removed from here 0172e7c, 4cbb945, 585ea2f, 884be8d

It turns out that the method only returned nil which is why it was 75% faster. The weird thing was that all the tests passed locally. That still has to be looked into. Just know that the code base was reverted to 0.1.8

gogainda commented 8 years ago

In my opinion, if c impl was writter with performance in mind ther is no way that rust impl would be faster. In reallity, not every c imp of ruby method is very fast, especially in edge cases(e.g. very long input values). So, yes, it is possible

danielpclark commented 8 years ago

Change vec![]; to Vec::with_capacity(usize);. Using a Vector without setting the capacity is slow.

I haven't looked into a more optimal way to handle Ruby arrays but I have a theory about using a tuple. If we think that Vector has more overhead and we want to get more down to the metal than that then we think about creating a collection from a fixed size. I haven't tested the different collection types for speed and performance.

At the point you do fs::read_dir you now have enough information for the collection size. What type does the fs::read_dir return anyways?

danielpclark commented 8 years ago

FYI If you don't map to Pathname.new then you will achieve equivalent performance as Pathname#entries with your current implementation.

julienXX commented 8 years ago

@danielpclark saw that too, would it be okay not converting to Pathnames?

danielpclark commented 8 years ago

In FasterPath module itself the plain strings are preferred. But for monkey-patching this would be considered a method regression. policy on method regressions - with the regression flag with can have the map Pathname.new on the monkey patch & refinements themselves.

danielpclark commented 8 years ago

We can always try to improve upon our own code base too.

julienXX commented 8 years ago

Oh that's neat, I had not seen the regression policy. I'll clean the PR a bit then.

julienXX commented 8 years ago

@danielpclark I cleaned the PR, could find a way to get let files = fs::read_dir(r_str).unwrap(); size though; so the vec has no capacity. It's 16% slower in pbench results so I added the regression flag.

danielpclark commented 8 years ago

@julienXX fs::read_dir has the method size_hint() But when I tried it for Vec::with_capacity() it seemed slower.

danielpclark commented 7 years ago

The performance issue with this is negligible. I'm merging this in. I've noticed a scaling performance issue with this. But I don't think it's related to this code #94 . You've appropriately used the WITH_REGRESSION flag so I'm merging this.