danielpclark / faster_path

Faster Pathname handling for Ruby written in Rust
MIT License

Fixes a couple of extname edge cases #79

Closed glebm closed 8 years ago

glebm commented 8 years ago

Also adds a commented-out test for a failing case NOT resolved by this PR.

$ rake pbench
Pinch-bench (Pbench) by Daniel P. Clark
--------------------------------------------------------------------------------
64-bit ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
64-bit rustc 1.9.0 (e4e8b6668 2016-05-18)
--------------------------------------------------------------------------------
Performance change for allocate, instead of new, is 80.6%
Performance change for absolute? is 92.8%
Performance change for add_trailing_separator is 55.4%
Performance change for basename is 1.3%
Performance change for chop_basename is 22.3%
Performance change for directory? is 32.8%
Performance change for extname is 49.6%
Performance change for has_trailing_separator? is 45.8%
Performance change for blank? (verses strip.empty?) is -89.1%
Performance change for relative? is 92.0%
Started with run options --seed 17737

gogainda commented 8 years ago

Is there a difference in extname performance between this PR and master?
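
One way to check this locally is to time the same calls under both builds and compare. Below is a minimal sketch, assuming the percentage is defined as (baseline - candidate) / baseline * 100. That formula and the helper names `percent_change` and `time_calls` are illustrative and not taken from the project's pbench task. The two lambdas are stand-ins for the extname on master and the one in this PR.

```ruby
require 'benchmark'

# Hypothetical helper: relative improvement of the candidate over the
# baseline, in percent. Positive means the candidate took less time.
# This formula is an assumption, not necessarily what pbench computes.
def percent_change(baseline_seconds, candidate_seconds)
  ((baseline_seconds - candidate_seconds) / baseline_seconds * 100.0).round(1)
end

# Hypothetical helper: wall-clock seconds spent calling `callable`
# on every path, repeated `iterations` times.
def time_calls(paths, iterations, callable)
  Benchmark.realtime do
    iterations.times { paths.each { |path| callable.call(path) } }
  end
end

paths = %w[foo.rb /a/b/c.tar.gz .profile dir/ name.]

# Stand-ins: swap in the extname from master and the one from this PR.
baseline  = ->(path) { File.extname(path) }
candidate = ->(path) { File.extname(path) }

baseline_time  = time_calls(paths, 10_000, baseline)
candidate_time = time_calls(paths, 10_000, candidate)
puts "extname change: #{percent_change(baseline_time, candidate_time)}%"
```

Running each side several times and comparing the spread helps separate a real change from run-to-run noise.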

danielpclark commented 8 years ago

I don't understand how your run shows a positive basename result. I have the same Ruby and Rust versions you're showing, and mine reports -40%.

glebm commented 8 years ago

Microbenchmarks like this heavily depend on the CPU cache size.

$ cat /proc/cpuinfo 

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 60
model name  : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
stepping    : 3
microcode   : 0x1c
cpu MHz     : 4078.281
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bugs        :
bogomips    : 7999.71
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

...
# output for 3 more cores follows (I have HT disabled).

danielpclark commented 8 years ago

Your cache is 4x mine. I find it odd that I have 4 cores but this reports 2, and displays 4 separate outputs for the 4:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model       : 1
model name  : AMD FX(tm)-4100 Quad-Core Processor
stepping    : 2
microcode   : 0x6000613
cpu MHz     : 1400.000
cache size  : 2048 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 2
apicid      : 16
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips    : 7232.18
TLB size    : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb
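
On the core count: each `processor` stanza in `/proc/cpuinfo` describes one logical CPU, while `cpu cores` and `siblings` are counts for the physical package that CPU belongs to. Four stanzas alongside `cpu cores : 2` is therefore not contradictory. Below is a minimal sketch that summarises those fields, assuming the Linux layout shown above. `summarize_cpuinfo` is a hypothetical helper and not part of faster_path.

```ruby
# Hypothetical helper: summarise the topology fields of /proc/cpuinfo text.
def summarize_cpuinfo(text)
  # Stanzas are separated by blank lines; each one describes a logical CPU.
  stanzas = text.split(/\n\s*\n/).map do |stanza|
    stanza.each_line.each_with_object({}) do |line, fields|
      key, value = line.split(':', 2)
      fields[key.strip] = value.strip if value
    end
  end
  stanzas = stanzas.select { |fields| fields.key?('processor') }
  first = stanzas.first || {}
  {
    logical_cpus: stanzas.size,                        # number of stanzas
    cores_per_package: first['cpu cores'].to_i,        # "cpu cores" field
    siblings_per_package: first['siblings'].to_i,      # "siblings" field
    cache_size: first['cache size']                    # e.g. "2048 KB"
  }
end

path = '/proc/cpuinfo'
puts summarize_cpuinfo(File.read(path)) if File.readable?(path)
```

The summary reads the package-level fields from the first stanza only, which assumes a single-socket machine like the two shown in this thread.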