This PR contains all the improvements I made from profiling the JSON::LD::API#frame method.
If all commits from this PR and sister PR https://github.com/ruby-rdf/rdf/pull/360 are applied, the number of objects allocated in my sample scenario drops by around 32% and the run time improves by around 20% (on Ruby 2.4, which is recommended because of the frozen string literal feature).
I would appreciate it if you could review the code before accepting the PR, @gkellogg.
The details
I did not have the chance to run the whole json-ld test suite (broken since #40), but for each commit I at least verified that the CI=true bundle exec rake spec command passes and that there is no regression in the N-Triples representation of the graph that I have been framing.
This PR was developed using a sample scenario (an N-Triples graph that we want to frame to a certain context). The scenario was run multiple times (here 10) and I tracked Ruby object allocations using stackprof.
Run 10 times, the sample scenario allocated about 906000 objects at the start and only about 623000 at the end, a reduction of around 32% in allocated objects. The run time was cut by around 20% (on Ruby 2.4).
Here is my progression in reducing the allocated object count (each step should have a corresponding commit in this PR):
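The figures above came from stackprof. As a rough, dependency-free way to check this kind of before/after number, a minimal sketch using the standard GC.stat counter could look like the following. This swaps in a different technique from the one used for the PR, and the helper name count_allocations is hypothetical.

```ruby
# Minimal sketch: count the objects allocated while a block runs.
# Uses GC.stat instead of stackprof, so it only gives a total and
# does not attribute allocations to call sites.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Example: run a scenario 10 times, as in the measurements above.
# The block body is a placeholder for the real framing call.
allocated = count_allocations do
  10.times { %w[a b c].map(&:upcase) }
end
puts "objects allocated: #{allocated}"
```

Unlike stackprof's object mode, this cannot show which line is responsible for the allocations, so it is only useful for confirming that a total went down.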
starting with 906120 objects allocated.
json-ld:
[+1k] replace KEYWORDS=%w[] with a Set (to optimize multiple .include? calls) => 907840
[-64k] rework Array#kw_sort to use a cache of KW_ORDERS for faster lookups and fewer object allocations => 843600
[-24k] replace mystring[0,2] == '//' with mystring.start_with?('//') => 819770
[-10k] remove 1 line of expensive unused code: keys = ordered ? input.keys.kw_sort : input.keys => 809580
[-19k] hunt down uses of .keys.reject, .keys.select, .keys.inject ... => 745730
[-29k] hunt down uses of %w().include? and ].include? => 716860
[-3k] hunt down uses of == [], == {}, != [], != {} and replace them with the .empty? predicate => 713610
[-10k] replace (object.keys & %w[value1 value2]).length < 2 with if !(object.key?(value1) && object.key?(value2)) => 703930
[-1k] refactor .reject.each and .reject.map to not allocate a temporary array (thanks to 'next if') => 702900
[-10k] refactor output_object.delete('@type') if Array(output_object['@type']).join('').to_s.empty? into output_object.delete('@type') if output_object.key?('@type') && output_object['@type'].nil? => 692020
[-8k] replace all uses of .inject with the more efficient .each_with_object => 684590
[-7k] replace map..flatten.compact instances with .each_with_object => 677990
[-0k] replace mapping -= %w('@set') with mapping.delete('@set') => 677990
[-0k] replace .dup.delete_if with .reject => 677990
[-16k] replace defined: {} in function keyword args with defined: nil followed by defined = defined || {} because the parameter is seldom used => 661760
[-4k] refactor JSON::LD::Frame#count_blank_node_identifiers to allocate results hash only once => 657870
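A few of the json-ld rewrites listed above can be sketched in isolation. This is an illustrative sketch, not the gem's code: the constants KEYWORD_LIST and KEYWORD_SET and every method name below are hypothetical stand-ins.

```ruby
require 'set'

# Hypothetical keyword list for illustration; the gem's real constant differs.
KEYWORD_LIST = %w[@context @id @type @value @language @list @set].freeze
KEYWORD_SET  = Set.new(KEYWORD_LIST).freeze

# Membership: Set#include? is a hash lookup, Array#include? is a linear scan.
def keyword?(term)
  KEYWORD_SET.include?(term)
end

# Prefix check: start_with? avoids the temporary substring built by iri[0, 2].
def scheme_relative?(iri)
  iri.start_with?('//')
end

# Key presence: Hash#key? avoids materializing and intersecting key arrays,
# as (object.keys & %w[@value @language]).length < 2 would.
def missing_value_or_language?(object)
  !(object.key?('@value') && object.key?('@language'))
end

# Filter and transform in one pass: `next if` skips entries without the
# intermediate array that keys.reject { ... }.map { ... } would build.
def non_keyword_keys_upcased(hash)
  hash.each_with_object([]) do |(key, _value), result|
    next if keyword?(key)
    result << key.upcase
  end
end

# Seldom-used keyword argument: default to nil and build the Hash only on
# the path that needs it, rather than on every call via `defined: {}`.
def mark_defined(term, defined: nil)
  return nil if keyword?(term) # early exit: no Hash is allocated
  defined ||= {}
  defined[term] = true
  defined
end
```

Each method returns the same result as the form it replaces, so the rewrites can be checked for equivalence independently of any allocation measurement.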
rdf:
[-6k] replace matchdata.to_a[1..-1] with matchdata[1..-1] => 651111
[-18k] replace def method(*args) if (first=args.first) ... with def method(first, *args) => 633651
[-7k] replace property.to_s =~ /regexp/ with a constant regexp and REGEXP.match(property) (Regexp#match takes care of casting nil and Symbol to String) => 626961
[-3k] reduce unneeded object allocations in RDF::Utils::Logger#logger_common => 623851
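The rdf rewrites listed above can be sketched the same way. The pattern PREFIXED_NAME and the method names below are hypothetical and are not taken from the rdf gem. The sketch relies only on documented core behavior: MatchData#[] accepts a Range, and Regexp#match accepts nil and Symbol arguments.

```ruby
# Hypothetical pattern for illustration, not taken from the rdf gem.
PREFIXED_NAME = /\A(\w+):(\w+)\z/.freeze

# Captures without copying the whole match into an Array first:
# MatchData#[] accepts a Range, so match.to_a[1..-1] becomes match[1..-1].
def captures_of(match)
  match[1..-1]
end

# Regexp#match returns nil for nil and converts Symbols itself, so the
# explicit property.to_s conversion before matching can be dropped.
def split_prefixed_name(property)
  match = PREFIXED_NAME.match(property)
  match && captures_of(match)
end

# Naming the first parameter binds it directly instead of reading it
# back out of the splat Array with args.first.
def describe(first = nil, *rest)
  first ? "#{first} (+#{rest.length} more)" : 'nothing'
end
```

Note that Regexp#match still raises a TypeError for other argument types such as Integer, so dropping .to_s is only safe where the caller passes a String, a Symbol, or nil.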
Coverage increased (+0.1%) to 89.07% when pulling 178afc2c014d69db75bd9447ed3cb166f98ee055 on PerfectMemory:performance-fixes into a1a01db875afbb8759b0cac10b7d1aa73034a3d8 on ruby-rdf:develop.