twingly / twingly-search-api-ruby

:gem: Twingly Blog Search API in Ruby
https://developer.twingly.com/
MIT License
4 stars 0 forks source link

Low hanging fruit that would reduce memory consumption #75

Closed jage closed 5 years ago

jage commented 6 years ago

First step is to do some profiling.

jage commented 6 years ago

First step is to do some profiling.

Played around with memory_profiler a bit and found:

Should be combined with rubocop to enforce it and memory_profiler to prove it.

I did some experiments and it looks very promising, we have some paths which creates a new string for every XML-tag on every post (20 tags, up to 10 000 posts).

Example of changes:

diff --git a/lib/twingly/search/parser.rb b/lib/twingly/search/parser.rb
index ded67d1..648873f 100644
--- a/lib/twingly/search/parser.rb
+++ b/lib/twingly/search/parser.rb
@@ -1,3 +1,4 @@
+# frozen_string_literal: true
 require 'nokogiri'

 module Twingly
@@ -43,9 +44,10 @@ module Twingly
       def parse_post(element)
         post_params = {}
         element.element_children.each do |child|
-          post_params[child.name] =
-            case child.name
-            when *%w(tags links images)
+          name = child.name.freeze
+          post_params[name] =
+            case name
+            when "tags", "links", "images"
               parse_array(child)
             when "coordinates"
               parse_coordinates(child)
diff --git a/lib/twingly/search/post.rb b/lib/twingly/search/post.rb
index 4ac1828..039e104 100644
--- a/lib/twingly/search/post.rb
+++ b/lib/twingly/search/post.rb
@@ -1,3 +1,4 @@
+# frozen_string_literal: true
 # encoding: utf-8

 module Twingly
@@ -46,10 +47,10 @@ module Twingly
         @text          = params.fetch('text')
         @language_code = params.fetch('languageCode')
         @location_code = params.fetch('locationCode')
-        @coordinates   = params.fetch('coordinates', {})
-        @links         = params.fetch('links', [])
-        @tags          = params.fetch('tags', [])
-        @images        = params.fetch('images', [])
+        @coordinates   = params.fetch('coordinates') { {} }
+        @links         = params.fetch('links')  { [] }
+        @tags          = params.fetch('tags')   { [] }
+        @images        = params.fetch('images') { [] }
         @indexed_at    = Time.parse(params.fetch('indexedAt'))
         @published_at  = Time.parse(params.fetch('publishedAt'))
         @reindexed_at  = Time.parse(params.fetch('reindexedAt'))