active-hash / active_hash

A readonly ActiveRecord-esque base class that lets you use a hash, a Yaml file or a custom file as the datasource
MIT License
1.2k stars 179 forks

Speed up automatic id definition #231

Closed braktar closed 1 month ago

braktar commented 3 years ago

In our use case, we have a large amount of data with no ids. This causes a performance issue, because all existing ids are compared to find the maximum.

To fix it, we store the current maximum and, during insertion, compare only against that value.
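
The difference between the two approaches can be sketched as follows. This is a minimal illustration, not ActiveHash's actual code: ScanningStore and CachingStore are hypothetical stand-in classes, and records are assumed to be plain hashes.

```ruby
# Hypothetical stand-ins for illustration only; not ActiveHash's implementation.

# Scans every stored record to find the maximum id: O(n) work per insert.
class ScanningStore
  attr_reader :records

  def initialize
    @records = []
  end

  def insert(record)
    current_max = @records.map { |r| r[:id] }.grep(Numeric).max || 0
    record[:id] ||= current_max + 1
    @records << record
    record
  end
end

# Remembers the running maximum: O(1) work per insert.
class CachingStore
  attr_reader :records

  def initialize
    @records = []
    @max_id = 0
  end

  def insert(record)
    if record[:id].is_a?(Numeric)
      # An explicit numeric id can only raise the cached maximum.
      @max_id = [@max_id, record[:id].ceil].max
    else
      # No id given: hand out the next value after the cached maximum.
      record[:id] ||= (@max_id += 1)
    end
    @records << record
    record
  end
end

scanning = ScanningStore.new
caching  = CachingStore.new
[{ name: "a" }, { id: 10 }, { name: "b" }].each do |attrs|
  scanning.insert(attrs.dup)
  caching.insert(attrs.dup)
end
```

Both stores assign the same ids here. The caching version only avoids rescanning the stored records on each insert.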

senhalil commented 3 years ago

When warnings are enabled, set_id prints a warning message because of @max_id.

I don't know which gem changed the defaults, but after a bundle update, rake test gives the following error on our optimizer-api repository:

[Screenshot from 2021-07-20 16-54-35]

I suggest the following modification:

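      def set_id(record)
        # Sets record[:id] to @max_id + 1 if it doesn't exist.
        # Initialize @max_id on first use.
        @max_id ||= 0
        if record[:id].is_a?(Numeric)
          # An explicit numeric id may raise the stored maximum.
          @max_id = [@max_id, record[:id].ceil].max
        else
          # No id given: assign the next value after the stored maximum.
          record[:id] ||= (@max_id = @max_id.succ)
        end
      end

kbrock commented 1 month ago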

We should not be generating ids or values in Active Hash, so we're going to avoid optimizing and/or encouraging that behavior.

I think this behavior does meet your needs well, so you may want to add these 2 methods to a module and prepend it in your class.
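
A rough sketch of the prepend approach is shown below. It rests on several assumptions: BaseStore is a hypothetical stand-in for the library base class (not ActiveHash's real API), CachedMaxId and Country are illustrative names, and the override wraps a class-level insert method, which is why the module is prepended to the singleton class. The thread does not say which 2 methods are meant, so this example overrides only one.

```ruby
# Hypothetical stand-in for a base class that assigns ids by scanning all records.
class BaseStore
  def self.records
    @records ||= []
  end

  def self.next_id
    (records.map { |r| r[:id] }.grep(Numeric).max || 0) + 1
  end

  def self.insert(record)
    record[:id] ||= next_id
    records << record
    record
  end
end

# Illustrative module: assigns the id from a cached maximum before
# delegating to the original insert via super.
module CachedMaxId
  def insert(record)
    @max_id ||= 0
    if record[:id].is_a?(Numeric)
      @max_id = [@max_id, record[:id].ceil].max
    else
      record[:id] ||= (@max_id += 1)
    end
    super
  end
end

class Country < BaseStore
  # insert is a class method here, so prepend to the singleton class.
  singleton_class.prepend(CachedMaxId)
end

first  = Country.insert({ name: "FR" })
second = Country.insert({ id: 7, name: "BE" })
third  = Country.insert({ name: "DE" })
```

Because the id is already set by the time super runs, the stand-in base class never performs its scan for these records. The optimization stays in application code, with no change to the library.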