active-hash / active_hash

A read-only, ActiveRecord-esque base class that lets you use a hash, a YAML file, or a custom file as the data source
MIT License

QUESTION: What's the performance of "find_by"? #244

Closed by rodricios 2 years ago

rodricios commented 2 years ago

Say we have a simple YAML file like so:

- name: Bob
- name: John
- name: Mary

And we perform a query on the "Person" ActiveHash model that wraps that YAML file like so:

p = Person.find_by(name: 'Mary')

Are we doing linear searches to find the correct entry or is there an internal index/hash map built on the fields (like name) of each entry that we query against?

flavorjones commented 2 years ago

ActiveHash::Relation is where this code lives:

https://github.com/active-hash/active_hash/blob/9e0e4d5368bb096c81fa2685b50552051302e312/lib/active_hash/relation.rb#L132-L134

When the query is being resolved, the set of records is iterated over once, and records matching the query are selected -- so this is an O(n) linear-time search.
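To make the difference concrete, here is a minimal plain-Ruby sketch of a linear scan next to a hand-built hash index. It is not the library's code: Record, linear_find_by, and build_index are hypothetical names made up for illustration, and the Struct only stands in for the three YAML entries from the question. The scan here also stops at the first match, which is a simplification.

```ruby
# Plain-Ruby stand-in for the three YAML entries; active_hash is not required.
# Record, linear_find_by and build_index are hypothetical, illustrative names.
Record = Struct.new(:name)
RECORDS = %w[Bob John Mary].map { |n| Record.new(n) }

# Linear scan: checks records in order until one satisfies every condition.
# Cost grows with the number of records, i.e. O(n) per query.
def linear_find_by(records, conditions)
  records.find do |record|
    conditions.all? { |field, value| record[field] == value }
  end
end

# Hand-built index: one O(n) pass up front, then a hash lookup per query.
def build_index(records, field)
  records.each_with_object({}) do |record, index|
    index[record[field]] ||= record # keep the first record seen per value
  end
end

NAME_INDEX = build_index(RECORDS, :name)

scanned = linear_find_by(RECORDS, name: "Mary") # visits Bob, John, then Mary
indexed = NAME_INDEX["Mary"]                    # single hash lookup
```

For a file with only a handful of entries the scan is unlikely to matter. If a large dataset is queried repeatedly on the same field, memoizing a hash like NAME_INDEX in your own code is one possible workaround, assuming the data does not change after load.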

Does that explanation make sense?

rodricios commented 2 years ago

@flavorjones, thank you!