digital-fabric / extralite

Ruby on SQLite
http://www.rubydoc.info/gems/extralite
MIT License
247 stars 7 forks source link

Custom transforms for creating arbitrarily complex POROs with identity maps #76

Open noteflakes opened 3 weeks ago

noteflakes commented 3 weeks ago

Right now Extralite supports custom transforms by providing a transform proc that takes the raw values and can return custom objects. A related idea is the ability to turn query results into arbitrarily complex PORO objects, with support for identity maps. This can be especially useful when performing joins where we want to turn the results into a bunch of related objects.

Some preliminary work towards this goal was done in commit bbcf12eb.

Example query:

SELECT books.id, books.title, authors.id, authors.name
  FROM books LEFT OUTER JOIN authors ON authors.id = books.author_id

What we want is an identity map expressing relations as nested objects:

# The author object is the same for both book objects.
[
  {
    id: 1,
    title: "foo",
    author: {
      id: 1,
      name: "bar"
    }
  },
  {
    id: 2,
    title: "baz",
    author: {
      id: 1,
      name: "bar"
    }
  }
]

The projection transform is expressed using a DSL:

transform = Extralite.project do
  # the identity modifier is used to express which field is used as key to the identity map
  id.identity
  title
  author {
    id.identity
    name
  }
end

transform.class #=> Extralite::Transform

books = DB.query(transform, sql)

So instead of passing a proc, we pass a transform object. The transform is performed at the C-level, accompanied by identity maps for both book and author objects.

An important limitation for this is that if identity maps are used, the entire result set must be converted before any rows are returned to the application code.

noteflakes commented 3 weeks ago

Internal reperesentation of the transform:

SPEC = {
  identity_idx: 0,
  columns: [
    :id,
    :title,
    {
      name: :author,
      dentity_idx: 0,
      columns: [:id, :name]
    }
  ]
}
noteflakes commented 3 weeks ago

Algorithm (without identity map):

class Transform
  def process(row_hash, spec)
    spec[:columns].each do |c|
      case c
      when Hash
        relation_hash = row_hash[c[:name]] = {}
        process(relation_hash, c)
      else
        row_hash[c] = get_next_column_value
      end
    end
  end
end
noteflakes commented 3 weeks ago

With identity map:

def prepare_spec(spec)
  if spec[:identity_idx]
    spec[:id_map] = {}
  end
  spec[:columns].each do |c|
    prepare_spec(c) if c.is_a?(Hash)
  end
end

def process(spec)
  id = get_column_value(spec[:identity_idx])
  if (row = spec[:id_map][id])
    return row
  end

  row = {}
  populate_row(row, spec)
  row
end

def populate_row(row, spec)
  spec[:columns].each_with_index do |c, i|
    case c
    when Hash
      row[c[:name]] = process(c)
    else
      value = row[c] = get_next_column_value
      if i == spec[:identity_idx]
        spec[:id_map][value] = row
      end
    end
  end
end
noteflakes commented 3 weeks ago

Working example here: https://gist.github.com/noteflakes/3653ec742d76fe5b6e3557315fe8f503

I'm really pleased with how relatively simple it is. Remains to be seen how useful that can be... There are some additional features to add:

noteflakes commented 2 weeks ago

Some more possibilities:

# custom value conversion
t = Extralite.transform do
  stamp.convert(Time) # Time.at(v)
  value.convert(Float) # v.to_f
  flag.convert(:bool) # as boolean
  flag.convert(MyClass) # MyClass.new(v)
  tag.convert { |v| "(#{v})" } # custom conversion proc
end

# return identity map instead of array
t = Extralite.transform do
  emit_map
  id.identity
  name
end

# one_to_many relationship
t = Extralite.transform do
  id.identity
  title
  tags [{
    id.identity
    name
  }]
end

# one_to_many emitted as single values
t = Extralite.transform do
  id.identity
  title
  tags [{
    emit_single_value name
    id.identity
    name
  }]
end