Closed gschlager closed 11 months ago
The only thing I'm not clear about is the nested
flag. What is its purpose?
The nested
flag is a breaking change. Currently it's possible to bind named parameters from nested hashes.
The following works because it binds y
and z
from the nested hash.
db.query('SELECT :x AS x, :y AS y, :z AS z', { x: 1, more: { y: 2, z: 3 } })
# => [{ x: 1, y: 2, z: 3 }]
I think that's an unexpected and undocumented behavior. I'd prefer if the previous code raises an error because it can't bind the more
key. It would be better if developers could define how nested objects are bound.
For example, the following code would bind more
as JSON string.
db.on_unhandled_parameter = ->(value) do
case value
when Hash
value.to_json
else
raise "don't know how to handle #{value.class}"
end
end
db.query('SELECT :x AS x, :more AS more', { x: 1, more: { y: 2, z: 3 } })
# => [{ x: 1, more: '{"y":2,"z":3}' }]
The original behavior can still be achieved by slightly changing how parameters are supplied.
db.query('SELECT :x AS x, :y AS y, :z AS z', { x: 1 }, { y: 2, z: 3 })
# => [{ x: 1, y: 2, z: 3 }]
db.query('SELECT :x AS x, :y AS y, :z AS z', { x: 1, y: 2, z: 3 })
# => [{ x: 1, y: 2, z: 3 }]
If we don't want this to be a breaking change, then I guess we could allow binding of nested objects (hash, struct) unless a on_unhandled_parameter
block is supplied.
What do you think?
I think that's an unexpected and undocumented behavior. I'd prefer if the previous code raises an error because it can't bind the
more
key. It would be better if developers could define how nested objects are bound.
Indeed that's an unintended behaviour.
For example, the following code would bind
more
as JSON string. ...> db.query('SELECT :x AS x, :more AS more', { x: 1, more: { y: 2, z: 3 } }) # => [{ x: 1, more: '{"y":2,"z":3}' }]
Personally I prefer being more explicit at the callsite, i.e. not using on_unhandled_parameter
:
> db.query('SELECT :x AS x, :more AS more', { x: 1, more: { y: 2, z: 3 }.to_json })
> # => [{ x: 1, more: '{"y":2,"z":3}' }]
The original behavior can still be achieved by slightly changing how parameters are supplied.
I don't mind breaking the behaviour for nested hashes.
Thinking about it some more, I think it would be beneficial if we came up with more use cases. The JSON one is to me not so interesting, as it can be done very easily at the callsite (see my previous comment).
I think a more interesting use case is where you use a custom object to supply parameters, for example an ORM instance:
foo_instance = Foo.new(x: 1, y: 2, z: 3)
db.query('insert into foo values (:x, :y, :z)', foo_instance)
To make this possible we can do one of the following:
db.on_unhandled_parameter { |v| v.is_a?(Foo) ? foo.values : (raise "Invalid parameter") }
#to_h
on the object:
class Foo
def to_h
@values
end
end
The first approach is more flexible. The second approach is simpler and relies on a well-known standard Ruby API that should already exist on container objects. There's pros and cons for each approach. But maybe before deciding which approach is better for us we should come up with more use cases. What do you think?
My main use case are Date
and Time
objects as shown in my PR description. Currently I have many places were data is generated and I don't want format the dates and times as strings in all those places. The requirement to convert them into a different data type (TEXT, INTEGER or REAL) seems like an implementation detail that should be hidden from the caller. So, my choices are to either add an intermediate layer that does the transformation or to transform the values when the parameters are bound. I prefer the latter.
Calling to_h
on objects that implement it seems like a good default.
Calling
to_h
on objects that implement it seems like a good default.
I'm working on code to do that. Will push momentarily.
I've also added tests for nested hashes and structs, which at the moment fail, as they expect a ParameterError
to be raised. If nested hashes are not supported, don't we need to raise an exception?
Thanks for your feedback and your code. I'll have a closer look tomorrow.
I think the main problem here is that if you pass a hash as a parameter, it can be interpreted as either an object that needs to be transformed in order to be used as a positional parameter, or as a set of named parameters:
hash = { foo: 1, bar: 2 }
db.query('select ?', hash) # hash should be converted using unhandled_parameter_proc
db.query('select :foo, :bar', hash) # hash treated as bunch of named parameters
In order to remove this ambiguity, we can do a simple heuristic where we scan the sql for ?
, which means that parameters are treated as positional. Otherwise they are treated as named. Then again, this introduces a certain complexity and may lead to subtle bugs for users of the library that will be hard to understand.
Here's an idea that might solve the problems we're encountering with this PR:
Instead of transforming just the parameters Extralite doesn't recognize, provide an API to transform all of them:
db.parameter_transform do |v|
case v
when Time
v.to_i
when Hash
v.to_json
else
v
end
end
This proc is then fed whatever is passed to Database#query
:
db.query_single_value('select ?', Time.now) #=> 1703195102
db.query_single_value('select ?', foo: 1) #=> '{ "foo": 1 }'
db.query_single_value('select ?', 'foo') #=> 'foo'
When defined, the result of the parameter_transform
proc is bound normally as is done now, including support for hashes for expressing named parameters. Note that in the example above you can't use named parameters, since the given hash is always transformed to JSON:
db.query_single_value('select :foo', foo: 1) #=> '{ "foo": 1 }'
But the following is possible:
db.parameter_transform do |v|
case v
when Time
v.to_i
else
v # if a hash is returned it is interpreted as a set of named parameters
end
end
db.query_single_value('select ?', Time.now) #=> 1703195102
db.query_single_value('select :foo', foo: 1) #=> 1
Some possible solutions to mixing together named parameters and transforming hashes to JSON:
hash.to_json
when providing it as a parameter.#parameter_transform
.#parameter_transform
.I know this is not ideal either, but it seems to me a much simpler solution that has a more consistent behaviour. I'll try to put together a separate PR for this idea.
I was trying to avoid calling a proc for every parameter for performance reasons, but I didn't benchmark it, so maybe it's not as bad as I thought it is. I guess we can give it a try.
I'm closing this. It's not worth the trouble and complexity. I'm probably going to implement it in a lightweight DB layer for my usecase. The less complex alternative in https://github.com/digital-fabric/extralite/pull/45 adds a performance penaltiy to every parameter binding which is even worse for performance.
@noteflakes Thanks for inviting me as collaborator. Still, I'd like to get your feedback before I merge something like this PR. Happy to make changes to it. I still need to write tests for prepared statements and update the README.
This will make it easier to bind values like
Date
orTime
by supplying a block that gets executed when you are trying to bind an unhandled parameter type.