Closed byucesoy closed 1 year ago
Not sure about the use of Marshal.dump & load. Any other way of doing deep copy should work fine.
I am OK with the general idea of making sure the default values are unique and not shared (unless frozen). The issue with this implementation is it causes a slowdown which in the vast majority of cases is not needed.
I think a better approach in this case would be to have the default value be a proc that returns a unique object. That would probably require a change to the pg_json
extension instead of the defaults_setter
extension.
That would fix my particular problem, but I'm afraid this would resurface in the future for another complex type. I didn't try yet, but I suspect pg_range would also have similar problem. There could be others as well. So, I think this needs to be fixed in defaults_setter.
Are you concerned about unnecessary overhead of deep copying simple types? For those, I expect dump/load to be quite fast, still we can have an if else statement to only run deep copy logic for complex types (checking for Array and Hash seems like enough). Or are you concerned about overhead of deep copying complex types? I think correct behavior is deep copying those even though it is slow. Running Model.new and then updating a column is quite common, which breaks next Model.new for complex types.
pg_hstore
and pg_row
would need similar behavior. pg_range
shouldn't, as ranges should be immutable (at least, there isn't a supported API for mutation).
My concern is that defaults_setter
already has support for separate values per instance, and that is having the default object respond to call
(typically, a proc), and the object is called to get the default value. I don't want to introduce a separate and less efficient way to handle separate values per instance. It's possible the proc used can use Marshal.load
(only need to dump on model setup/dataset change, not every call).
Just to be clear, my goal is not implementing separate default values per instance. I want to have single default value. The problem is if one instance is updated by caller (which is quite common), it also changes the default value. Thus next calls of Model.new uses updated value not the original default value.
My expectation would be Model.new to always come up with the value given as default. In our case we have a model called Strand. We set the default value like this;
Strand.plugin :defaults_setter, cache: true
Strand.default_values[:stack] = [{}]
Then at some point in the code we do something like this;
s1 = Strand.new
s1.stack << "This shouldn't exist in other Strands"
Then any new calls to Strand.new.stack returns [{}, "This shouldn't exist in other Strands"]
instead of the default value [{}]
.
Just to be clear, my goal is not implementing separate default values per instance. I want to have single default value. The problem is if one instance is updated by caller (which is quite common), it also changes the default value. Thus next calls of Model.new uses updated value not the original default value.
This is what is meant by separate default values per instance (each model instance has its own instance of the default value).
My expectation would be Model.new to always come up with the value given as default. In our case we have a model called Strand. We set the default value like this;
Strand.plugin :defaults_setter, cache: true Strand.default_values[:stack] = [{}]
Then at some point in the code we do something like this;
s1 = Strand.new s1.stack << "This shouldn't exist in other Strands"
Then any new calls to Strand.new.stack returns
[{}, "This shouldn't exist in other Strands"]
instead of the default value[{}]
.
The way to implement this behavior in Sequel is:
Strand.default_values[:stack] = proc{[{}]}`
We could potentially handle this in the defaults_setter
instead of in individual extensions, but the change should be in convert_default_value
, to use a proc for Hash
/Array
(and DelegateClasses of each).
Can you try the following and see if it works for you?
diff --git a/lib/sequel/plugins/defaults_setter.rb b/lib/sequel/plugins/defaults_setter.rb
index b20b47a5f..99bcff8d7 100644
--- a/lib/sequel/plugins/defaults_setter.rb
+++ b/lib/sequel/plugins/defaults_setter.rb
@@ -1,5 +1,7 @@
# frozen-string-literal: true
+require 'delegate'
+
module Sequel
module Plugins
# The defaults_setter plugin makes the column getter methods return the default
@@ -106,6 +108,18 @@ module Sequel
lambda{Date.today}
when Sequel::CURRENT_TIMESTAMP
lambda{dataset.current_datetime}
+ when Hash, Array
+ v = Marshal.dump(v).freeze
+ lambda{Marshal.load(v)}
+ when Delegator
+ klass = v.class
+ case o = v.__getobj__
+ when Hash, Array
+ v = Marshal.dump(o).freeze
+ lambda{klass.new(Marshal.load(v))}
+ else
+ v
+ end
else
v
end
Using proc{[{}]}
fixes the problem I'm hitting, but the applying the diff doesn't, which is weird because it passes the test case I added to this PR.
I'll try to find out the reason and report back.
Using
proc{[{}]}
fixes the problem I'm hitting, but the applying the diff doesn't, which is weird because it passes the test case I added to this PR.
The applying diff only affects cases where the plugin is parsing the default values from the database. If a user is overriding a default value, it is on them to use a callable object that returns a unique value.
Got it, thanks for the fix :)
Previously, default values were returned as a shallow copy, which works fine for most of the cases. However if the returned object is a complex object such as Hash or Array, which can be representation of PostgreSQL's JSONB column for example, shallow copy is not enough. If caller updates the returned object, default value also gets overwritten, thus all subsequent calls use the updated value.
Marshal.dump & Marshal.load is used to generate a deep copy of the object, which might be an overkill, but this seems like a most reliable way of creating a deep copy.