Closed sereneiconoclast closed 3 years ago
This is definitely a use case that would be helpful to support. My thought for this is to provide an enumeration, perhaps where you signal in each loop using whatever logic you wish what model class should be used. On mobile now but can sketch out an example soon.
Here's how I imagined this. Let's pretend we have a couple of tables here:
class Project
  include Aws::Record
  set_table_name(ENV["TABLE_NAME"])
  string_attr :uuid, hash_key: true
  string_attr :table_name, range_key: true
  string_attr :project_name
end

class Task
  include Aws::Record
  set_table_name(ENV["TABLE_NAME"])
  string_attr :uuid, hash_key: true
  string_attr :table_name, range_key: true
  string_attr :task_name
  string_attr :parent_project_uuid
  string_attr :status
end
Fairly simple example, but we could then run this against any table class:
scan = Project.build_scan.multi_model_filter do |raw_item_attributes|
  if raw_item_attributes[:table_name] == "PROJECT"
    Project
  elsif raw_item_attributes[:table_name] == "TASK"
    Task
  else
    nil
  end
end
What I'm imagining here is that we let you pass in a block rather than calling complete!, for example, and the block returns the model class based on any manipulation of the raw item that you like, or nil if no model applies and the item should be skipped. This could also apply to built queries, though as a limitation, you need some sort of model class to use as a starting point. That seems like a reasonable compromise, since you could have a base class for single-table query building as needed.
I should add, when you run scan.each etc., the items in that enumeration would be instances of the appropriate class, as specified by the filter block code.
So presumably, if you use this logic, you need to be prepared for heterogeneous sets, but you're opting in to that behavior anyways.
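To make the idea concrete, here is a minimal sketch of how such a filter could behave, written in plain Ruby so it runs without the gem or a table. Everything in it is an assumption for illustration: MultiModelCollection, ProjectItem and TaskItem are hypothetical names, the Structs stand in for the Aws::Record models, and the array of hashes stands in for raw scan results. It is not how aws-record implements anything.

```ruby
# Hypothetical stand-ins for the models above; plain Structs keep the sketch
# runnable without the aws-record gem or a DynamoDB table.
ProjectItem = Struct.new(:uuid, :table_name, :project_name, keyword_init: true)
TaskItem = Struct.new(:uuid, :table_name, :task_name, :parent_project_uuid,
                      :status, keyword_init: true)

# Hypothetical collection: wraps raw attribute hashes and builds each item with
# whatever class the filter block returns, skipping the item on nil.
class MultiModelCollection
  include Enumerable

  def initialize(raw_items, &model_filter)
    @raw_items = raw_items
    @model_filter = model_filter
  end

  def each
    return enum_for(:each) unless block_given?

    @raw_items.each do |raw|
      model_class = @model_filter.call(raw)
      next if model_class.nil?

      # Keep only the attributes the chosen model declares.
      yield model_class.new(**raw.slice(*model_class.members))
    end
  end
end

raw_items = [
  { uuid: "p1", table_name: "PROJECT", project_name: "Launch" },
  { uuid: "t1", table_name: "TASK", task_name: "Write docs",
    parent_project_uuid: "p1", status: "OPEN" },
  { uuid: "x1", table_name: "AUDIT" } # no model applies, so it is skipped
]

scan = MultiModelCollection.new(raw_items) do |raw_item_attributes|
  case raw_item_attributes[:table_name]
  when "PROJECT" then ProjectItem
  when "TASK" then TaskItem
  end
end

scan.each { |item| puts "#{item.class}: #{item.uuid}" }
```

The caller gets a heterogeneous enumeration, and the unmatched item never shows up because the block returned nil for it.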
Nice. So the build_scan or build_query is returning a builder as an intermediate result, and the multi_model_filter is augmenting it... similar to RSpec's syntax for programming mocks: expect(thing).to receive(:method_name).with(...).and_return(...)
An alternate style would be to accept a Proc as an optional argument, so you could write:
BaseTable.query(...normal query terms...,
  select_model: ->(raw_attributes) { ... some logic returning Project, Task, or nil }
)
This doesn't look as clean as the style you suggested, but it's probably less work to implement. I'd be happy with either.
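For comparison, the Proc-argument style could be sketched like this. Again this is only an illustration under assumptions: BaseTableSketch, ProjectRow and TaskRow are hypothetical names, and the method takes an array of raw hashes instead of real query terms so it can run on its own.

```ruby
# Hypothetical stand-ins for the model classes.
ProjectRow = Struct.new(:uuid, :table_name, :project_name, keyword_init: true)
TaskRow = Struct.new(:uuid, :table_name, :task_name, :status, keyword_init: true)

module BaseTableSketch
  # raw_items stands in for the attribute hashes a real query would fetch.
  # select_model is any callable returning a model class, or nil to skip.
  def self.query(raw_items, select_model:)
    raw_items.each_with_object([]) do |raw, results|
      model_class = select_model.call(raw)
      next unless model_class

      results << model_class.new(**raw.slice(*model_class.members))
    end
  end
end

select_model = lambda do |raw_attributes|
  { "PROJECT" => ProjectRow, "TASK" => TaskRow }[raw_attributes[:table_name]]
end

raw_rows = [
  { uuid: "p1", table_name: "PROJECT", project_name: "Launch" },
  { uuid: "t1", table_name: "TASK", task_name: "Write docs", status: "OPEN" },
  { uuid: "x1", table_name: "OTHER" }
]

results = BaseTableSketch.query(raw_rows, select_model: select_model)
results.each { |row| puts "#{row.class}: #{row.uuid}" }
```

The selection logic is the same in both styles; the difference is only whether it is attached to a builder with a block or passed alongside the query terms.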
So presumably, if you use this logic, you need to be prepared for heterogeneous sets, but you're opting in to that behavior anyways.
Yes, the straightforward behavior would be to return a single array containing objects of various types, in whatever order they were found. It might be nicer for the consumer, perhaps, to return a Hash grouping them by type:
{
  Project => [project_1, project_2, project_3...],
  Task => [task_1, task_2...]
}
...since nearly everyone will, as a first step, be sorting through the results in this fashion.
Bonus: Allowing the block to return nil to mean "skip this" means this also functions similarly to aws dynamodb query --filter-expression.
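For reference, the grouping described above is a one-liner on the consumer side with Enumerable#group_by, so a flat array does not prevent it. A small sketch, with hypothetical Struct classes standing in for the models:

```ruby
# Hypothetical stand-ins for the model classes.
ProjectRecord = Struct.new(:uuid)
TaskRecord = Struct.new(:uuid)

# A mixed result set, in whatever order the items were found.
mixed = [
  ProjectRecord.new("p1"),
  TaskRecord.new("t1"),
  ProjectRecord.new("p2")
]

# Bucket the items by their class; order within each bucket is preserved.
by_type = mixed.group_by(&:class)

by_type.each do |model_class, items|
  puts "#{model_class}: #{items.map(&:uuid).join(', ')}"
end
```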
I've got a draft PR (#108) that implements this - I'm still thinking through some behavior...
It might be nicer for the consumer, perhaps, to return a Hash sorting them by type
Since results are returned page by page it would require iterating through the entire set to build a sorted Hash which in many cases isn't desirable.
I'd say that's actually an unacceptable outcome. It should be returned one page at a time no matter what, otherwise you can accidentally pull up millions of records.
Since results are returned page by page it would require iterating through the entire set to build a sorted Hash which in many cases isn't desirable.
I'm not sure I see what one has to do with the other. You can still return paginated results, it's just that each page would be a Hash with items bucketed by type.
If you don't think it desirable then you could certainly leave that part out. But I know that, as a customer, the first thing I'm going to do with each heterogeneous result set is to divide them up by item type, and I would expect most consumers would do likewise.
Yes true - the important part is the one page at a time thing.
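Both points can hold at once: each page can be bucketed by type while pages are still fetched one at a time. A rough sketch, assuming a hypothetical page source that yields one array of already-built items per page (the Enumerator below stands in for the paginated responses, and the counter only exists to show how many pages were pulled):

```ruby
# Hypothetical stand-ins for the model classes.
PagedProject = Struct.new(:uuid)
PagedTask = Struct.new(:uuid)

pages_fetched = 0

# Stand-in for a paginated result: each element is one page of items.
pages = Enumerator.new do |yielder|
  [
    [PagedProject.new("p1"), PagedTask.new("t1"), PagedTask.new("t2")],
    [PagedProject.new("p2")]
  ].each do |page|
    pages_fetched += 1
    yielder << page
  end
end

# Lazily turn each page into a Hash of items bucketed by class.
grouped_pages = pages.lazy.map { |page| page.group_by(&:class) }

# Asking for the first page only pulls one page from the source.
first_page = grouped_pages.first
puts first_page[PagedTask].map(&:uuid).inspect
puts "pages fetched: #{pages_fetched}"
```

Because the mapping is lazy, nothing beyond the requested page is touched, which keeps the "one page at a time" guarantee intact.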
From what I can tell, currently all queries are performed through a specific model type, and must therefore return only records of that type.
I'd like to execute a single query that can return records of mixed types:
Instead of calling User.query(...), I would then call BaseTable.query(...) (as per issue 92) and pass a Proc whose job is to examine the Hash of raw attribute values and return a reference to the appropriate child class to instantiate (User, Order, Review). In the example above, I'd probably do that by looking at the first word of the range key. It could do something else instead, such as attempting to match each range key against a regex ("this looks like a phone number"), or switching based on some other attribute (item_type=="User"), or going by which attributes are present and which aren't.
Does this sound reasonable?
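As an illustration of what such a Proc might look like, here is a sketch that picks the class from the first word of the range key and falls back to an item_type attribute. The key format ("ORDER 2021-03-01"), the attribute names :range_key and :item_type, and the Struct classes are all assumptions made for the example.

```ruby
# Hypothetical stand-ins for the child model classes.
UserItem = Struct.new(:attributes)
OrderItem = Struct.new(:attributes)
ReviewItem = Struct.new(:attributes)

MODELS_BY_NAME = {
  "USER" => UserItem,
  "ORDER" => OrderItem,
  "REVIEW" => ReviewItem
}.freeze

# Returns the class to instantiate for a raw attribute Hash, or nil to skip it.
select_model = lambda do |raw|
  # First choice: the first word of the range key, e.g. "ORDER 2021-03-01".
  first_word = raw[:range_key].to_s.split.first
  # Fallback: an explicit item_type attribute, e.g. "User".
  MODELS_BY_NAME[first_word] || MODELS_BY_NAME[raw[:item_type].to_s.upcase]
end

samples = [
  { range_key: "ORDER 2021-03-01" },
  { range_key: "profile", item_type: "User" },
  { range_key: "555-1234" }
]

samples.each do |raw|
  puts "#{raw[:range_key]} -> #{select_model.call(raw).inspect}"
end
```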