Open elrayle opened 4 years ago
One significant difference will be how changes to the FileMetadata are persisted. You won't be able to just query for it, change it, and save it. With this strategy, changing one FileMetadata node will probably save all the other ones in the FileSet (at least for now).
as discussed in slack, this is particularly an issue for Wings, which would need to query each FileMetadata independently in order to access the FileSet; i.e. it would create an N+1 query problem for FileSet access, over the number of FileMetadata objects.
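A rough sketch of the N+1 shape described here, in plain Ruby. `CountingStore` and `load_file_set_with_files` are hypothetical stand-ins invented for illustration, not Wings or Valkyrie APIs; the only point is to count backend round trips when each FileMetadata has to be fetched on its own.

```ruby
# Hypothetical in-memory stand-in for a backend that stores each
# FileMetadata separately from its FileSet (not the real Wings API).
class CountingStore
  attr_reader :query_count

  def initialize(records)
    @records = records
    @query_count = 0
  end

  # Each call models one round trip to the backend.
  def find(id)
    @query_count += 1
    @records.fetch(id)
  end
end

# Build a FileSet whose nested files must each be fetched individually.
def load_file_set_with_files(store, file_set_id)
  file_set = store.find(file_set_id)                       # 1 query
  files = file_set[:file_ids].map { |id| store.find(id) }  # N queries
  file_set.merge(files: files)
end

records = {
  "fs1" => { id: "fs1", file_ids: %w[fm1 fm2 fm3] },
  "fm1" => { id: "fm1", use: :original_file },
  "fm2" => { id: "fm2", use: :thumbnail },
  "fm3" => { id: "fm3", use: :extracted_text }
}

store = CountingStore.new(records)
loaded = load_file_set_with_files(store, "fs1")
puts "files loaded: #{loaded[:files].size}, queries issued: #{store.query_count}"
```

With three FileMetadata records this issues four lookups (one for the FileSet plus one per file), so the cost grows linearly with the number of files. A real benchmark would need to measure actual backend latency rather than a counter.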
there's a second question about whether the one-FileSet per File restriction is acceptable (discussion yesterday concluded loosely "yes").
and still a third question about whether it's acceptable for FileMetadata saves to necessitate FileSet saves. i'm less certain about this last one, leaning toward "it's not ideal, but may be worth it for the other benefits of the nesting model".
i'm stuck on the first issue though, and don't think we can seriously consider nesting without well-considered benchmarks showing the N+1 issue to be a non-problem up to a "reasonable" number of FileSets. how many is "reasonable"? without telemetry data, i think our best bet would be to ask in slack, via email, and in Samvera Tech.
Descriptive summary
In Valkyrie, there are two options for making an association between two resources:
This issue explores the impact of these choices on the Wings adapter and on Hyrax in general.
Impact on Wings
Abbreviations:
query_service.find_by(id: file_set)
NOTE: This assumes that VR FileSet with embedded FileMetadata expects an eager load of nested resources and that the nested resource is accessible in-memory from the VR FileSet (i.e. vr_file_set.files.each -- all files are already in-memory and ready to be processed)
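The assumption in this NOTE can be pictured with a small plain-Ruby sketch. `FileSet`, `FileMetadata`, and `NestedQueryService` below are illustrative stand-ins, not the Valkyrie or Hyrax classes; the sketch only assumes that `find_by` hands back a FileSet whose nested files are already loaded.

```ruby
# Illustrative stand-ins only; not Valkyrie::Resource subclasses.
FileMetadata = Struct.new(:id, :use, keyword_init: true)
FileSet = Struct.new(:id, :files, keyword_init: true)

# Hypothetical query service whose find_by returns the FileSet with its
# nested FileMetadata already attached (the eager-load assumption).
class NestedQueryService
  attr_reader :query_count

  def initialize(stored)
    @stored = stored
    @query_count = 0
  end

  def find_by(id:)
    @query_count += 1
    @stored.fetch(id)
  end
end

stored = {
  "fs1" => FileSet.new(
    id: "fs1",
    files: [
      FileMetadata.new(id: "fm1", use: :original_file),
      FileMetadata.new(id: "fm2", use: :thumbnail)
    ]
  )
}

query_service = NestedQueryService.new(stored)
vr_file_set = query_service.find_by(id: "fs1")

# Iterating the nested files touches only in-memory objects.
uses = vr_file_set.files.map(&:use)
puts "uses: #{uses.inspect}, queries issued: #{query_service.query_count}"
```

Under that assumption, walking `vr_file_set.files` costs no further lookups after the single `find_by`. Whether a given adapter can actually satisfy the assumption in one round trip is the open question.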
persister.save(resource: file_set)
This is not substantially different performance-wise.
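To make the save behavior of the nested model concrete, here is a minimal sketch under the assumption that saving a FileSet rewrites every embedded FileMetadata along with it. `RecordingPersister` and the Structs are hypothetical and only record which nodes a save would touch; they are not the Valkyrie persister.

```ruby
# Illustrative stand-ins only; not the Valkyrie or Hyrax classes.
FileMetadata = Struct.new(:id, :label, keyword_init: true)
FileSet = Struct.new(:id, :files, keyword_init: true)

# Hypothetical persister that records every node written by a save.
class RecordingPersister
  attr_reader :written_ids

  def initialize
    @written_ids = []
  end

  # Saving the parent writes the parent and all of its nested files.
  def save(resource:)
    @written_ids << resource.id
    resource.files.each { |file| @written_ids << file.id }
    resource
  end
end

file_set = FileSet.new(
  id: "fs1",
  files: [
    FileMetadata.new(id: "fm1", label: "a.tif"),
    FileMetadata.new(id: "fm2", label: "b.tif")
  ]
)

persister = RecordingPersister.new

# Change a single nested node, then save through the parent.
file_set.files.first.label = "renamed.tif"
persister.save(resource: file_set)
puts "nodes written: #{persister.written_ids.inspect}"
```

In this sketch it is still a single `save` call, but the untouched `fm2` is rewritten together with the changed `fm1`.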
custom_queries
find_file_metadata_id(id:)
find_file_metadata_by_alternate_identifier(alternate_identifier:)
find_many_file_metadata_by_ids(ids:)
find_many_file_metadata_by_use(resource:, use:), where resource is the VR file_set
Needs more analysis.