Closed by eecavanna 1 month ago
I'm ready for this to be merged into the nmdc-schema repo. I want to use it in the berkeley-schema-fy24 fork repo.
I won't delete the branch until you've checked whether my commit did anything nasty and, if it did, whether it can easily be resolved.
@turbomam, thanks for bringing this additional commit to my attention. I looked at its contents (diff) and don't have any concerns. I'll delete the branch now.
Summary
In this branch, I implemented a new adapter method. Its name is `set_field_of_each_document`.

I designed the method with the upcoming Berkeley schema migrations in mind. Specifically, one of the migrators assigns the same `type` value to every document in a collection. That migrator currently uses the `process_each_document` method, which does an ETL (extract, transform, load) process on each document. It's a relatively general-purpose method.

In contrast, regardless of what the original document contains, this new method always sets the specified field to the specified value (so, instead of ETL, it's just "L": the loading of the specified value into the existing document). Because this method's responsibility is narrower, it can use a more optimized query under the hood. I expect that this method will speed up the Berkeley schema migration.
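
To illustrate the difference, here is a minimal sketch using a hypothetical in-memory adapter. The class name `InMemoryAdapter`, both method signatures, and the sample collection and `type` value are assumptions made for illustration; they are not copied from the code in this branch.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


class InMemoryAdapter:
    """Hypothetical stand-in for a migration adapter (not the real class)."""

    def __init__(self, database: Dict[str, List[Document]]) -> None:
        self._db = database

    def process_each_document(
        self, collection_name: str, pipeline: List[Callable[[Document], Document]]
    ) -> None:
        # General-purpose ETL: extract each document, run it through every
        # transformation function, then load the result back in its place.
        documents = self._db.get(collection_name, [])
        for index, document in enumerate(documents):
            transformed = dict(document)  # extract (shallow copy)
            for transform in pipeline:
                transformed = transform(transformed)  # transform
            documents[index] = transformed  # load

    def set_field_of_each_document(
        self, collection_name: str, field_name: str, value: Any
    ) -> None:
        # "Load only": the existing content is never inspected or transformed.
        # A database-backed adapter could plausibly express this as a single
        # bulk update (for example, MongoDB's update_many with $set); that is
        # an assumption, not a description of this branch's implementation.
        for document in self._db.get(collection_name, []):
            document[field_name] = value


# Illustrative data: the collection name and type value are made up here.
db = {"study_set": [{"id": "s1"}, {"id": "s2", "type": "old"}]}
adapter = InMemoryAdapter(db)
adapter.set_field_of_each_document("study_set", "type", "nmdc:Study")
```

With the general-purpose method, the same result would require a pipeline function that receives each document and returns a modified copy, so each document is handled individually. The narrower method needs only the field name and the value.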