microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

Migrations: Implement `set_field_of_each_document` Adapter method #2008

Closed eecavanna closed 4 months ago

eecavanna commented 4 months ago

Proposed method signature

    @abstractmethod
    def set_field_of_each_document(
        self,
        collection_name: str,
        field_name: str,
        value: Union[None, str, int, float, bool],
    ) -> None:
        r"""
        Sets the specified field of each document in the specified collection
        to the specified value (the same value is used for every document).
        """
        pass

This method would be similar to the process_each_document method. However, since the value does not depend on the contents of any document, the Mongo implementation can use Mongo's update function instead of reading and rewriting each document. I think that will be faster (in terms of execution time) than the "ETL" process used by process_each_document.
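To illustrate the idea, here is a minimal sketch of how two adapters might implement the proposed method. The class names `AdapterBase`, `DictionaryAdapter`, and `MongoAdapter`, the `Scalar` alias, and the constructor arguments are stand-ins chosen for this example and may not match the real classes. The Mongo variant assumes `database` is a pymongo `Database` and uses pymongo's `Collection.update_many` with an empty filter and a `$set` update, which is one way to realize "Mongo's update function".

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Union

# Alias for the value types in the proposed signature (name is an assumption).
Scalar = Union[None, str, int, float, bool]


class AdapterBase(ABC):
    """Hypothetical stand-in for the migration adapter base class."""

    @abstractmethod
    def set_field_of_each_document(
        self, collection_name: str, field_name: str, value: Scalar
    ) -> None:
        """Sets the specified field of each document to the same value."""


class DictionaryAdapter(AdapterBase):
    """In-memory variant: each collection is a list of dicts."""

    def __init__(self, database: Dict[str, List[Dict[str, Any]]]) -> None:
        self._db = database

    def set_field_of_each_document(
        self, collection_name: str, field_name: str, value: Scalar
    ) -> None:
        # A missing collection is treated as empty (an assumption of this sketch).
        for document in self._db.get(collection_name, []):
            document[field_name] = value


class MongoAdapter(AdapterBase):
    """Mongo variant; `database` is assumed to be a pymongo `Database`."""

    def __init__(self, database: Any) -> None:
        self._db = database

    def set_field_of_each_document(
        self, collection_name: str, field_name: str, value: Scalar
    ) -> None:
        # The empty filter matches every document, and `$set` assigns the
        # field on the server, so no document is loaded into Python.
        self._db[collection_name].update_many({}, {"$set": {field_name: value}})
```

With this shape, a migrator could call something like `adapter.set_field_of_each_document("study_set", "some_field", None)` (collection and field names here are made up) without knowing which backend is in use.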