all-of-us / workbench-snippets

Code snippets for use in All of Us Workbench notebooks.
BSD 3-Clause "New" or "Revised" License

Parallelize retrieval of snapshot comments. #75

Closed by deflaux 2 years ago

deflaux commented 2 years ago

Fixes https://github.com/all-of-us/workbench-snippets/issues/54

For a workspace making heavy use of HTML snapshots, the "show all comments" tab was taking more than 3 minutes to display its output for 72 snapshots. Now it takes less than 2 seconds! The biggest gain came from switching the repetitive per-snapshot task from gsutil to a Python GCS library, but the addition of multiprocess also helps with the speedup.
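For readers who want the general shape of the change, here is a minimal sketch, not the code from this PR. It assumes the comments are small text objects in a bucket and that the `google-cloud-storage` package is available; `make_gcs_fetcher`, `fetch_all`, and the bucket and path names are hypothetical. It also swaps in a standard-library thread pool rather than multiprocess, since the work is I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def make_gcs_fetcher(bucket_name: str) -> Callable[[str], str]:
    """Return a function that reads one object as text via the GCS client library.

    Hypothetical helper: it replaces one gsutil subprocess per object with an
    in-process API call. The import is lazy so the rest of the sketch runs
    without google-cloud-storage installed.
    """
    from google.cloud import storage

    # Create the client and bucket handle once and reuse them for every object.
    bucket = storage.Client().bucket(bucket_name)

    def fetch_one(blob_path: str) -> str:
        return bucket.blob(blob_path).download_as_text()

    return fetch_one


def fetch_all(
    paths: Iterable[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every path concurrently and return a {path: content} mapping."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        contents = list(pool.map(fetch_one, paths))
    return dict(zip(paths, contents))


# Example call (assumed bucket and paths, requires credentials):
# comments = fetch_all(comment_paths, make_gcs_fetcher("my-workspace-bucket"))
```

Passing the fetch function as a parameter keeps the concurrency logic independent of GCS, so it can be exercised with a stand-in such as a dictionary lookup.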

Unfortunately, we don't have automated testing configured for the code in this repository yet, so we set up this checklist as an automatic reminder:

Questions? See CONTRIBUTING.md or file an issue so that we can get it documented!

deflaux commented 2 years ago

@rfrancis1 just FYI regarding this speedup