Kitware / UPennContrast

UPenn ?
https://upenn-contrast.netlify.com/
Apache License 2.0
8 stars 6 forks source link

Optimize save many #683

Closed bruyeret closed 3 months ago

bruyeret commented 3 months ago

Also reformat all python code and add flake8 linter check to CI

Should solve a lot of performance issues: Fix #670 Fix #628

bruyeret commented 3 months ago

@luciemac you can only review the two first commits, the last one is only styling

arjunrajlab commented 3 months ago

Tried running, I ended up with this error when I tried to build girder:

=> ERROR [girder 14/14] RUN girder build                                                                                                                                           1.1s 
------                                                                                                                                                                                   
 > [girder 14/14] RUN girder build:                                                                                                                                                      
1.013 Traceback (most recent call last):                                                                                                                                                 
1.013   File "/venv/bin/girder", line 8, in <module>                                                                                                                                     
1.013     sys.exit(main())                                                                                                                                                               
1.013              ^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
1.013     return self.main(*args, **kwargs)
1.013            ^^^^^^^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
1.013     rv = self.invoke(ctx)
1.013          ^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
1.013     return _process_result(sub_ctx.command.invoke(sub_ctx))
1.013                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
1.013     return ctx.invoke(self.callback, **ctx.params)
1.013            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
1.013     return __callback(*args, **kwargs)
1.013            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/girder/cli/build.py", line 52, in main
1.013     pluginDependencies = _collectPluginDependencies()
1.013                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/girder/cli/build.py", line 108, in _collectPluginDependencies
1.013     for pluginName in allPlugins():
1.013                       ^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/girder/plugin.py", line 211, in allPlugins
1.013     return list(_getPluginRegistry().keys())
1.013                 ^^^^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/girder/plugin.py", line 179, in _getPluginRegistry
1.013     pluginClass = entryPoint.load()
1.013                   ^^^^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2482, in load
1.013     return self.resolve()
1.013            ^^^^^^^^^^^^^^
1.013   File "/venv/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2488, in resolve
1.014     module = __import__(self.module_name, fromlist=['__name__'], level=0)
1.014              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.014   File "/src/AnnotationPlugin/upenncontrast_annotation/__init__.py", line 15, in <module>
1.014     from .server.models.annotation import Annotation as AnnotationModel
1.014   File "/src/AnnotationPlugin/upenncontrast_annotation/server/models/annotation.py", line 2, in <module>
1.014     from ..helpers.proxiedModel import ProxiedAccessControlledModel
1.014   File "/src/AnnotationPlugin/upenncontrast_annotation/server/helpers/proxiedModel.py", line 6, in <module>
1.014     from ..models.history import History as HistoryModel
1.014   File "/src/AnnotationPlugin/upenncontrast_annotation/server/models/history.py", line 5, in <module>
1.014     from .documentChange import DocumentChange as DocumentChangeModel
1.014   File "/src/AnnotationPlugin/upenncontrast_annotation/server/models/documentChange.py", line 4, in <module>
1.014     from helpers.customModel import CustomAccessControlledModel
1.014 ModuleNotFoundError: No module named 'helpers'
------
failed to solve: process "/bin/sh -c girder build" did not complete successfully: exit code: 1
arjunrajlab commented 3 months ago

Also @bruyeret, random question: we have noticed that downloading annotations also seems to be fairly slow (on the order of minutes for ~25K annotations). Do you think a similar issue may be going on there? Are those endpoints also querying one at a time?

bruyeret commented 3 months ago

Sorry for the delay @arjunrajlab, the PR should be ready I also changed the library that we use for json validation as it was a bottleneck too, so it took a little bit more time than expected Now most of the things that use to take time are as fast as possible I tried with 50k annotations at it takes 10s to upload 👍 You can merge before @luciemac 's approval if the schedule is tight, she can still review later

arjunrajlab commented 3 months ago

@bruyeret Thank you! Do I need to rebuild workers to take advantage of any changes to the API? Or should it work if I rebuild the front end and Girder?

arjunrajlab commented 3 months ago

@bruyeret OK, wow, considerably faster! I did notice that property computation is still relatively slow. Do I need to rebuild those workers or something? Or did that API call not get fixed yet?

arjunrajlab commented 3 months ago

I checked on property computation (simple blob metric worker), and I got the following error:

(NOTE: when I built the worker image, I used the optimize-save-many branch when I installed UPennContrast.)

[2024-06-04 18:55:55,876: WARNING/ForkPoolWorker-16] compute(datasetId, apiUrl, token, params)
[2024-06-04 18:55:55,880: WARNING/ForkPoolWorker-16] File "/entrypoint.py", line 60, in compute
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] workerClient.add_multiple_annotation_property_values(dataset_property_value_dict)
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] File "/UPennContrast/devops/girder/annotation_client/annotation_client/workers.py", line 203, in add_multiple_annotation_property_values
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] self.annotationClient.addMultipleAnnotationPropertyValues(
  File "/UPennContrast/devops/girder/annotation_client/annotation_client/annotations.py", line 310, in addMultipleAnnotationPropertyValues
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] return self.client.post(
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] ^^^^^^^^^^^^^^^^^
  File "/root/miniforge3/envs/worker/lib/python3.11/site-packages/girder_client/__init__.py", line 477, in post
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] return self.sendRestRequest('POST', path, parameters, files=files,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniforge3/envs/worker/lib/python3.11/site-packages/girder_client/__init__.py", line 462, in sendRestRequest
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] raise HttpError(
[2024-06-04 18:55:55,881: WARNING/ForkPoolWorker-16] girder_client.HttpError: HTTP error 400: POST http://girder:8080/api/v1//annotation_property_values/multiple
Response text: {"message": "data.values.66168adea63eec625045b4a7 cannot be validated by any definition", "type": "validation"}