whole-tale / girder_wholetale

Girder plugin providing basic Whole Tale functionality
BSD 3-Clause "New" or "Revised" License

Add a basic metrics tracking capability for common functions #564

Closed Xarthisius closed 1 year ago

Xarthisius commented 1 year ago

Arguably this should have happened 5 years ago, but here we are... This PR introduces a custom logger that saves records of particular actions in the database. Currently it tracks:

wouldn't it be nice to have those metrics for NSF reporting?

How to test

  1. Deploy
  2. Perform all the actions above
  3. In girder-shell:
    from girder.plugins.wholetale.lib.metrics import Record
    list(Record().find())
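Once the records are listed, a quick tally per action type makes it easier to confirm that every action was captured. The following is a minimal sketch, not part of the PR: `count_by_type` is a hypothetical helper, it assumes each record is a dict with a `type` key (as in the `instance.remove` record shown later in this thread), and `tale.create` is an invented type name used only as sample data.

```python
from collections import Counter


def count_by_type(records):
    """Tally metric records by their 'type' field (hypothetical helper)."""
    # Records without a 'type' key are grouped under "unknown".
    return Counter(rec.get("type", "unknown") for rec in records)


# Stand-in data so the sketch runs outside girder-shell.
sample = [
    {"type": "instance.remove"},
    {"type": "instance.remove"},
    {"type": "tale.create"},  # assumed type name, for illustration only
]

for record_type, count in count_by_type(sample).most_common():
    print(record_type, count)
```

In girder-shell, the same function could be fed `Record().find()` instead of `sample`, assuming the cursor yields dict-like documents.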
codecov[bot] commented 1 year ago

Codecov Report

Base: 92.80% // Head: 92.83% // Increases project coverage by +0.02% :tada:

Coverage data is based on head (0cd50e1) compared to base (fe022c0). Patch coverage: 95.83% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
+ Coverage   92.80%   92.83%   +0.02%
==========================================
  Files          60       61       +1
  Lines        4820     4868      +48
==========================================
+ Hits         4473     4519      +46
- Misses        347      349       +2
```

| [Impacted Files](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale) | Coverage Δ | |
|---|---|---|
| [server/schema/misc.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3NjaGVtYS9taXNjLnB5) | `100.00% <ø> (ø)` | |
| [server/models/tale.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL21vZGVscy90YWxlLnB5) | `97.28% <81.81%> (-0.82%)` | :arrow_down: |
| [server/\_\_init\_\_.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL19faW5pdF9fLnB5) | `90.88% <100.00%> (+0.14%)` | :arrow_up: |
| [server/lib/metrics.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL2xpYi9tZXRyaWNzLnB5) | `100.00% <100.00%> (ø)` | |
| [server/models/instance.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL21vZGVscy9pbnN0YW5jZS5weQ==) | `87.50% <100.00%> (+0.29%)` | :arrow_up: |
| [server/tasks/copy\_workspace.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3Rhc2tzL2NvcHlfd29ya3NwYWNlLnB5) | `100.00% <100.00%> (ø)` | |
| [server/tasks/import\_binder.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3Rhc2tzL2ltcG9ydF9iaW5kZXIucHk=) | `93.58% <100.00%> (+0.06%)` | :arrow_up: |
| [server/tasks/import\_git\_repo.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3Rhc2tzL2ltcG9ydF9naXRfcmVwby5weQ==) | `97.50% <100.00%> (+0.06%)` | :arrow_up: |
| [server/tasks/import\_tale.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3Rhc2tzL2ltcG9ydF90YWxlLnB5) | `100.00% <100.00%> (ø)` | |
| [server/tasks/register\_dataset.py](https://codecov.io/gh/whole-tale/girder_wholetale/pull/564/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=whole-tale#diff-c2VydmVyL3Rhc2tzL3JlZ2lzdGVyX2RhdGFzZXQucHk=) | `100.00% <100.00%> (ø)` | |


craig-willis commented 1 year ago

This works as expected, but raises questions for me about how these specific events tie to higher-level metrics we can use in reporting. Some of these tend towards technical (instance deletion) and may lack certain details (do we track instances of specific environments if the environment changes?)

It made me think about what reporting could look like (à la CSDMS) and led me to an oldie but goodie. Obviously many of these are just reports on existing data, but it might be helpful to think through any additional data we need to collect.

Xarthisius commented 1 year ago

tbh I wasn't even thinking about reporting for funding agencies. I wanted to see answers to questions that I personally have. The DELETE /instance metric in particular keeps the state of the instance being deleted, which is useful for getting the % of failures. Currently, I only track taleId in instance-related events, but I could just as easily add the imageId being used.
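As a rough illustration of the failure percentage mentioned above, here is a sketch that works on plain dicts shaped like the `instance.remove` records. `failure_percentage` is a hypothetical helper, and the default `error_status=2` is an assumption about which status code marks a failed instance; check it against the plugin's instance status constants before relying on it.

```python
def failure_percentage(records, error_status=2):
    """Return the % of removed instances that were in an error state.

    error_status=2 is an ASSUMED value for the failed state; pass the
    real constant from the plugin when using this for actual reporting.
    """
    removals = [r for r in records if r.get("type") == "instance.remove"]
    if not removals:
        return None  # no removals recorded, so the ratio is undefined
    failed = sum(
        1 for r in removals
        if r.get("details", {}).get("status") == error_status
    )
    return 100.0 * failed / len(removals)


# Stand-in records; only the fields used by the helper are included.
sample = [
    {"type": "instance.remove", "details": {"status": 1}},
    {"type": "instance.remove", "details": {"status": 1}},
    {"type": "instance.remove", "details": {"status": 2}},
    {"type": "instance.remove", "details": {"status": 1}},
    {"type": "tale.create", "details": {}},  # assumed type name, ignored
]

print(failure_percentage(sample))
```

Returning `None` for an empty set keeps "no data" distinct from "0% failures".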

"Thinking through" is precisely the point of this PR. Additional suggestions are highly encouraged :)

Xarthisius commented 1 year ago

Added info about the container that was running to the instance.remove metrics:

 {'_id': ObjectId('639b44cc7c7bb3fb9e8a4664'),
  'type': 'instance.remove',
  'details': {'id': ObjectId('639b449b7c7bb3fb9e8a4635'),
   'taleId': ObjectId('639b41b4e81d22b4bb8402d4'),
   'status': 1,
   'containerInfo': {'mountPoint': '/var/lib/docker/volumes/639b41b4e81d22b4bb8402d4_kowalikk_5Tb5Yb/_data',
    'digest': 'registry.local.wholetale.org/tale/b3fe3003ade99e0d3b07702430c16712:20f366cbe8bed898977c28417230ebcb@sha256:bdf23063af6b67b3590a9ed604aee1ddd7aca08ace2908ca95461c040f2b6e63',
    'nodeId': 'i4q72m3jebgjdd076x0ifm7b8',
    'volumeName': '639b41b4e81d22b4bb8402d4_kowalikk_5Tb5Yb',
    'imageId': ObjectId('639b40ecd8d2b79b6089fd4c'),
    'name': 'tmp-nysubzrklvpq'}},
  'ip': '10.255.0.2',
  'userId': ObjectId('639b41ade81d22b4bb8402d2'),
  'when': datetime.datetime(2022, 12, 15, 16, 1, 16, 146000)
}
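With `containerInfo.imageId` in the record, removals can be grouped per image, which touches on the earlier question about tracking specific environments. The sketch below is an assumption-laden illustration: `removals_per_image` is a hypothetical helper, and the image IDs are written as plain strings rather than `ObjectId` values so that it runs without `bson`.

```python
from collections import Counter


def removals_per_image(records):
    """Count instance.remove records per containerInfo.imageId."""
    counts = Counter()
    for rec in records:
        if rec.get("type") != "instance.remove":
            continue
        info = rec.get("details", {}).get("containerInfo") or {}
        # Records lacking containerInfo end up under the key None.
        counts[info.get("imageId")] += 1
    return counts


# Stand-in records; IDs are shortened strings, not real ObjectIds.
sample = [
    {"type": "instance.remove",
     "details": {"containerInfo": {"imageId": "image-a"}}},
    {"type": "instance.remove",
     "details": {"containerInfo": {"imageId": "image-a"}}},
    {"type": "instance.remove",
     "details": {"containerInfo": {"imageId": "image-b"}}},
    {"type": "instance.remove", "details": {}},  # older record, no containerInfo
]

print(removals_per_image(sample))
```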