ambianic / fall-detection

Python ML library for people fall detection
Apache License 2.0
83 stars 16 forks

feat: add MoveNet and PoseNet notebooks for experiment #17

Closed · bhavikapanara closed this 3 years ago

bhavikapanara commented 3 years ago

Here is the link to access the notebooks

commit-lint[bot] commented 3 years ago

Contributors

bhavikapanara

Commit-Lint commands
You can trigger Commit-Lint actions by commenting on this PR:

- `@Commit-Lint merge patch` will merge dependabot PR on "patch" versions (X.X.Y - Y change)
- `@Commit-Lint merge minor` will merge dependabot PR on "minor" versions (X.Y.Y - Y change)
- `@Commit-Lint merge major` will merge dependabot PR on "major" versions (Y.Y.Y - Y change)
- `@Commit-Lint merge disable` will deactivate merging of dependabot PRs
- `@Commit-Lint review` will approve dependabot PR
- `@Commit-Lint stop review` will stop approving dependabot PR
review-notebook-app[bot] commented 3 years ago

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

codecov[bot] commented 3 years ago

Codecov Report

Merging #17 (b84a92d) into main (8a053b9) will increase coverage by 3.22%. The diff coverage is 92.57%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #17      +/-   ##
==========================================
+ Coverage   89.06%   92.29%   +3.22%     
==========================================
  Files          10       14       +4     
  Lines         869     1272     +403     
==========================================
+ Hits          774     1174     +400     
- Misses         95       98       +3     
Impacted Files Coverage Δ
fall_prediction.py 100.00% <ø> (ø)
src/pipeline/inference.py 88.00% <ø> (-0.24%) :arrow_down:
tests/pipeline/test_fall_detect_posenet.py 100.00% <ø> (ø)
src/pipeline/pose_engine.py 65.65% <38.00%> (-23.41%) :arrow_down:
src/pipeline/pose_base.py 91.83% <91.83%> (ø)
src/pipeline/movenet_model.py 97.05% <97.05%> (ø)
src/pipeline/fall_detect.py 98.62% <100.00%> (+10.09%) :arrow_up:
src/pipeline/posenet_model.py 100.00% <100.00%> (ø)
tests/pipeline/test_fall_detect_movenet.py 100.00% <100.00%> (ø)
... and 5 more
ivelin commented 3 years ago

@bhavikapanara this is an improvement. It shows MoveNet is a superior model in most cases, although in some cases it incorrectly assigns keypoints to non-human objects.

Several items that still need work:

  1. I did not see in the test set the following images from this user reported issue. Please crop the photos and add to the test set. That will help us understand if movenet is more robust in terms of non-human object distractions: https://github.com/ambianic/fall-detection/issues/4

  2. Another set of tests we are also missing is multiple people in the same image. Since MoveNet is designed for scenes with one person, it would be good to know whether it focuses on one person in a scene with multiple people or instead it gets confused and scrambles keypoints across multiple people.

  3. Also, please add to the comparison matrix the keypoint labels and confidence scores for the drawn keypoints. It can be in a JSON format next to or under each image. That would help us understand whether movenet is able to detect with a higher confidence level than posenet.

Thank you!
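For illustration, the JSON-next-to-each-image request above could be sketched like this (the `keypoints_to_json` helper and the keypoint labels are hypothetical, not part of the repo's API):

```python
import json

def keypoints_to_json(keypoints):
    """Serialize drawn keypoints as {label: confidence} for display next to an image."""
    return json.dumps(
        {label: round(float(score), 3) for label, score in keypoints},
        indent=2,
        sort_keys=True,
    )

# Hypothetical keypoints a detector might draw on one test image.
example = [("nose", 0.91), ("left_shoulder", 0.84), ("right_shoulder", 0.79)]
print(keypoints_to_json(example))
```

Printing this blob under each image in the notebook would make the per-keypoint confidence comparison between the two models directly readable.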

bhavikapanara commented 3 years ago


@ivelin I have updated the notebook: I added multiple-person images to the test samples and also added a comparison of the two models' keypoint confidence scores.

Please review it and let me know if any changes are needed.

ivelin commented 3 years ago

@bhavikapanara now the notebook shows the information we can use to compare the models objectively.

However, there seems to be a bug in the keypoint scores shown. There are several cases where the posenet image has no keypoints drawn, yet the scores are all shown as 50%.

posenet         0.500000         0.500000    0.500000    0.500000

Please double check the score formulas and rendering. Thank you.
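One plausible cause of the uniform 50% scores (an assumption for illustration, not confirmed from the repo's code): an undetected keypoint whose raw logit is 0 yields exactly 0.5 after a sigmoid, so defaulted scores masquerade as real detections. A minimal guard is to threshold before reporting:

```python
CONFIDENCE_THRESHOLD = 0.5  # illustrative cutoff, not the project's actual setting

def visible_keypoints(scores, threshold=CONFIDENCE_THRESHOLD):
    """Keep only keypoints whose confidence strictly exceeds the threshold."""
    return {label: s for label, s in scores.items() if s > threshold}

# Undetected keypoints defaulting to exactly 0.5 (e.g. sigmoid of a zero logit)
# are filtered out instead of being reported as 50% detections.
raw = {"nose": 0.5, "left_hip": 0.5, "right_knee": 0.93}
print(visible_keypoints(raw))
```

With such a filter, an image with no drawn keypoints would report an empty score set rather than a row of 0.500000 values.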


bhavikapanara commented 3 years ago

@bhavikapanara please see comment about incorrect keypoint score values.

@ivelin I have fixed the bug. Can you please check it?

ivelin commented 3 years ago

@bhavikapanara some additional evidence that the posenet keypoint score calculation is off, either in the notebook or in the current production code. Note that score calculation should be done in one place (a single source of truth: the high-level fall-detection API), and the notebook should simply iterate over the available pose detection implementations without reimplementing low-level tensor arithmetic.

See the screenshots below, which place the originally reported posenet detections and their average confidence scores next to the notebook's scores; the two are inconsistent. Moreover, the notebook seems to draw keypoints and body lines at different coordinates than the originals.

[Screenshots: three originally reported posenet detection images, each paired with the notebook's rendering and scores, captured 2021-07-24]
bhavikapanara commented 3 years ago

@bhavikapanara this is a big improvement. A few inline comments plus:

  1. Let's add the inference time in the comparison matrix so we can see how the two models compare given different sample inputs.
  2. Please convert the sample iteration from the notebook matrix into a set of unit tests with matching keypoint scores for each AI model and test image. We will use this as a baseline to protect us from future regressions in the code for each AI model.

Thank you!

@ivelin The first point is completed; you can now see the inference time in the notebook. But I didn't understand the second point. Can you please explain what I need to do exactly?

ivelin commented 3 years ago


@bhavikapanara a couple of comments on the time measurement implementation.

  1. time.time() is not recommended for performance measurement: its granularity is not guaranteed and can be as coarse as one second, and it measures wall-clock time, so sleeps and other processes skew the numbers. Instead use time.perf_counter(), time.process_time(), or timeit.
  2. Your results suggest that movenet is about 2x slower than posenet (with mobilenetv1). That seems inconsistent with the observations you shared before on our Slack channel. Can you comment?
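As a sketch of the timing point above, `timeit` can drive the measurement (`run_inference` here is a hypothetical stand-in for a single model forward pass, not the repo's actual API):

```python
import timeit

def run_inference():
    # Hypothetical stand-in for one model forward pass.
    return sum(i * i for i in range(10_000))

# timeit uses time.perf_counter() under the hood: a monotonic clock with
# much finer granularity than time.time().
repeats = 10
total = timeit.timeit(run_inference, number=repeats)
print(f"mean time per run: {total / repeats * 1000:.3f} ms")
```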

Regarding the second request: it is just an ask to convert the notebook's iteration over image samples into unit tests for each of the models. Maybe you can put the image names and expected inference scores in a csv or yaml file that we can expand in the future as we receive additional samples from users. Having this dataset-driven test in our CI will protect the code from regressions.
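The CSV-driven regression test suggested above could look roughly like this (the file layout, column names, and the `fake_detect` call are all illustrative assumptions, not the project's real API):

```python
import csv
import io

# Inline sample standing in for a committed expected-scores CSV file;
# columns and values are illustrative.
SAMPLE_CSV = """model,image,expected_score
movenet,fall_img_1.png,0.87
posenet,fall_img_1.png,0.74
"""

def load_cases(text):
    """Parse (model, image, expected_score) tuples from CSV text."""
    return [(row["model"], row["image"], float(row["expected_score"]))
            for row in csv.DictReader(io.StringIO(text))]

def fake_detect(model, image):
    # Placeholder for the real per-model fall-detection inference call.
    return {"movenet": 0.87, "posenet": 0.74}[model]

def test_keypoint_scores_match_baseline():
    # Tolerance guards against tiny numeric drift while catching real regressions.
    for model, image, expected in load_cases(SAMPLE_CSV):
        assert abs(fake_detect(model, image) - expected) < 0.05

test_keypoint_scores_match_baseline()
```

New user-contributed samples would then only require appending rows to the CSV, with no test-code changes.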

bhavikapanara commented 3 years ago


Yes, the notebook experiments say movenet is about 2x slower than posenet (with mobilenetv1). I am also surprised by it. The MoveNet announcement also presents it as ultra fast. I don't know what the issue is.

ivelin commented 3 years ago


OK, good to know we are in sync and the numbers are accurate. The performance numbers in the virtual CI environment are a good relative comparison between models that we can keep track of. We will see how these numbers look on real edge devices.

Please proceed with addressing the other comments in this PR so we can merge.

bhavikapanara commented 3 years ago


I think all the comments have been resolved. @ivelin, please review and let me know if I missed anything.

github-actions[bot] commented 2 years ago

:tada: This PR is included in version 1.0.0 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket: