ambianic / fall-detection

Python ML library for people fall detection
Apache License 2.0

feat: retrain fall detect model with TFLite Model Maker and more local data #35

Open · bhavikapanara opened 2 years ago

bhavikapanara commented 2 years ago

TFLite Model Maker Training Notebook for fall detection: Link

TFLite Model Maker inference Notebook for fall detection: Link

Fall detect on top of person detect notebook: Link


Fall detect Base for transfer learning: Link

On-device training model: Link
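
For reference, a minimal sketch of the Model Maker training step used in notebooks like these, assuming a local `fall_dataset/` folder with one subfolder per class (the folder name is a hypothetical placeholder, not taken from the notebooks):

```python
# Sketch: TFLite Model Maker image-classifier training for fall detection.
# Assumes fall_dataset/fall and fall_dataset/no-fall subfolders (hypothetical).
from tflite_model_maker import image_classifier
from tflite_model_maker.image_classifier import DataLoader

data = DataLoader.from_folder('fall_dataset/')
train_data, test_data = data.split(0.9)  # 90/10 train/test split

# EfficientNet-Lite is the base model discussed later in this thread.
model = image_classifier.create(train_data, model_spec='efficientnet_lite0')

loss, accuracy = model.evaluate(test_data)
model.export(export_dir='.')  # writes model.tflite to the current directory
```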

commit-lint[bot] commented 2 years ago

Features

Contributors

bhavikapanara

Commit-Lint commands
You can trigger Commit-Lint actions by commenting on this PR: - `@Commit-Lint merge patch` will merge dependabot PR on "patch" versions (X.X.Y - Y change) - `@Commit-Lint merge minor` will merge dependabot PR on "minor" versions (X.Y.Y - Y change) - `@Commit-Lint merge major` will merge dependabot PR on "major" versions (Y.Y.Y - Y change) - `@Commit-Lint merge disable` will desactivate merge dependabot PR - `@Commit-Lint review` will approve dependabot PR - `@Commit-Lint stop review` will stop approve dependabot PR
review-notebook-app[bot] commented 2 years ago

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.



ivelin commented 2 years ago

Model Training Notebook: Link

Model inference Notebook: Link

@bhavikapanara please do not submit PRs for review that have merge conflicts or failing checks. Also, please use the PR template with a detailed description of the PR's purpose, so that anyone on the team who looks at it can understand what it's about, especially reviewers.

Thank you!

ivelin commented 2 years ago

Link

@bhavikapanara This notebook should include several steps of on-device training to provide a baseline for efficacy.

  1. Initial classification model: efficientnet-lite.tflite
  2. Base fall detection model trained on public data. Let's call it base-fall-detect-model.tflite
  3. Custom on-device transfer-learned model applying local data samples (John's home) to the base model from (2): john-custom-transfer-learned-fall-detect-model.tflite
  4. Compare results, with confidence scores and full confusion matrices, between (2) and (3) on John's home data (a comparison sketch follows below).
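
A minimal sketch of what step 4 could look like, assuming both classifiers take the same input size and John's home data is available as NumPy arrays; the model file names are the ones proposed above, everything else (dummy eval arrays, input shape) is a placeholder:

```python
# Sketch: compare two TFLite fall classifiers with full confusion matrices.
# The eval arrays below are dummies; swap in John's labeled home frames.
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

def predict_all(model_path, images):
    """Return the argmax class id for each image from a .tflite classifier."""
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    preds = []
    for img in images:
        interpreter.set_tensor(inp['index'],
                               img[np.newaxis, ...].astype(inp['dtype']))
        interpreter.invoke()
        preds.append(int(np.argmax(interpreter.get_tensor(out['index'])[0])))
    return preds

# Placeholder eval set: (N, 224, 224, 3) frames, (N,) labels, 0=no-fall 1=fall.
images = np.random.rand(8, 224, 224, 3).astype(np.float32)
labels = np.random.randint(0, 2, size=8)

for path in ['base-fall-detect-model.tflite',
             'john-custom-transfer-learned-fall-detect-model.tflite']:
    print(path, '\n', confusion_matrix(labels, predict_all(path, images)))
```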

The goal here is to determine:

  1. If the chosen classification transfer learning approach is effective in adapting to local data.
  2. How much it costs in terms of labeled samples for the transfer learning to be effective. In other words, does John need to fall 10, 20, or 500 times before the model starts accurately recognizing falls in his home?

Making sense?

As we get a handle on this baseline, we also need to think about how to let John provide labeling feedback to the model with minimum effort. If a positive fall was not detected correctly, how can John become aware of that and help by labeling the classification correctly? Currently the ambianic edge logic notifies when there is a positive detection above a certain confidence threshold. How do we design the UX to be effective for training?

Some possible options:

  1. Show an Advanced Training Mode button in the UI Settings that, for some period of time (5 minutes? 1 hour?), shows low-confidence detections in the app timeline which the user can label correctly.
  2. An Advanced Training Mode button that turns on recording and displays in the timeline, for 60 seconds, every captured camera frame at 1 fps (once per second), and then lets the user review the 60 frames and label them. Presumably they would be able to fall once or twice to generate enough data in the 60 frames. A capture sketch for this option follows below.
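
A minimal sketch of the option-2 capture loop, assuming OpenCV and a locally attached camera; the camera index, output folder, and session length are all hypothetical choices:

```python
# Sketch: record 60 frames at 1 fps for later user labeling (option 2 above).
import time
from pathlib import Path

import cv2

out_dir = Path('training-session')  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

cap = cv2.VideoCapture(0)  # default camera
try:
    for i in range(60):  # 60 seconds at roughly 1 frame per second
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(str(out_dir / f'frame-{i:02d}.jpg'), frame)
        time.sleep(1.0)
finally:
    cap.release()
# The user would then review training-session/ in the UI and label each frame.
```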

Opened a related issue in the ui repo.

Thoughts?

ivelin commented 2 years ago

@bhavikapanara another issue we need to address that I already mentioned in the slack space.

I noticed one more thing about the current Model Maker model. Its dataset consists exclusively of images that (1) include people and (2) show people taking up a substantial portion of the image.

This means the model will most likely perform poorly on images where people are not present at all, or take up only a small part of the image.

I see two options here:

  1. Add an extra 'no-person' category, so the classifier learns to put any image without a person present into that third category.
  2. Put the fall/no-fall model in a pipeline after a person detection model. We do this already with face detection and it’s also the approach pose detection takes.

I am leaning towards the latter for two main reasons: on-device person detectors are already quite good, and if we let the classifier focus on a simpler problem, it is more likely to learn to be accurate faster. A pipeline sketch follows below.
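
A minimal sketch of what option 2's composition could look like, assuming an SSD-style TFLite person detector with the common [boxes, classes, scores, count] output layout; both model file names and the input-resizing details are hypothetical:

```python
# Sketch: two-stage pipeline, person detection -> crop -> fall classification.
# Model file names are placeholders; the detector output layout is assumed to
# be the common TFLite detection postprocess order [boxes, classes, scores, n].
import numpy as np
import tensorflow as tf

def load(model_path):
    interp = tf.lite.Interpreter(model_path=model_path)
    interp.allocate_tensors()
    return interp

person_det = load('person-detect.tflite')  # hypothetical detector model
fall_clf = load('fall-detect.tflite')      # hypothetical classifier model

def run(interp, image):
    """Resize the image to the model's input shape and return all outputs."""
    inp = interp.get_input_details()[0]
    h, w = inp['shape'][1], inp['shape'][2]
    resized = tf.image.resize(image, (h, w)).numpy().astype(inp['dtype'])
    interp.set_tensor(inp['index'], resized[np.newaxis, ...])
    interp.invoke()
    return [interp.get_tensor(o['index'])[0]
            for o in interp.get_output_details()]

def detect_falls(frame, person_score_min=0.5):
    boxes, classes, scores, _ = run(person_det, frame)
    results = []
    for box, cls, score in zip(boxes, classes, scores):
        if score < person_score_min or int(cls) != 0:  # class 0 == person
            continue
        y0, x0, y1, x1 = box  # normalized [ymin, xmin, ymax, xmax]
        H, W = frame.shape[:2]
        crop = frame[int(y0 * H):int(y1 * H), int(x0 * W):int(x1 * W)]
        fall_scores = run(fall_clf, crop)[0]
        results.append((int(np.argmax(fall_scores)), float(np.max(fall_scores))))
    return results
```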

Thoughts?

bhavikapanara commented 2 years ago

> @bhavikapanara What exactly have you addressed from the comments I made? I don't see clear answers to my questions. In slack, I addressed the person detection function. It does not belong in the fall detection model implementation. Just in the model card.

I don't understand what you are trying to say.

As you requested, I have executed a person detection model, then cropped each person and ran the fall-detect model in this notebook.

bhavikapanara commented 2 years ago

> Link
>
> @bhavikapanara This notebook should include several steps of on-device training to provide a baseline for efficacy.
>
>   1. Initial classification model: efficientnet-lite.tflite
>   2. Base fall detection model trained on public data. Let's call it base-fall-detect-model.tflite
>   3. Custom on-device transfer-learned model applying local data samples (John's home) to the base model from (2): john-custom-transfer-learned-fall-detect-model.tflite
>   4. Compare results, with confidence scores and full confusion matrices, between (2) and (3) on John's home data.
>
> The goal here is to determine:
>
>   1. If the chosen classification transfer learning approach is effective in adapting to local data.
>   2. How much it costs in terms of labeled samples for the transfer learning to be effective. In other words, does John need to fall 10, 20, or 500 times before the model starts accurately recognizing falls in his home?
>
> Making sense?
>
> As we get a handle on this baseline, we also need to think about how to let John provide labeling feedback to the model with minimum effort. If a positive fall was not detected correctly, how can John become aware of that and help by labeling the classification correctly? Currently the ambianic edge logic notifies when there is a positive detection above a certain confidence threshold. How do we design the UX to be effective for training?
>
> Some possible options:
>
>   1. Show an Advanced Training Mode button in the UI Settings that, for some period of time (5 minutes? 1 hour?), shows low-confidence detections in the app timeline which the user can label correctly.
>   2. An Advanced Training Mode button that turns on recording and displays in the timeline, for 60 seconds, every captured camera frame at 1 fps (once per second), and then lets the user review the 60 frames and label them. Presumably they would be able to fall once or twice to generate enough data in the 60 frames.
>
> Opened a related issue in the ui repo.
>
> Thoughts?

Regarding the on-device training in this comment: how can I get the labelled data? I mean, I need the location of the dataset that the user labels while giving feedback on the base fall-detect model's predictions.

ivelin commented 2 years ago

> > @bhavikapanara What exactly have you addressed from the comments I made? I don't see clear answers to my questions. In slack, I addressed the person detection function. It does not belong in the fall detection model implementation. Just in the model card.
>
> I don't understand what you are trying to say.
>
> As you requested, I have executed a person detection model, then cropped each person and ran the fall-detect model in this notebook.

Look through all pending comments in this PR and try to address each one separately.

I also posted this comment in slack on Oct 14:

> OK, I appreciate the effort, but I think the fall detection model should focus on fall classification only and not do person cropping. Instead, its model card should clarify that it expects images with a substantial portion showing a person, just like the movenet model card does. Leave the model composition to the pipeline engine in Ambianic Edge Core. So I suggest as follow-up steps: add a Model Card section to the fall detection repo's README.md, focusing on the fall classification points we've already discussed and that I outlined in my comments on your PR draft.

bhavikapanara commented 2 years ago

@ivelin I have performed on-device model training using TFLite Model Maker, both with efficientnet_lite as the base model and with a custom-trained fall-detect model built via a transfer learning approach.

Approach 1: Base model: Default efficientnet_lite

See this notebook: Link - on-device model training with efficientnet_lite as the base model. This method achieves 82% accuracy.


Approach 2: Base model: Custom-trained fall-detect model using transfer learning (Notebook)

See this notebook: Link - on-device model training with the custom fall-detect base model. This method achieves 57% accuracy.
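
For side-by-side comparison, a minimal sketch of how the two approaches differ at the Model Maker level, assuming the custom fall-detect base has been exported as a feature-vector SavedModel that `ModelSpec(uri=...)` can load; both paths are hypothetical:

```python
# Sketch: on-device training with two different base models via Model Maker.
# 'fall_dataset/' and 'custom-fall-detect-base/' are hypothetical paths.
from tflite_model_maker import image_classifier
from tflite_model_maker.image_classifier import DataLoader

data = DataLoader.from_folder('fall_dataset/')
train_data, test_data = data.split(0.9)

# Approach 1: stock EfficientNet-Lite base (82% accuracy above).
model_a = image_classifier.create(train_data, model_spec='efficientnet_lite0')

# Approach 2: custom fall-detect base loaded as a feature-vector model
# (57% accuracy above); assumes the base was exported in a format that
# tensorflow_hub can load from the given URI/path.
custom_spec = image_classifier.ModelSpec(uri='custom-fall-detect-base/')
model_b = image_classifier.create(train_data, model_spec=custom_spec)

for name, m in (('efficientnet_lite base', model_a), ('custom base', model_b)):
    loss, acc = m.evaluate(test_data)
    print(f'{name}: accuracy={acc:.2%}')
```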