Dasfaust closed this pull request 1 year ago.
Hi @Dasfaust!
Thank you for your pull request and welcome to our community.
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.
If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
To clarify: is devices/webcam/test/thumbs_up.png copyright-free? And can you please add the source of this file?
Looks like it's for personal use only. I got it from here. I'll remove it in the next commit.
If you can include a demo video link or a gif image, that would be useful for users to understand the functionality.
Please also add GitHub Actions support; reference: https://github.com/facebookresearch/labgraph/actions/workflows/main.yml
We have committed quite a few changes, notable additions are multiple camera support, multiple MediaPipe solution support, and integration with LabGraph's HDF5 logger. The PR's description has been updated with a better overview, a GIF preview, a Jupyter Notebook example, and other usage details/thoughts.
One issue I've noticed, though, is replaying large (500 MB+) log files using HDF5Reader: the replay process crashes with just [Cthulhu][ERROR]. Tested on Windows with Python 3.9.13 and LabGraph 2.0.0. If we're not doing anything wrong, could this be a memory constraint? If so, one solution might be to create an incremental reader instead of pulling the entire log into memory at once.
Please also add GitHub Actions support; reference: https://github.com/facebookresearch/labgraph/actions/workflows/main.yml
What kind of actions should be added? Should we use the ATI force touch entries as a template?
For the example demo image attached in this PR, how many cameras are used? When supporting 4 video streams, do you know how time synchronization works across the 4 streams?
One issue I've noticed, though, is replaying large (500 MB+) log files using HDF5Reader: the replay process crashes with just [Cthulhu][ERROR]. Tested on Windows with Python 3.9.13 and LabGraph 2.0.0. If we're not doing anything wrong, could this be a memory constraint? If so, one solution might be to create an incremental reader instead of pulling the entire log into memory at once.
The replay function is relatively new for LabGraph; an incremental reader may be needed in this case.
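For reference, a minimal sketch of what incremental reading could look like with h5py, assuming the log stores each stream as a dataset; the dataset path and chunk size here are illustrative, not LabGraph's actual layout:

```python
import h5py

def iter_log(path: str, dataset: str, chunk_size: int = 1024):
    """Yield fixed-size slices of a logged dataset without loading it all at once."""
    with h5py.File(path, "r") as log:
        data = log[dataset]  # h5py reads slices from disk on demand
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]

# Example: walk a (hypothetical) stream dataset chunk by chunk
for chunk in iter_log("logging_example.h5", "streams/camera_0"):
    print(len(chunk))  # replace with actual replay logic
```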
Please also add GitHub Actions support; reference: https://github.com/facebookresearch/labgraph/actions/workflows/main.yml
What kind of actions should be added? Should we use the ATI force touch entries as a template?
You can refer to LabGraph Monitor as an example, link.
To control the size of the extension, can you share a link rather than uploading the .h5 file to the repo: devices/webcam/logs/logging_example.h5
For the example demo image attached in this PR, how many cameras are used? When supporting 4 video streams, do you know how time synchronization works across the 4 streams?
Just one camera in the demo.
There is no coupling; each stream is run independently in its own process. Results are logged/displayed as soon as they become available.
The replay function is relatively new for LabGraph; an incremental reader may be needed in this case.
Excellent, we'll see what we can do.
You can refer to LabGraph Monitor as an example, link.
Okay, thanks. Will do once tests are re-introduced, working on that now.
To control the size of the extension, can you share a link rather than uploading the .h5 file to the repo: devices/webcam/logs/logging_example.h5
I have moved logging_example.h5 and preview.gif (it was also big) out of the repository.
Please see my comments below in bold:
For the example demo image attached in this PR, how many cameras are used? When supporting 4 video streams, do you know how time synchronization works across the 4 streams?
Just one camera in the demo.
There is no coupling; each stream is run independently in its own process. Results are logged/displayed as soon as they become available.
In this case, strictly speaking, multi-camera synchronization is not yet supported, as each stream runs independently. Please mention this limitation in the README.
@jfResearchEng
The annotation source from the meeting has been uploaded here.
Hi Dasfaust,
Thanks for the update. Do you also happen to have a link to the annotated video based on the video I shared earlier? And have you also shared the code generating the offline annotated video somewhere?
Thanks, jfResearchEng
Description
PoseVis
PoseVis is a LabGraph extension that streams any number of video sources and generates pose landmark data from MediaPipe for each stream independently. MediaPipe Hands, Face Mesh, Pose, and Holistic solutions are supported. PoseVis supports data logging and replaying via the HDF5 format. See Using PoseVis for details.
Usage preview
PoseVis can also support other image processing tasks through its extension system. Take a look at the hands extension for an example.
Installation
PoseVis uses OpenCV to handle video streams. Out of the box, PoseVis streams camera, video file, and image directory sources from the MSMF backend in Windows and the V4L2 backend in Linux, with the MJPEG format (see OpenCV backends here). This configuration should be supported by most UVC devices. Further source stream customization can be achieved by installing GStreamer; steps are detailed below.
PoseVis General Setup
Requires Python 3.8 or later. Run setup.py to install the required packages from PyPI. See Using PoseVis for usage details.
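The install command itself was stripped from this description; for a standard setup.py-based project it would be along these lines:

```
python setup.py install
```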
GStreamer Support (Optional)
GStreamer is a multimedia framework that allows you to create your own media pipelines with a simple string input. If you need more flexibility than a simple MJPEG stream, you can install GStreamer using the steps below.
Example GStreamer Configurations
PoseVis expects color formats from GStreamer to be in the BGR color space, and OpenCV requires the use of appsink.
Creating a test source: this configuration creates the videotestsrc element and configures a 720p @ 30Hz stream in BGR.
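The pipeline string was stripped during extraction; a standard GStreamer construction matching that description (not necessarily PoseVis's verbatim config) would be:

```python
import cv2

# videotestsrc at 1280x720 @ 30 Hz, raw BGR, delivered to OpenCV via appsink
pipeline = (
    "videotestsrc ! "
    "video/x-raw,format=BGR,width=1280,height=720,framerate=30/1 ! "
    "appsink"
)
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```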
Creating a device source in Linux: this configuration captures an MJPEG stream at 720p @ 30Hz from a V4L2 device and converts the image format into raw BGR.
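Again the original string was stripped; a sketch of such a pipeline, where the device path /dev/video0 is an assumption for illustration:

```python
# MJPEG capture from a V4L2 device, decoded and converted to raw BGR for appsink
pipeline = (
    "v4l2src device=/dev/video0 ! "
    "image/jpeg,width=1280,height=720,framerate=30/1 ! "
    "jpegdec ! videoconvert ! video/x-raw,format=BGR ! "
    "appsink"
)
```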
You can also specify per-camera configurations:
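The original snippet was stripped here; purely as a hypothetical illustration (not PoseVis's actual configuration API), per-camera configuration could map each device index to its own pipeline string:

```python
# Hypothetical shape only: one GStreamer pipeline per camera index
per_camera = {
    0: "videotestsrc ! video/x-raw,format=BGR,width=1280,height=720 ! appsink",
    1: "v4l2src device=/dev/video1 ! image/jpeg ! jpegdec ! videoconvert ! "
       "video/x-raw,format=BGR ! appsink",
}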
Windows GStreamer Support
Follow the Windows GStreamer guide.
Linux GStreamer Support
Follow the Linux GStreamer guide.
Performance
Performance is crucial for real-time applications. Check the benchmark notebook example for performance metrics, including details of the system used for benchmarking. You can also run the notebook on your system to get an idea of how PoseVis will perform.
Using PoseVis
Test PoseVis via Command Line
Check usage details:
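The command was stripped here; a hypothetical invocation, assuming the entry point follows the pose_vis package name:

```
python -m pose_vis --help
```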
Using PoseVis in Your Project
Check the usage guide for an in-depth overview of the concepts used in PoseVis and how to hook into its LabGraph topics.
PoseVis Usage Examples
GestureVis
GestureVis uses data from the MediaPipe hand and body pose extensions to guess the current gesture based on a list of known gestures and draws the appropriate annotations onto the video stream, both online and offline. Check out the hands version here and body pose version here.
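As a rough illustration of the idea only (not GestureVis's actual implementation; the template files, names, and threshold are invented for this sketch), gesture guessing can be framed as nearest-template matching over landmark coordinates:

```python
import numpy as np

# Hypothetical gesture templates: flattened landmark arrays saved ahead of time
KNOWN_GESTURES = {
    "thumbs_up": np.load("templates/thumbs_up.npy"),
    "open_palm": np.load("templates/open_palm.npy"),
}

def guess_gesture(landmarks: np.ndarray, threshold: float = 0.5) -> str:
    """Return the closest known gesture, or 'unknown' if nothing is near enough."""
    best_name, best_dist = "unknown", threshold
    for name, template in KNOWN_GESTURES.items():
        dist = np.linalg.norm(landmarks - template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```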
Logging Example
The logging example notebook shows a simple way to use HDF5 logging with PoseVis.
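If you just want to see what such a log contains before wiring up a graph, h5py can list its datasets directly (the filename here is the example log referenced above):

```python
import h5py

# Print every dataset in the log along with its shape
with h5py.File("logging_example.h5", "r") as log:
    log.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))
```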
Type of change
Please delete options that are not relevant.
Feature/Issue validation/testing
Please describe the tests [UT/IT] that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced. Please also list any relevant details for your test configuration.
python -m unittest pose_vis/test/tests.py
Runs the graph on a sample image and checks each extension's output. Example result:
Checklist: