Hi Adarsh,
Thank you! I will do it.
Best wishes, Runnan Zhou
On Apr 19, 2020, at 11:53 AM, Adarsh Pyarelal notifications@github.com wrote:
@runnanzhou - you might have done parts of this process already, but I wanted to document the task in detail for you; it might help. Apologies, I had meant to do this earlier, but am just getting around to it now.
We need to develop an executable that can process live webcam video (or video files or images from disk) and output facial action units, automatically detected using OpenFace, in JSON format.
Here are the implementation steps. To start with, we will implement the 'static' method, which requires just the individual frames rather than a video on disk or in memory (the dynamic method is more accurate, since it performs AU normalization, but is more complicated to implement).
Update src/WebcamSensor.cpp and src/WebcamSensor.h to prefix the included OpenFace headers with OpenFace/, i.e., change the following line:
#include "GazeEstimation.h"
to
#include <OpenFace/GazeEstimation.h>
Do the same substitution for LandmarkCoreIncludes.h, SequenceCapture.h, VisualizationUtils.h, and Visualizer.h in src/WebcamSensor*.cpp.
Create a file called src/AUSensor.cpp, and create an int main(...) function in it.
Add a line at the top of src/AUSensor.cpp to include the WebcamSensor.h header file. Inside the main function, add a line that creates an object that is an instance of the WebcamSensor class.
Modify the arguments attribute in WebcamSensor.h, adding the string -au_static to the vector.
Create a while loop in the main function that keeps calling the get_observation method of the WebcamSensor class until the SIGTERM signal is received, at which point the loop should exit (see the sketch after these steps).
Add the line add_subdirectory(external/OpenFace) at some point before the call to add_subdirectory(src) in tomcat/CMakeLists.txt
Add the line add_executable(ausensor AUSensor.cpp) in src/CMakeLists.txt.
Immediately after that line, add the line
target_link_libraries(ausensor PUBLIC OpenFace)
You might need to add the line
target_include_directories(ausensor PRIVATE ${openface_include_dirs})
as well (note that target_include_directories requires a scope keyword such as PRIVATE).
Navigate to the build/ directory in the tomcat root directory and execute:
$ cmake ..
$ make -j ausensor
Test the functionality of the executable by running ./bin/ausensor (from within the build directory). You should see face landmarks being tracked, as well as the gaze.
At this point, you will have recreated the original functionality of the WebcamSensor class when it was integrated into the runExperiment executable. Once you have verified this, modify src/WebcamSensor.cpp to get the action units by looking at https://github.com/TadasBaltrusaitis/OpenFace/blob/ad1b3cc45ca05c762b87356c18ad030fcf0f746e/exe/FeatureExtraction/FeatureExtraction.cpp#L190-L258 and adding the appropriate portions of that code to src/WebcamSensor.cpp to perform AU extraction.
Once you have a minimal setup going, modify the program to output the AUs to the standard output in JSON format. You can use the nlohmann-json library for this (see src/Mission.cpp for an example of usage). The relevant attributes of the RecorderOpenFace class to work with are au_intensities and au_occurrences: https://github.com/ml4ai/tomcat/blob/08115c44d42ad552a2fac08d1f566a785a778c66/external/OpenFace/lib/Utilities/include/RecorderOpenFace.h#L189-L190. Start with a simple JSON format, then we can see how to massage it into the form that the TA3 testbed expects.
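For reference, here is a minimal sketch of what src/AUSensor.cpp could look like, assuming WebcamSensor is default-constructible and exposes the get_observation method described above (the handler and flag names here are illustrative, not prescribed):

#include <atomic>
#include <csignal>
#include "WebcamSensor.h"

// Set to false by the SIGTERM handler to request a clean shutdown.
static std::atomic<bool> keep_running{true};

static void handle_sigterm(int) { keep_running = false; }

int main(int argc, char* argv[]) {
    std::signal(SIGTERM, handle_sigterm);
    WebcamSensor sensor;
    // Keep polling observations until SIGTERM arrives.
    while (keep_running) {
        sensor.get_observation();
    }
    return 0;
}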
@shreeyajain Based on our conversation with the ASU folks today, I think these are the action items we have.
- Rename the ausensor executable (and, by extension, AUSensor.cpp) to something more generic. Some suggestions: ofsensor, openface_sensor, face_sensor (but feel free to use another name if you can think of something catchy!)
- Update src/cpp/webcam/README.md and add the instructions to compile and run it on Windows.
- In order to incorporate new information and make the output format more compatible with the TA3 testbed, we should change the output format. I propose the format shown in the example below.
{
"header": {
"timestamp": "2019-12-26T12:47:23.1234Z",
"message_type": "observation",
"version": "0.1"
},
"msg": {
"experiment_id": "563e4567-e89b-12d3-a456-426655440000",
"trial_id": "123e4567-e89b-12d3-a456-426655440000",
"timestamp": "2019-12-26T14:05:02.1412Z",
"source": "ofsensor",
"sub_type": "state",
"version": "0.1"
},
"data": {
"playername": "Aptiminer1",
"landmark_detection_confidence": 0.94848,
"landmark_detection_success": true,
"frame": 1,
"action_units": {
"AU04": {
"intensity": 0.7257351100546178,
"occurrence" : 1.0
},
"AU05": {
"intensity": 0.35402671613589914,
"occurrence" : 0.0
},
...
},
"gaze": {
"eye_0" : {
"x": 0.10917,
"y": 0.147619,
"z": -0.983001
},
"eye_1": {
"x": -0.166114,
"y": 0.136956,
"z": -0.97655
},
"gaze_angle": {
"x": ...,
"y": ...
}
}
}
}
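As a rough sketch of how a message in this format could be assembled with nlohmann-json: this assumes a FaceAnalysis::FaceAnalyser named face_analyser that has already processed the current frame, that its header is exposed as OpenFace/FaceAnalyser.h per the prefixing convention above, and that GetCurrentAUsReg()/GetCurrentAUsClass() are used as the sources of intensities and occurrences (as in FeatureExtraction.cpp):

#include <string>
#include <nlohmann/json.hpp>
#include <OpenFace/FaceAnalyser.h>

using json = nlohmann::json;

// Sketch: assemble one observation message from the current AU predictions.
json build_message(FaceAnalysis::FaceAnalyser& face_analyser,
                   const std::string& timestamp) {
    json message;
    message["header"] = {{"timestamp", timestamp},
                         {"message_type", "observation"},
                         {"version", "0.1"}};
    message["msg"] = {{"experiment_id", nullptr}, // null unless --exp_id is set
                      {"trial_id", nullptr},
                      {"timestamp", timestamp},
                      {"source", "ofsensor"},
                      {"sub_type", "state"},
                      {"version", "0.1"}};
    // Intensities come from the AU regression models, occurrences from the
    // classification models; both return (AU name, value) pairs.
    json aus;
    for (const auto& [name, intensity] : face_analyser.GetCurrentAUsReg()) {
        aus[name]["intensity"] = intensity;
    }
    for (const auto& [name, occurrence] : face_analyser.GetCurrentAUsClass()) {
        aus[name]["occurrence"] = occurrence;
    }
    message["data"]["action_units"] = aus;
    return message;
}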
Notes:
- experiment_id, trial_id, and playername should be set to null unless they are specified with command line options (see the next section).
- ofsensor is a placeholder - if you rename the executable to something else, that name would go in this field.
- For generating the timestamps, the microsec_clock::universal_time() function from the Boost.Date_Time library might be the way to go.
- For now, the timestamp in the header object can be the same as the timestamp in the msg object. We may need to change this later.
- Descriptions of landmark_detection_confidence and landmark_detection_success can be found here: https://github.com/TadasBaltrusaitis/OpenFace/wiki/Output-Format (in their output format, the corresponding names are confidence and success).
- See tomcat/external/OpenFace/lib/Utilities/src/RecorderCSV.cpp for hints on how to extract the landmark_detection_success, landmark_detection_confidence, and gaze_angle numbers.

Command line options:
- --mloc: specifies the location of the models. There should also be fallback behavior if the user doesn't set the --mloc flag: the program should first check the OPENFACE_MODELS_DIR environment variable to see if it is non-empty, and if so, use that as the model directory. If neither the environment variable nor the --mloc flag is set, the program should throw a runtime exception telling the user to either use the --mloc flag or the environment variable to point the program to the directory containing the models.
- --exp_id: If this flag is set, then the experiment_id key in the JSON output will be set to the value provided.
- --trial_id: If this flag is set, then the trial_id key in the JSON output will be set to the value provided.
- --playername: If this flag is set, then the playername key in the JSON output will be set to the value provided.

From Federico: It might be useful to have eye_lmk and pose data as well, to figure out which quadrant of the screen participants are looking at.
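For the timestamp note above, a small sketch using Boost.Date_Time (to_iso_extended_string gives microsecond precision; the trailing "Z" is appended manually, since universal_time() is already UTC):

#include <string>
#include <boost/date_time/posix_time/posix_time.hpp>

// Returns the current UTC time in ISO 8601 format with microseconds,
// e.g. "2019-12-26T12:47:23.123456Z".
std::string current_timestamp() {
    using namespace boost::posix_time;
    ptime now = microsec_clock::universal_time();
    return to_iso_extended_string(now) + "Z";
}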
@shreeyajain - another thing that just occurred to me - we'll need to add a command line option -f that will enable reading and processing frames from a file instead of a webcam, since the tool will not be run during the experimental trial, but rather will be used on postprocessed video data (i.e. the cropped recording of the Zoom session).
@adarshp - I looked into the documentation for OpenFace and I believe we can just give the filename as a command line argument with the -f option. I will have to include an outer while loop to detect these arguments, defaulting to the webcam in their absence.
You don't need to include a while loop - you can just use the Boost program options library to implement the command line option parsing (with defaults).
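A sketch of what that parsing could look like with Boost.Program_options, including the OPENFACE_MODELS_DIR fallback for --mloc described in the notes above (the surrounding program structure is illustrative):

#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>
#include <boost/program_options.hpp>

namespace po = boost::program_options;

int main(int argc, char* argv[]) {
    po::options_description desc("Allowed options");
    desc.add_options()
        ("help,h", "produce help message")
        ("exp_id", po::value<std::string>()->default_value("null"), "set experiment ID")
        ("trial_id", po::value<std::string>()->default_value("null"), "set trial ID")
        ("playername", po::value<std::string>()->default_value("null"), "set playername")
        ("mloc", po::value<std::string>(), "set OpenFace models directory")
        ("file,f", po::value<std::string>(), "input video file (defaults to webcam if unset)");

    po::variables_map vm;
    po::store(po::parse_command_line(argc, argv, desc), vm);
    po::notify(vm);

    if (vm.count("help")) {
        std::cout << desc << std::endl;
        return 0;
    }

    // Model directory: the --mloc flag takes precedence, then the
    // OPENFACE_MODELS_DIR environment variable; otherwise, error out.
    std::string models_dir;
    const char* env = std::getenv("OPENFACE_MODELS_DIR");
    if (vm.count("mloc")) {
        models_dir = vm["mloc"].as<std::string>();
    }
    else if (env != nullptr && *env != '\0') {
        models_dir = env;
    }
    else {
        throw std::runtime_error(
            "Use the --mloc flag or the OPENFACE_MODELS_DIR environment "
            "variable to point the program to the models directory.");
    }
    // ... pass models_dir and the input file (if any) to the sensor ...
    return 0;
}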
Renaming the executable
The executable has been renamed from ausensor to facesensor.
You can now navigate to the build/ directory and execute:
$ cmake ..
$ make -j facesensor
$ ./bin/facesensor
I have updated the output format to incorporate eye_lmk and pose:
{
"header": {
"timestamp": "2020-07-16T05:06:56.965755Z",
"message_type": "observation",
"version": "0.1"
},
"msg": {
"experiment_id": "563e4567-e89b-12d3-a456-426655440000",
"trial_id": "123e4567-e89b-12d3-a456-426655440000",
"timestamp": "2020-07-16T05:06:56.965755Z",
"source": "facesensor",
"sub_type": "state",
"version": "0.1"
},
"data": {
"playername": "shreeya08",
"landmark_detection_confidence": "0.97500",
"landmark_detection_success": true,
"frame": 8,
"action_units": {
"AU01": {
"occurrence": 0.0,
"intensity": 0.4174251010327605
},
"AU02": {
"occurrence": 0.0,
"intensity": 0.06606532441180364
},
...
},
"gaze": {
"eye_0": {
"x": -0.042032960802316666,
"y": -0.037290651351213455,
"z": -0.9984200596809387
},
"eye_1": {
"x": -0.28871601819992065,
"y": 0.045460283756256104,
"z": -0.9563348889350891
},
"gaze_angle": {
"x": -0.16761472821235657,
"y": 0.0041793398559093475
},
"eye_lmk2d": {
"x_0": 289.2476806640625,
"x_1": 291.44573974609375,
...
"x_55": 374.8895263671875,
"y_0": 392.86376953125,
"y_1": 386.2491760253906,
...
"y_55": 388.72772216796875
},
"eye_lmk3d": {
"X_0": -20.11672019958496,
"X_1": -18.630495071411133,
...
"X_55": 35.71424102783203,
"Y_0": 99.99628448486328,
"Y_1": 95.42163848876953,
...
"Y_55": 96.77069091796875,
"Z_0": 327.07647705078125,
"Z_1": 326.22967529296875,
...
"Z_55": 325.328369140625
}
},
"pose": {
"Tx": 18.100841522216797,
"Ty": 156.148193359375,
"Tz": 388.6546630859375,
"Rx": -0.09204348921775818,
"Ry": 0.10995744913816452,
"Rz": -0.068435437977314
}
}
}
@adarshp Do you have any suggested changes or is this okay? Additionally, should I set the precision (refer to https://github.com/TadasBaltrusaitis/OpenFace/blob/658a6a1cc2028f034c8f29233a01ddc3f9fd6672/lib/local/Utilities/src/RecorderCSV.cpp)?
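For reference, the pose block above can be filled from OpenFace's 6-DOF head pose estimate. A sketch, assuming LandmarkDetector::GetPose with its usual signature (returning a cv::Vec6d of translation followed by rotation), an existing face_model, camera intrinsics fx, fy, cx, cy as in FeatureExtraction.cpp, and the message json object from the earlier sketch:

// Map OpenFace's head pose estimate into the JSON "pose" object.
cv::Vec6d pose = LandmarkDetector::GetPose(face_model, fx, fy, cx, cy);
json j_pose = {{"Tx", pose[0]}, {"Ty", pose[1]}, {"Tz", pose[2]},
               {"Rx", pose[3]}, {"Ry", pose[4]}, {"Rz", pose[5]}};
message["data"]["pose"] = j_pose;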
Command line options added
Allowed options:
-h [ --help ] produce help message
--exp_id arg (=null) set experiment ID
--trial_id arg (=null) set trial ID
--playername arg (=null) set playername
--mloc arg set OpenFace models directory
--indent arg (=0) set indentation (true/false)
-f [ --file ] arg (=null) specify the input video file
This looks great! Don't worry about the precision - people can reduce the precision downstream if they want.
I just realized, if we are going to try piping the output of facesensor into mosquitto_pub, we'll need to make each message a single line (i.e. calling dump() instead of dump(4) in WebcamSensor.cpp). However, it is nice to be able to get indented output for debugging. Can you also add a boolean command line flag --indent (that defaults to false) that controls whether the output is indented or not?
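Something along these lines should do it (message being the nlohmann::json object, indent the new flag):

// Single-line output for piping into mosquitto_pub; indented for debugging.
std::cout << (indent ? message.dump(4) : message.dump()) << std::endl;

The single-line form can then be piped directly, e.g. ./bin/facesensor | mosquitto_pub -l -t <topic>, since the -l flag makes mosquitto_pub publish one message per line read from stdin.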
Sure, I'll do that!
When we process frames from a file, we might not need visualization. Should we make visualization optional as well?
Yes, we should make it optional.
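For example, a boolean flag (hypothetical name --visualize, defaulting to false for file input) could gate the call to OpenFace's visualizer:

// Only open the visualization window when requested.
if (visualize) {
    visualizer.ShowObservation();
}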
Closed by #187.