TadasBaltrusaitis / OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

Is there a planned release date for the Messaging Server? #492

Open SilentSin opened 6 years ago

SilentSin commented 6 years ago

I'm looking to create a system that analyses the Facial Action Units from a webcam feed and applies them to a 3D character model in a Unity application in real-time. To use OpenFace for this I either need to get it to run inside Unity, or have it run as a separate application and stream the output to Unity. The latter approach would also be better for a setup using a wireless head-mounted camera since I could potentially run it on a Raspberry Pi (or similar) on the helmet itself so it only has to send the output instead of trying to stream high quality video.

The Wiki page says "Messaging server (Coming soon)" so I was wondering how "soon" that might be to determine whether I'd have to implement it myself.

NumesSanguis commented 6 years ago

Well, good timing that I saw this issue. I've been working on that for my master's thesis for about a year now: https://github.com/NumesSanguis/FACSvatar

A colleague modified the OpenFace GUI to include ZeroMQ as a messaging server, which then streams the data to Python, which applies some smoothing functions and allows real-time modification of AU data (and machine learning). This data is then received by 1 or more Unity clients.

Currently this OpenFace modification is in the GUI, and you would have to rebuild OpenFace from source and use the modified "MainWindow.xaml.cs" and "config.xml" found in the folder "openface" on my GitHub. More details: https://github.com/TadasBaltrusaitis/OpenFace/issues/375

I hope that in the future we, or someone, can build ZeroMQ into the C++ code instead of the C# code in the GUI, to have it work in real-time on Ubuntu as well. All other modules work on both Ubuntu 16.04 and Windows 7/10, so you would only need 1 Windows PC with a strong CPU.
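To give a rough picture of what the Python side of the pipeline does (just a sketch; the ports, the JSON message format, and the smoothing step here are made up for illustration and are not the actual FACSvatar interface, so check my repo for the real ones):

import json
import zmq

# Schematic "receive -> modify -> forward" step of the pipeline:
# subscribe to AU data coming from OpenFace, apply a modification,
# and publish the result for one or more Unity clients.
context = zmq.Context()
sub = context.socket(zmq.SUB)
sub.connect("tcp://localhost:5570")        # hypothetical OpenFace/ZeroMQ endpoint
sub.setsockopt_string(zmq.SUBSCRIBE, "")   # receive everything

pub = context.socket(zmq.PUB)
pub.bind("tcp://*:5571")                   # hypothetical endpoint for Unity clients

while True:
    frame = json.loads(sub.recv_string())  # assumes JSON-encoded AU values
    # toy "smoothing": dampen all AU intensities a bit
    frame = {au: value * 0.9 for au, value in frame.items()}
    pub.send_string(json.dumps(frame))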

A new wave of improvements is coming in about 1 week, including multi-3D-character support. I've submitted a paper about my project to a conference, so hopefully I'll soon hear that it has been accepted.

SilentSin commented 6 years ago

That's awesome timing. I was working on another task last week, but I'm back on this now and I'm having a bit of trouble. I'm not at all confident in C++ or Python, so I'd really appreciate any help you can give.


Using your MainWindow.xaml.cs in OpenFaceOffline gave quite a few compile errors on functions with the wrong number of parameters (and you were using doubles where it wants floats). I think I did a reasonable job of using the original file to fix those errors in yours, and I got it to compile and run, but when I go to File/Open Webcam it opens another window that says "Loading Webcams" and then crashes after a few seconds.

Running inside VS in Debug mode gives the following exception:

System.ArgumentException: 'Value does not fall within the expected range.'

In OpenCVWrappers.h line 230

The line is return gcnew WriteableBitmap(Width, Height, 72, 72, Format, nullptr);

Width and Height are 0 and Format is Gray8


I also tried following the instructions under "Documentation & simple how to run" (https://github.com/NumesSanguis/FACSvatar) but haven't managed to get that to work either.


Then I found this page: https://facsvatar.readthedocs.io/en/latest/quickrun.html

NumesSanguis commented 6 years ago

Thank you for trying out my framework, and sorry that the documentation is limited and outdated. A lot has changed in the 'multi-user' branch, and I'm planning to push that to the master branch in 1 week. After that the project will be a bit less alpha, so I would recommend waiting for that.

Since this is the OpenFace GitHub issue tracker, it's best to describe here only problems related to compiling OpenFace. Please open an issue on my GitHub page for other problems.

Do you also get the webcam crash when you use the already compiled version (https://github.com/TadasBaltrusaitis/OpenFace/releases)?

If @TadasBaltrusaitis is okay with it, I can upload a modified OpenFaceOffline.exe with ZeroMQ included, until a more robust version has been created?

SilentSin commented 6 years ago

OK cool, I'll wait for your update. I'll post an issue on your page if I still have trouble with it.

I do get the same crash with the precompiled version so I've posted a separate issue for that: https://github.com/TadasBaltrusaitis/OpenFace/issues/506.

NumesSanguis commented 6 years ago

MainWindow.xaml.cs has been updated for v2.0.3 and I included some install instructions in the file OpenFace_how-to-compile.txt: https://github.com/NumesSanguis/FACSvatar/tree/master/openface

Although this probably won't solve your other issue with OpenFace.

To go briefly into your other problems (please put any new problems in a GitHub issue on my repo): FACSvatar works in terms of modules, which usually receive data, apply some modification, and forward the new data.

TadasBaltrusaitis commented 6 years ago

Integration of a messaging server is definitely on my todo list, but I have a number of things to do before that (stability and performance fixes). Realistically, I expect to build proper support for ZMQ in both C++ and C# in September/October, but there's always a small chance it might come earlier.

marc101101 commented 5 years ago

@TadasBaltrusaitis Sorry in advance, but have you already implemented a messaging server?

TadasBaltrusaitis commented 5 years ago

Sorry, not yet.

jaysonjeg commented 4 years ago

@TadasBaltrusaitis , are you planning on implementing the real-time messaging server?

jaysonjeg commented 4 years ago

I'm going to be using OpenFace for a research study on facial expressiveness in mental illness, and I'm currently using the FACSvatar modification to OpenFaceOffline, which sends ZeroMQ messages. But it would be better to have ZeroMQ functionality in the command-line FeatureExtraction.exe, so I can drive OpenFace acquisition/messaging from within an experimental code loop.
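Roughly what I have in mind (just a sketch with placeholder paths and timings; a ZeroMQ-enabled FeatureExtraction would stream the data while each trial runs):

import subprocess
import time

# Hypothetical experiment loop: start and stop OpenFace acquisition around each trial.
for trial in range(10):
    proc = subprocess.Popen(["./FeatureExtraction", "-device", "0", "-aus"],
                            stdout=subprocess.DEVNULL)
    time.sleep(5.0)    # run the trial while OpenFace tracks/streams
    proc.terminate()   # stop acquisition between trials
    proc.wait()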

marc101101 commented 4 years ago

I built a really hacky solution. I had no time and no experience with C++. What I did was extend lib/local/Utilities/src/RecorderOpenFace.cpp. The basic idea is to log the data and pipe it to a Python script, which in my case pushes the data to a websocket. It worked ... on my system :)

lib/local/Utilities/src/RecorderOpenFace.cpp:

std::cout << "relevant_entry:" << "{"
        << "'face_id': " << face_id << ", "
        << "'frame_number': " << frame_number << ", " 
        << "'landmark_detection_success': " << landmark_detection_success << ", " 
        << "'landmark_detection_confidence': " << landmark_detection_confidence << ", " 
        << "'gaze_direction_0_x': " << gaze_direction0.x << ", " 
        << "'gaze_direction_0_y': " << gaze_direction0.y << ", " 
        << "'gaze_direction_0_z': " << gaze_direction0.z << ", " 
        << "'gaze_direction_1_x': " << gaze_direction1.x << ", "
        << "'gaze_direction_1_y': " << gaze_direction1.y << ", "
        << "'gaze_direction_1_z': " << gaze_direction1.z << ", "
        << "'gaze_angle_x': " << gaze_angle[0] << ", "
        << "'gaze_angle_y': " << gaze_angle[1] << ", "
        << "'pose_Tx': " << head_pose[0] << ", "
        << "'pose_Ty': " << head_pose[1] << ", "
        << "'pose_Tz': " << head_pose[2] << ", "
        << "'pose_Rx': " << head_pose[3] << ", "
        << "'pose_Ry': " << head_pose[4] << ", "
        << "'pose_Rz': " << head_pose[5] << ", ";

        for (int i = 0; i < eye_landmarks2D.size(); ++i)
        {
            std::cout << "'eye_lmk_x_" << i << "':" << eye_landmarks2D[i].x << ", ";
        }
        for (int i = 0; i < eye_landmarks2D.size(); ++i)
        {
            std::cout << "'eye_lmk_y_" << i << "':" << eye_landmarks2D[i].y << ", ";
        }

        for (int i = 0; i < eye_landmarks3D.size(); ++i)
        {
            std::cout << "'eye_lmk_X_" << i << "':" << eye_landmarks3D[i].x << ", ";
        }
        for (int i = 0; i < eye_landmarks3D.size(); ++i)
        {
            std::cout << "'eye_lmk_Y_" << i << "':" << eye_landmarks3D[i].y << ", ";
        }

        for (int i = 0; i < eye_landmarks3D.size(); ++i)
        {
            if(i == eye_landmarks3D.size()-1){
                std::cout << "'eye_lmk_Z_" << i << "':" << eye_landmarks3D[i].z;
            }
            else{
                std::cout << "'eye_lmk_Z_" << i << "':" << eye_landmarks3D[i].z << ", ";
            }
        }   

        std::cout << "}" <<  std::endl;

Python script:

#!/usr/bin/python
import sys
import socketio
import time
import ast

class ClientGazeLogger:

    client_name = ""
    sio = None

    def __init__(self, ip_address, client_name):
        self.client_name = client_name
        self.sio = socketio.Client()
        self.sio.connect('http://' + ip_address + ':5000')

        self.watch_data_stream()

    def watch_data_stream(self):
        k = 0

        try:
            buff = ''
            while True:
                buff += sys.stdin.read(1)
                if buff.startswith("relevant_entry"):
                    if buff.endswith('\n'):
                        print("Message received!" + str(time.time()))
                        message_to_push = str(buff)[15:]
                        message_to_push = ast.literal_eval(message_to_push)
                        message_to_push['timestamp'] = time.time()
                        message_to_push['client_id'] = self.client_name
                        try:
                            self.push_to_server(message_to_push)
                        except Exception as e:
                            print(e)
                        buff = ''
                        k = k + 1
                else:
                    if buff.endswith('\n'):
                        print(buff[:-1])
                        buff = ''

        except KeyboardInterrupt:
            sys.stdout.flush()
            pass
        print("End of Log: " + str(k))

    def push_to_server(self, message):
        self.sio.emit('message', message)

if __name__ == "__main__":
    print("INFO: Client  up and tracking")
    print("INFO: IP address - " + str(sys.argv[1]))
    print("INFO: Client name - " + str(sys.argv[2]))

    client = ClientGazeLogger(sys.argv[1], sys.argv[2])

Command to start everything:

OpenFace/build/bin/FeatureExtraction -tracked -q -device 0 -gaze | python3  ../../client/client.py $WEBSOCKET_IP $CURRENT_CAM_ID

WEBSOCKET_IP: the IP of the server in the network that provides the websocket. CURRENT_CAM_ID: I used multiple cams; the identifier can be anything, or you can leave it out.

marc101101 commented 4 years ago

@jaysonjeg I used OpenFace a year ago for my master's thesis, so I am not 100% up to date on what has changed since then. But following my hacky approach above, you can also use ZeroMQ instead of socket.io (which I used).
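For example, the push_to_server part of the client script above could publish over ZeroMQ instead of emitting to socket.io; roughly like this (just a sketch, the bind address is arbitrary):

import json
import zmq

# Sketch: a ZeroMQ publisher as a drop-in replacement for the socket.io client.
# Subscribers would connect to the same port.
class ZmqGazeLogger:
    def __init__(self, bind_address="tcp://*:5557"):
        context = zmq.Context()
        self.pub = context.socket(zmq.PUB)
        self.pub.bind(bind_address)

    def push_to_server(self, message):
        # message is the dict parsed from the "relevant_entry:" line
        self.pub.send_string(json.dumps(message))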

jaysonjeg commented 4 years ago

@marc101101 , where inside RecorderOpenFace.cpp, did you add that?

marc101101 commented 4 years ago

@jaysonjeg https://github.com/TadasBaltrusaitis/OpenFace/blob/master/lib/local/Utilities/src/RecorderOpenFace.cpp#L341 here

NumesSanguis commented 4 years ago

@jaysonjeg Author of FACSvatar here. Did you get a ZeroMQ version in C++ to work? It would be great if users of OpenFace could use it in real-time without needing a Windows PC.

jaysonjeg commented 4 years ago

Hi, I'm afraid not. I ended up just using the FACSvatar version of OpenFace with ZeroMQ.


ThomasJanssoone commented 2 years ago

I actually managed to add a simple ZMQ messaging process in FeatureExtraction.cpp and updated the CMakeLists to compile it.

I still have some checks to do, but I would be glad to share the whole code with you.

Best

///////////////////////////////////////////////////////////////////////////////
// Copyright (C) 2017, Carnegie Mellon University and University of Cambridge,
// all rights reserved.
//
// ACADEMIC OR NON-PROFIT ORGANIZATION NONCOMMERCIAL RESEARCH USE ONLY
//
// BY USING OR DOWNLOADING THE SOFTWARE, YOU ARE AGREEING TO THE TERMS OF THIS LICENSE AGREEMENT.  
// IF YOU DO NOT AGREE WITH THESE TERMS, YOU MAY NOT USE OR DOWNLOAD THE SOFTWARE.
//
// License can be found in OpenFace-license.txt

//     * Any publications arising from the use of this software, including but
//       not limited to academic journal and conference publications, technical
//       reports and manuals, must cite at least one of the following works:
//
//       OpenFace 2.0: Facial Behavior Analysis Toolkit
//       Tadas Baltrušaitis, Amir Zadeh, Yao Chong Lim, and Louis-Philippe Morency
//       in IEEE International Conference on Automatic Face and Gesture Recognition, 2018  
//
//       Convolutional experts constrained local model for facial landmark detection.
//       A. Zadeh, T. Baltrušaitis, and Louis-Philippe Morency,
//       in Computer Vision and Pattern Recognition Workshops, 2017.    
//
//       Rendering of Eyes for Eye-Shape Registration and Gaze Estimation
//       Erroll Wood, Tadas Baltrušaitis, Xucong Zhang, Yusuke Sugano, Peter Robinson, and Andreas Bulling 
//       in IEEE International. Conference on Computer Vision (ICCV),  2015 
//
//       Cross-dataset learning and person-specific normalisation for automatic Action Unit detection
//       Tadas Baltrušaitis, Marwa Mahmoud, and Peter Robinson 
//       in Facial Expression Recognition and Analysis Challenge, 
//       IEEE International Conference on Automatic Face and Gesture Recognition, 2015 
//
///////////////////////////////////////////////////////////////////////////////

// FeatureExtraction.cpp : Defines the entry point for the feature extraction console application.

// Local includes
#include "LandmarkCoreIncludes.h"

#include <Face_utils.h>
#include <FaceAnalyser.h>
#include <GazeEstimation.h>
#include <RecorderOpenFace.h>
#include <RecorderOpenFaceParameters.h>
#include <SequenceCapture.h>
#include <Visualizer.h>
#include <VisualizationUtils.h>

// Standard library includes used by the ZeroMQ additions below
#include <cstring>
#include <sstream>

// ZeroMQ C++ bindings (cppzmq)
#define ZMQ_STATIC
#include <zmq.hpp>

#ifndef CONFIG_DIR
#define CONFIG_DIR "~"
#endif

#define INFO_STREAM( stream ) \
std::cout << stream << std::endl

#define WARN_STREAM( stream ) \
std::cout << "Warning: " << stream << std::endl

#define ERROR_STREAM( stream ) \
std::cout << "Error: " << stream << std::endl

static void printErrorAndAbort(const std::string & error)
{
    std::cout << error << std::endl;
}

#define FATAL_STREAM( stream ) \
printErrorAndAbort( std::string( "Fatal error: " ) + stream )

std::vector<std::string> get_arguments(int argc, char **argv)
{

    std::vector<std::string> arguments;

    // First argument is reserved for the name of the executable
    for (int i = 0; i < argc; ++i)
    {
        arguments.push_back(std::string(argv[i]));
    }
    return arguments;
}

int main(int argc, char **argv)
{

    std::vector<std::string> arguments = get_arguments(argc, argv);

    // no arguments: output usage
    if (arguments.size() == 1)
    {
        std::cout << "For command line arguments see:" << std::endl;
        std::cout << " https://github.com/TadasBaltrusaitis/OpenFace/wiki/Command-line-arguments";
        return 0;
    }

    // Load the modules that are being used for tracking and face analysis
    // Load face landmark detector
    LandmarkDetector::FaceModelParameters det_parameters(arguments);
    // Always track gaze in feature extraction
    LandmarkDetector::CLNF face_model(det_parameters.model_location);

    if (!face_model.loaded_successfully)
    {
        std::cout << "ERROR: Could not load the landmark detector" << std::endl;
        return 1;
    }

    // Load facial feature extractor and AU analyser
    FaceAnalysis::FaceAnalyserParameters face_analysis_params(arguments);
    FaceAnalysis::FaceAnalyser face_analyser(face_analysis_params);

    if (!face_model.eye_model)
    {
        std::cout << "WARNING: no eye model found" << std::endl;
    }

    if (face_analyser.GetAUClassNames().size() == 0 && face_analyser.GetAURegNames().size() == 0)
    {
        std::cout << "WARNING: no Action Unit models found" << std::endl;
    }

    Utilities::SequenceCapture sequence_reader;

    // A utility for visualizing the results
    Utilities::Visualizer visualizer(arguments);

    // Tracking FPS for visualization
    Utilities::FpsTracker fps_tracker;
    fps_tracker.AddFrame();

    // Prepare the ZeroMQ context and a PUB socket; subscribers connect to port 5556
    zmq::context_t context(1);
    zmq::socket_t sock(context, ZMQ_PUB);
    sock.bind("tcp://*:5556");
    std::cout << "ZeroMQ PUB socket bound to tcp://*:5556" << std::endl;

    while (true) // this is not a for loop as we might also be reading from a webcam
    {

        // The sequence reader chooses what to open based on command line arguments provided
        if (!sequence_reader.Open(arguments))
            break;

        INFO_STREAM("Device or file opened");

        if (sequence_reader.IsWebcam())
        {
            INFO_STREAM("WARNING: using a webcam in feature extraction, Action Unit predictions will not be as accurate in real-time webcam mode");
            INFO_STREAM("WARNING: using a webcam in feature extraction, forcing visualization of tracking to allow quitting the application (press q)");
            visualizer.vis_track = true;
        }

        cv::Mat captured_image;

        Utilities::RecorderOpenFaceParameters recording_params(arguments, true, sequence_reader.IsWebcam(),
            sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy, sequence_reader.fps);
        if (!face_model.eye_model)
        {
            recording_params.setOutputGaze(false);
        }
        Utilities::RecorderOpenFace open_face_rec(sequence_reader.name, recording_params, arguments);

        if (recording_params.outputGaze() && !face_model.eye_model)
            std::cout << "WARNING: no eye model defined, but outputting gaze" << std::endl;

        captured_image = sequence_reader.GetNextFrame();

        // For reporting progress
        double reported_completion = 0;

        INFO_STREAM("Starting tracking");
        while (!captured_image.empty())
        {
            // Converting to grayscale
            cv::Mat_<uchar> grayscale_image = sequence_reader.GetGrayFrame();

            // The actual facial landmark detection / tracking
            bool detection_success = LandmarkDetector::DetectLandmarksInVideo(captured_image, face_model, det_parameters, grayscale_image);

            // Gaze tracking, absolute gaze direction
            cv::Point3f gazeDirection0(0, 0, 0); cv::Point3f gazeDirection1(0, 0, 0); cv::Vec2d gazeAngle(0, 0);

            if (detection_success && face_model.eye_model)
            {
                GazeAnalysis::EstimateGaze(face_model, gazeDirection0, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy, true);
                GazeAnalysis::EstimateGaze(face_model, gazeDirection1, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy, false);
                gazeAngle = GazeAnalysis::GetGazeAngle(gazeDirection0, gazeDirection1);
            }

            // Do face alignment
            cv::Mat sim_warped_img;
            cv::Mat_<double> hog_descriptor; int num_hog_rows = 0, num_hog_cols = 0;

            // Perform AU detection and HOG feature extraction, as this can be expensive only compute it if needed by output or visualization
            if (recording_params.outputAlignedFaces() || recording_params.outputHOG() || recording_params.outputAUs() || visualizer.vis_align || visualizer.vis_hog || visualizer.vis_aus)
            {
                face_analyser.AddNextFrame(captured_image, face_model.detected_landmarks, face_model.detection_success, sequence_reader.time_stamp, sequence_reader.IsWebcam());
                face_analyser.GetLatestAlignedFace(sim_warped_img);
                face_analyser.GetLatestHOG(hog_descriptor, num_hog_rows, num_hog_cols);
            }

            // Work out the pose of the head from the tracked model
            cv::Vec6d pose_estimate = LandmarkDetector::GetPose(face_model, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy);

            // Keeping track of FPS
            fps_tracker.AddFrame();
            // Build a comma-separated string of AU presence/intensity pairs,
            // e.g. "AU01: 1==0.35--,AU02: 0==0--", and publish it over ZeroMQ.
            // Note: the presence (class) and intensity (regression) lists can differ
            // in length, so only iterate over indices valid for both.
            std::vector<std::pair<std::string, double>> aus_pres_vec = face_analyser.GetCurrentAUsClass();
            std::vector<std::pair<std::string, double>> aus_intensity_vec = face_analyser.GetCurrentAUsReg();
            std::stringstream ss;
            for (size_t i = 0; i < aus_pres_vec.size() && i < aus_intensity_vec.size(); ++i)
            {
                if (i != 0)
                    ss << ",";
                ss << aus_pres_vec[i].first << ": " << aus_pres_vec[i].second << "==" << aus_intensity_vec[i].second << "--";
            }
            std::string aus_str = ss.str();
            std::cout << aus_str << std::endl;
            zmq::message_t reply(aus_str.size());
            memcpy(reply.data(), aus_str.c_str(), aus_str.size());
            sock.send(reply, zmq::send_flags::none);
            // Displaying the tracking visualizations
            visualizer.SetImage(captured_image, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy);
            visualizer.SetObservationFaceAlign(sim_warped_img);
            visualizer.SetObservationHOG(hog_descriptor, num_hog_rows, num_hog_cols);
            visualizer.SetObservationLandmarks(face_model.detected_landmarks, face_model.detection_certainty, face_model.GetVisibilities());
            visualizer.SetObservationPose(pose_estimate, face_model.detection_certainty);
            visualizer.SetObservationGaze(gazeDirection0, gazeDirection1, LandmarkDetector::CalculateAllEyeLandmarks(face_model), LandmarkDetector::Calculate3DEyeLandmarks(face_model, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy), face_model.detection_certainty);
            visualizer.SetObservationActionUnits(face_analyser.GetCurrentAUsReg(), face_analyser.GetCurrentAUsClass());
            visualizer.SetFps(fps_tracker.GetFPS());

            // detect key presses
            char character_press = visualizer.ShowObservation();

            // quit processing the current sequence (useful when in Webcam mode)
            if (character_press == 'q')
            {
                break;
            }

            // Setting up the recorder output
            open_face_rec.SetObservationHOG(detection_success, hog_descriptor, num_hog_rows, num_hog_cols, 31); // The number of channels in HOG is fixed at the moment, as using FHOG
            open_face_rec.SetObservationVisualization(visualizer.GetVisImage());
            open_face_rec.SetObservationActionUnits(face_analyser.GetCurrentAUsReg(), face_analyser.GetCurrentAUsClass());
            open_face_rec.SetObservationLandmarks(face_model.detected_landmarks, face_model.GetShape(sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy),
                face_model.params_global, face_model.params_local, face_model.detection_certainty, detection_success);
            open_face_rec.SetObservationPose(pose_estimate);
            open_face_rec.SetObservationGaze(gazeDirection0, gazeDirection1, gazeAngle, LandmarkDetector::CalculateAllEyeLandmarks(face_model), LandmarkDetector::Calculate3DEyeLandmarks(face_model, sequence_reader.fx, sequence_reader.fy, sequence_reader.cx, sequence_reader.cy));
            open_face_rec.SetObservationTimestamp(sequence_reader.time_stamp);
            open_face_rec.SetObservationFaceID(0);
            open_face_rec.SetObservationFrameNumber(sequence_reader.GetFrameNumber());
            open_face_rec.SetObservationFaceAlign(sim_warped_img);
            open_face_rec.WriteObservation();
            open_face_rec.WriteObservationTracked();

            // Reporting progress
            if (sequence_reader.GetProgress() >= reported_completion / 10.0)
            {
                std::cout << reported_completion * 10 << "% ";
                if (reported_completion == 10)
                {
                    std::cout << std::endl;
                }
                reported_completion = reported_completion + 1;
            }

            // Grabbing the next frame in the sequence
            captured_image = sequence_reader.GetNextFrame();

        }

        INFO_STREAM("Closing output recorder");
        open_face_rec.Close();
        INFO_STREAM("Closing input reader");
        sequence_reader.Close();
        INFO_STREAM("Closed successfully");

        if (recording_params.outputAUs())
        {
            INFO_STREAM("Postprocessing the Action Unit predictions");
            face_analyser.PostprocessOutputFile(open_face_rec.GetCSVFile());
        }

        // Reset the models for the next video
        face_analyser.Reset();
        face_model.Reset();

    }

    return 0;
}
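
On the receiving side, any ZeroMQ subscriber can pick up the published AU strings. For example, a small Python client could look like the following (just a sketch; the parsing assumes the "AU: presence==intensity--" format produced by the loop above):

import zmq

# Minimal sketch of a subscriber for the AU strings published above
# (the PUB socket is bound to tcp://*:5556 in the modified FeatureExtraction.cpp).
context = zmq.Context()
sub = context.socket(zmq.SUB)
sub.connect("tcp://localhost:5556")
sub.setsockopt_string(zmq.SUBSCRIBE, "")  # no topic filtering

while True:
    msg = sub.recv_string()
    # Each entry looks like "AU01: 1==0.35--"
    for entry in msg.split(","):
        entry = entry.rstrip("-")
        if "==" not in entry or ":" not in entry:
            continue
        name_presence, intensity = entry.split("==")
        name, presence = name_presence.split(":")
        print(name.strip(), float(presence), float(intensity))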