The problem with detection faces in another thread

axmadjon commented 6 years ago

Version: 19.7
Where did you get dlib: pip install dlib
Platform: Ubuntu 16.04 x64 Intel® Core™ i7-7700 CPU @ 3.60GHz × 8 GeForce GTX 1060 6GB/PCIe/SSE2
Compiler: compiled python

I have a problem that I cannot find a solution for on Google. dlib is installed with CUDA and Intel BLAS support.

Face detection uses the "cnn" model.

When I try to detect faces in another thread, I get this error: RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-build-Bm_q1t/dlib/dlib/dnn/gpu_data.cpp:178. code: 3, reason: initialization error

I initialize the objects in the main thread and pass the instance to the other thread:

cnn_face_detection_model = face_recognition_models.cnn_face_detector_model_location()
cnn_face_detector = dlib.cnn_face_detection_model_v1(cnn_face_detection_model)

The other "thread" is started as a multiprocessing.Process.

When I run the detection in the main thread, it works fine.

How can I solve this problem?

davisking commented 6 years ago

Post a minimal program that reproduces the problem.

axmadjon commented 6 years ago

import multiprocessing as mp

import cv2
import dlib
import numpy as np
import requests

try:
    import face_recognition_models
except ImportError:
    print("Please install `face_recognition_models` with this command before using `face_recognition`:")
    print()
    print("pip install git+https://github.com/ageitgey/face_recognition_models")
    quit()

def trim_css_to_bounds(css, image_shape):
    return max(css[0], 0), min(css[1], image_shape[1]), min(css[2], image_shape[0]), max(css[3], 0)

def rect_to_css(rect):
    return rect.top(), rect.right(), rect.bottom(), rect.left()

class Dlib():
    def __init__(self):
        self.cnn_face_detection_model = face_recognition_models.cnn_face_detector_model_location()
        self.cnn_face_detector = dlib.cnn_face_detection_model_v1(self.cnn_face_detection_model)

    def detect_face_in_dlib(self, image):
        cnn_detect = self.cnn_face_detector(image, 1)
        return [trim_css_to_bounds(rect_to_css(face.rect), image.shape) for face in cnn_detect]

    def detect(self, frame, event_listener, locations_user):
        face_locations = self.detect_face_in_dlib(image=frame)

        del locations_user[:]

        for location in face_locations:
            locations_user.append(location)

        event_listener.clear()

def main():
    stream = requests.get('http://192.168.10.29:5000/camera_stream?camera_id=0', stream=True)
    frame_interval = 8
    count = 0
    bytes = b''

    # detector = facenet.DlibDetect(detect_type='cnn')

    detect_dlib = Dlib()

    event = mp.Event()
    queue = mp.Manager()
    detected_users = queue.list()

    while True:
        bytes += stream.raw.read(40960)
        a = bytes.find(b'\xff\xd8')
        b = bytes.find(b'\xff\xd9')

        if a != -1 and b != -1:
            jpg = bytes[a:b + 2]
            bytes = bytes[b + 2:]
            # np.frombuffer replaces the deprecated binary mode of np.fromstring
            image = cv2.imdecode(np.frombuffer(jpg, dtype=np.uint8), cv2.IMREAD_COLOR)

            if count % frame_interval == 0 and not event.is_set():
                print(str(count % frame_interval))
                event.set()
                # detect_dlib.detect(image, event, detected_users)
                mp.Process(target=detect_dlib.detect, args=(image, event, detected_users,)).start()

            for (top, right, bottom, left) in detected_users:
                cv2.rectangle(image, (left, top), (right, bottom), (0, 0, 255), 2)

            cv2.imshow('Video', image)

            count += 1

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

if __name__ == '__main__':
    main()

axmadjon commented 6 years ago

If I call detect_dlib.detect directly (the commented-out call in the loop), it works fine. Calling the same method through mp.Process (the next line) gives the error: RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-build-Bm_q1t/dlib/dlib/dnn/gpu_data.cpp:178. code: 3, reason: initialization error

davisking commented 6 years ago

Python's subprocess parallelism is pretty ill-conceived in general. The documentation for what it really does is pretty vague, but it seems like it just calls fork() and then continues on as normal, which is just wrong. You can't fork a process and then expect hardware-bound resources (e.g. a GPU) to still be accessible from the forked process.

Anyway, don't do this. You shouldn't need to, since the GPU code is already parallelized.
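For readers who still need a separate process, the sketch below shows a workaround often suggested for CUDA errors after fork(). It is not something confirmed in this thread for dlib. It starts the child with the "spawn" start method and builds the detector inside the child, so the parent never hands an initialized GPU object across a fork. The names `load_cnn_detector`, `detection_worker`, and `start_detection_worker` are hypothetical; only the two dlib / face_recognition_models calls come from the program above.

```python
import multiprocessing as mp


def load_cnn_detector():
    """Build the CNN detector. The imports are local so that dlib (and CUDA)
    are first touched in whichever process calls this function."""
    import dlib
    import face_recognition_models
    model_path = face_recognition_models.cnn_face_detector_model_location()
    return dlib.cnn_face_detection_model_v1(model_path)


def rect_to_css(rect):
    # Same (top, right, bottom, left) ordering as in the original program.
    return rect.top(), rect.right(), rect.bottom(), rect.left()


def detection_worker(make_detector, frames_in, results_out):
    """Runs in the child process: create the detector here, then handle
    frames from the queue until a None sentinel arrives."""
    detector = make_detector()
    while True:
        frame = frames_in.get()
        if frame is None:
            break
        detections = detector(frame, 1)
        results_out.put([rect_to_css(d.rect) for d in detections])


def start_detection_worker(make_detector=load_cnn_detector):
    """Start one long-lived worker with the 'spawn' start method
    (assumption: a fresh interpreter avoids the inherited-CUDA-state problem)."""
    ctx = mp.get_context("spawn")
    frames_in = ctx.Queue(maxsize=1)
    results_out = ctx.Queue()
    proc = ctx.Process(
        target=detection_worker,
        args=(make_detector, frames_in, results_out),
        daemon=True,
    )
    proc.start()
    return proc, frames_in, results_out
```

In a script this would be called under an `if __name__ == '__main__':` guard, which the spawn start method requires: `proc, frames_in, results_out = start_detection_worker()`, then `frames_in.put(image)` for each frame and `frames_in.put(None)` to stop. A single long-lived worker also avoids reloading the model for every frame, which the original per-frame `mp.Process(...)` call would do under spawn. Whether this removes the error on a particular dlib build is untested here.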

axmadjon commented 6 years ago

@davisking Thanks!

anujonthemove commented 6 years ago

@davisking I agree that it is already parallelized, but I am trying to spawn multiple processes using Python's multiprocessing.Process(), and it throws the same RuntimeError as mentioned above. Each of these processes runs the same piece of code but with different input videos/images.

davisking / dlib

The problem with detection faces in another thread #1013