dlib's compute_face_descriptor failed with pycuda.autoinit

congphase commented 4 years ago

Expected Behavior

I wrote a Python script that takes an image containing a clearly visible face and runs face detection on it with a method called LFFD. The resulting bounding box coordinates are passed to face_recognition.face_encodings(), which in turn calls dlib's compute_face_descriptor() to get the face encoding.

Current Behavior

The resulting bounding box coordinates were correct (I drew the rectangle to check). But when I passed them to face_encodings() (after converting the coordinates to the css order that face_encodings() expects), the call failed with the error below, ending in "Segmentation fault (core dumped)", every single time.
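For readers unfamiliar with the "css type" mentioned above, here is a minimal sketch of the reordering, assuming the detector returns left/top/right/bottom edges in pixels. The helper names rect_to_css and clamp_css_to_image are hypothetical, and the clamping step is an extra precaution, not something the report says was done.

```python
def rect_to_css(left, top, right, bottom):
    """Reorder box edges into CSS order: (top, right, bottom, left)."""
    return (top, right, bottom, left)


def clamp_css_to_image(css, image_height, image_width):
    """Clip a CSS-order box so that it lies inside the image."""
    top, right, bottom, left = css
    return (
        max(top, 0),
        min(right, image_width),
        min(bottom, image_height),
        max(left, 0),
    )


# A box that sticks out of a 100x80 (width x height) image on two sides.
box = clamp_css_to_image(
    rect_to_css(left=-5, top=10, right=120, bottom=90),
    image_height=80,
    image_width=100,
)
```

A list containing such tuples is the shape passed as the second argument to face_encodings() in the reproduction script in this report.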

Traceback (most recent call last):
  File "predict_tensorrt_video.py", line 673, in <module>
    main()
  File "predict_tensorrt_video.py", line 88, in inner
    retval = fnc(*args, **kwargs)
  File "predict_tensorrt_video.py", line 667, in main
    run_inference(args.video_in, args.video_out, candidate_id, current_time)
  File "predict_tensorrt_video.py", line 565, in run_inference
    face_encoding = get_face_encodings(frame, css_type_face_location, 0)[0]
  File "/home/gate/.virtualenvs/lffd/lib/python3.6/site-packages/face_recognition/api.py", line 210, in face_encodings
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
  File "/home/gate/.virtualenvs/lffd/lib/python3.6/site-packages/face_recognition/api.py", line 210, in <listcomp>
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
RuntimeError: Error while calling cudnnConvolutionForward( context(), &alpha, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &beta, descriptor(output), output.device()) in file /home/gate/dlib-19.17/dlib/cuda/cudnn_dlibapi.cpp:1007. code: 7, reason: A call to cuDNN failed
Segmentation fault (core dumped)

I then removed LFFD from the code and replaced it with dlib's CNN detector, leaving everything else as before (so only dlib functions were involved), but the same error occurred. Keeping that code, I commented out the import statements one at a time, ran the script after each change, and watched the output. The error kept occurring until I commented out import pycuda.autoinit (originally there for LFFD); after that, everything ran perfectly.
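The manual bisection described above can be automated. The following is a rough sketch, not the script used in this report: it reruns a reproduction in a fresh interpreter with one import left out each time and records which omissions make the run succeed. The function name find_culprit_imports and its parameters are hypothetical.

```python
import subprocess
import sys


def find_culprit_imports(imports, body, timeout=60):
    """Return the import lines whose removal lets `body` exit cleanly.

    `imports` is a list of import statements, `body` is the rest of the
    script. It is assumed that the full script (all imports) fails.
    """
    culprits = []
    for skipped in imports:
        kept = [line for line in imports if line != skipped]
        script = "\n".join(kept + [body])
        # Use a child process so a segfault only kills the child.
        result = subprocess.run(
            [sys.executable, "-c", script],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        # On POSIX, death by signal (e.g. SIGSEGV) gives a negative
        # return code; an uncaught exception gives a non-zero one.
        if result.returncode == 0:
            culprits.append(skipped)
    return culprits
```

For the case in this report, imports would hold lines such as "import cv2", "import face_recognition" and "import pycuda.autoinit", and body would hold the rest of the reproduction script. This only works if body does not itself depend on the skipped import, so it is a starting point rather than a complete tool.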

Steps to Reproduce

Run this code:

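import cv2
import dlib
import face_recognition

# Importing this module creates a CUDA context as a side effect.
# Commenting it out makes the error disappear.
import pycuda.autoinit

# Load the image (OpenCV reads BGR) and convert it to RGB for dlib.
test_img = cv2.imread(<path to the image having a face>)
cnn_detector = dlib.cnn_face_detection_model_v1(<path to the dlib's cnn face detection model>)
test_img_dlib = cv2.cvtColor(test_img, cv2.COLOR_BGR2RGB)

# Detect faces without upsampling and take the first detection.
face_locations = cnn_detector(test_img_dlib, 0)

face_location = face_locations[0]
bb_left = face_location.rect.left()
bb_top = face_location.rect.top()
bb_right = face_location.rect.right()
bb_bottom = face_location.rect.bottom()

# face_encodings() expects boxes in css order: (top, right, bottom, left).
css_type_face_location = [(bb_top, bb_right, bb_bottom, bb_left)]

# This call fails with the cuDNN error and the segmentation fault.
face_encoding = face_recognition.face_encodings(test_img_dlib, css_type_face_location, 0)[0]

print(f'Result:\n{face_encoding}')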

Version: 19.17
Where did you get dlib: http://dlib.net/files/
Platform: I used NVIDIA Jetson Nano:
- Type: NANO/TX1
- Jetpack: 4.2.1 [L4T 32.2.0]
- GPU-Arch: 5.3
- Libraries:
  - CUDA: 10.0.326
  - cuDNN: 7.5.0.56-1+cuda10.0
  - TensorRT: 5.1.6.1-1+cuda10.0
  - VisionWorks: 1.6.0.500n
  - OpenCV: 4.1.0 compiled CUDA: YES

lsb_release -a output:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

Compiler: Python 3.6.8

Note: I commented out one line in dlib before compiling, as mentioned in this, to avoid a silent bug on the Jetson Nano board.

dlib-issue-bot commented 4 years ago

Warning: this issue has been inactive for 35 days and will be automatically closed on 2020-01-21 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 4 years ago

Warning: this issue has been inactive for 42 days and will be automatically closed on 2020-01-21 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 4 years ago

Notice: this issue has been closed because it has been inactive for 45 days. You may reopen this issue if it has been closed in error.
