Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retrieval (Lerner et al., ECIR'24)
0%| | 0/20 [00:00<?, ?ba/s]/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/facenet_pytorch/models/utils/detect_face.py:183: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
batch_boxes, batch_points = np.array(batch_boxes), np.array(batch_points)
/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/facenet_pytorch/models/mtcnn.py:339: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a
list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
boxes = np.array(boxes)
/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/facenet_pytorch/models/mtcnn.py:340: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a
list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
probs = np.array(probs)
/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/facenet_pytorch/models/mtcnn.py:341: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a
list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
points = np.array(points)
0%| | 0/20 [00:12<?, ?ba/s]
Traceback (most recent call last):
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/data/meerqat/ViQuAE/meerqat/image/face_detection.py", line 179, in <module>
dataset = dataset_detect_faces(dataset, model=model, image_key=image_key, save_root_path=save_root_path)
File "/home/data/meerqat/ViQuAE/meerqat/image/face_detection.py", line 146, in dataset_detect_faces
dataset = dataset.map(dataset_detect_face, batched=True, fn_kwargs=kwargs, batch_size=batch_size)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_dataset.py", line 2590, in map
desc=desc,
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_dataset.py", line 584, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_dataset.py", line 551, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/fingerprint.py", line 480, in wrapper
out = func(self, *args, **kwargs)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_dataset.py", line 2985, in _map_single
writer.write_batch(batch)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_writer.py", line 524, in write_batch
arrays.append(pa.array(typed_sequence))
File "pyarrow/array.pxi", line 229, in pyarrow.lib.array
File "pyarrow/array.pxi", line 110, in pyarrow.lib._handle_arrow_array_protocol
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/arrow_writer.py", line 182, in __arrow_array__
out = list_of_np_array_to_pyarrow_listarray(data)
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/features/features.py", line 1350, in list_of_np_array_to_pyarrow_listarray
[numpy_to_pyarrow_listarray(arr, type=type) if arr is not None else None for arr in l_arr]
File "/home/users/oadjali/.conda/envs/v100/lib/python3.7/site-packages/datasets/features/features.py", line 1342, in list_of_pa_arrays_to_pyarrow_listarray
values = pa.concat_arrays(l_arr)
File "pyarrow/array.pxi", line 2526, in pyarrow.lib.concat_arrays
File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: arrays to be concatenated must be identically typed, but float and null were encountered.
face_recognition
For this, we simply need to check whether the face embedding/landmark/… is empty instead of None.
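A minimal sketch of such a check, assuming that a missing face can show up either as None (the older behaviour) or as an empty list/array (the newer one). The names `is_missing` and `keep_images_with_faces` are hypothetical and are not taken from the meerqat code; this only illustrates the idea and is not the actual fix.

```python
from typing import Any, List, Optional, Sequence


def is_missing(value: Optional[Sequence[Any]]) -> bool:
    """Return True when no face data is available for an image.

    Hypothetical helper: accepts both conventions, None and an
    empty container, so the same code runs in either case.
    """
    if value is None:
        return True
    # len() works for lists, tuples and NumPy arrays alike, and avoids
    # the ambiguous truth value of a multi-element array.
    return len(value) == 0


def keep_images_with_faces(embeddings: Sequence[Optional[Sequence[Any]]]) -> List[int]:
    """Return the indices of items that have a non-empty face embedding."""
    return [i for i, emb in enumerate(embeddings) if not is_missing(emb)]
```

For example, `keep_images_with_faces([[0.1, 0.2], None, [], [0.3]])` returns `[0, 3]`: both the None entry and the empty entry are treated as "no face detected".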
Current image.face_detection and image.face_recognition scripts work with datasets version 1.8.0 but fail with datasets version 2.4.0.
This seems related to this issue: https://github.com/huggingface/datasets/issues/3676
cc @OA256864 @grimalPaul
The traceback above is from face_detection.