While testing the rapids branch from both the UI and the terminal, I get the following error, which prevents matches from being generated and the semantic search index from being built.
[2022-03-30 20:03:39,020: INFO] [luigi-interface] Running Worker with 1 processes
[2022-03-30 20:03:39,020: INFO] [luigi-interface] [pid 10041] Worker Worker(salt=146995060, workers=1, host=f90e167d6ad7, username=root, pid=10041) running CondenseFingerprintsTask(config=Config(sources=SourcesConfig(root='data/', extensions=['mp4', 'ogv', 'webm', 'avi', 'flv', 'mkv'], hash_mode='file', hash_cache='data/representations/hashes'), repr=RepresentationConfig(directory='data/representations', storage_type=<StorageType.DETECT: 'detect'>), database=DatabaseConfig(use=True, uri='postgresql://postgres:admin@postgres:5432/videodeduplicationdb'), processing=ProcessingConfig(video_list_filename='video_dataset_list.txt', match_distance=0.75, filter_dark_videos=True, filter_dark_videos_thr=2, min_video_duration_seconds=3, detect_scenes=True, minimum_scene_duration=2, pretrained_model_local_path=None, frame_sampling=1, save_frames=False, keep_fileoutput=True), templates=TemplatesConfig(source_path='data/templates/', distance=0.07, distance_min=0.05, override=False, extensions=('png', 'jpg', 'jpeg')), security=SecurityConfig(master_key_path=None), file_storage=FileStorageConfig(directory='file-storage'), logging=LoggingConfig(file_path='./processing_error.log', file_format='[%(asctime)s: %(levelname)s] [%(name)s] %(message)s', file_level=<LogLevel.ERROR: 40>, console_format='[%(asctime)s: %(levelname)s] %(message)s', console_level=<LogLevel.INFO: 20>)), prefix=., fingerprint_size=500)
[2022-03-30 20:03:39,021: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/frames
[2022-03-30 20:03:39,022: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/frame_level
[2022-03-30 20:03:39,023: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/video_level
[2022-03-30 20:03:39,023: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/video_signatures
[2022-03-30 20:03:39,184: INFO] [task.CondenseFingerprintsTask] Reading existing condensed fingerprints
[2022-03-30 20:03:39,184: INFO] [task.CondenseFingerprintsTask] Loaded 0 previously condensed fingerprints
[2022-03-30 20:03:39,262: INFO] [task.CondenseFingerprintsTask] Collecting file-keys since the very beginning
[2022-03-30 20:03:39,415: INFO] [task.CondenseFingerprintsTask] Collected 444 file keys
[2022-03-30 20:03:39,415: INFO] [task.CondenseFingerprintsTask] Reading fingerprints
[2022-03-30 20:03:39,416: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/frames
[2022-03-30 20:03:39,416: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/frame_level
[2022-03-30 20:03:39,417: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/video_level
[2022-03-30 20:03:39,418: INFO] [winnow.utils.repr] Detected simple path-based repr-storage in /project/data/representations/video_signatures
[2022-03-30 20:03:39,687: INFO] [task.CondenseFingerprintsTask] Creating ndarray with fingerprints
[2022-03-30 20:03:39,687: INFO] [task.CondenseFingerprintsTask] Creating file-keys DataFrame
[2022-03-30 20:03:39,688: INFO] [task.CondenseFingerprintsTask] Loaded 444 new fingerprints.
[2022-03-30 20:03:39,688: INFO] [task.CondenseFingerprintsTask] Writing 444 fingerprints to ['data/representations/condensed_fingerprints/condensed_fingerprints__2022_03_25_192103493531.npy', 'data/representations/condensed_fingerprints/condensed_fingerprints__2022_03_25_192103493531.files.csv']
[2022-03-30 20:03:41,690: ERROR] [luigi-interface] [pid 10041] Worker Worker(salt=146995060, workers=1, host=f90e167d6ad7, username=root, pid=10041) failed CondenseFingerprintsTask(config=Config(sources=SourcesConfig(root='data/', extensions=['mp4', 'ogv', 'webm', 'avi', 'flv', 'mkv'], hash_mode='file', hash_cache='data/representations/hashes'), repr=RepresentationConfig(directory='data/representations', storage_type=<StorageType.DETECT: 'detect'>), database=DatabaseConfig(use=True, uri='postgresql://postgres:admin@postgres:5432/videodeduplicationdb'), processing=ProcessingConfig(video_list_filename='video_dataset_list.txt', match_distance=0.75, filter_dark_videos=True, filter_dark_videos_thr=2, min_video_duration_seconds=3, detect_scenes=True, minimum_scene_duration=2, pretrained_model_local_path=None, frame_sampling=1, save_frames=False, keep_fileoutput=True), templates=TemplatesConfig(source_path='data/templates/', distance=0.07, distance_min=0.05, override=False, extensions=('png', 'jpg', 'jpeg')), security=SecurityConfig(master_key_path=None), file_storage=FileStorageConfig(directory='file-storage'), logging=LoggingConfig(file_path='./processing_error.log', file_format='[%(asctime)s: %(levelname)s] [%(name)s] %(message)s', file_level=<LogLevel.ERROR: 40>, console_format='[%(asctime)s: %(levelname)s] %(message)s', console_level=<LogLevel.INFO: 20>)), prefix=., fingerprint_size=500)
Traceback (most recent call last):
File "/anaconda/envs/winnow/lib/python3.9/site-packages/luigi/worker.py", line 191, in run
new_deps = self._run_get_new_deps()
File "/anaconda/envs/winnow/lib/python3.9/site-packages/luigi/worker.py", line 133, in _run_get_new_deps
task_gen = self.task.run()
File "/project/winnow/pipeline/luigi/condense.py", line 263, in run
target.write(condensed, new_results_time)
File "/project/winnow/pipeline/luigi/condense.py", line 206, in write
condensed.file_keys_df.to_csv(keys_out)
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/core/generic.py", line 3466, in to_csv
return DataFrameRenderer(formatter).to_csv(
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/io/formats/format.py", line 1105, in to_csv
csv_formatter.save()
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 257, in save
self._save()
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 262, in _save
self._save_body()
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 300, in _save_body
self._save_chunk(start_i, end_i)
File "/anaconda/envs/winnow/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 311, in _save_chunk
libwriters.write_csv_rows(
File "pandas/_libs/writers.pyx", line 55, in pandas._libs.writers.write_csv_rows
TypeError: write() argument must be str, not bytes
[2022-03-30 20:03:41,698: ERROR] [task_queue.tasks] Error occurred while executing luigi tasks: CondenseFingerprintsTask(config=Config(sources=SourcesConfig(root='data/', extensions=['mp4', 'ogv', 'webm', 'avi', 'flv', 'mkv'], hash_mode='file', hash_cache='data/representations/hashes'), repr=RepresentationConfig(directory='data/representations', storage_type=<StorageType.DETECT: 'detect'>), database=DatabaseConfig(use=True, uri='postgresql://postgres:admin@postgres:5432/videodeduplicationdb'), processing=ProcessingConfig(video_list_filename='video_dataset_list.txt', match_distance=0.75, filter_dark_videos=True, filter_dark_videos_thr=2, min_video_duration_seconds=3, detect_scenes=True, minimum_scene_duration=2, pretrained_model_local_path=None, frame_sampling=1, save_frames=False, keep_fileoutput=True), templates=TemplatesConfig(source_path='data/templates/', distance=0.07, distance_min=0.05, override=False, extensions=('png', 'jpg', 'jpeg')), security=SecurityConfig(master_key_path=None), file_storage=FileStorageConfig(directory='file-storage'), logging=LoggingConfig(file_path='./processing_error.log', file_format='[%(asctime)s: %(levelname)s] [%(name)s] %(message)s', file_level=<LogLevel.ERROR: 40>, console_format='[%(asctime)s: %(levelname)s] %(message)s', console_level=<LogLevel.INFO: 20>)), prefix=., fingerprint_size=500), write() argument must be str, not bytes
[2022-03-30 20:03:41,705: INFO] [luigi-interface] Informed scheduler that task CondenseFingerprintsTask_Config_sources_S_500___8510f14eba has status FAILED
[2022-03-30 20:03:41,709: INFO] [luigi-interface]
===== Luigi Execution Summary =====
Scheduled 5 tasks of which:
* 2 complete ones were encountered:
- 1 ExifTask(...)
- 1 SignaturesTask(...)
* 1 failed:
- 1 CondenseFingerprintsTask(...)
* 2 were left pending, among these:
* 2 had failed dependencies:
- 1 AnnoyIndexTask(...)
- 1 DBMatchesTask(...)
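For reference, the final `TypeError` in the traceback is the message Python's text-mode streams raise when they are handed bytes. One possible reading (an assumption, not confirmed from the code in `condense.py`) is that the handle passed to `to_csv` is treated as binary by pandas while the underlying stream only accepts `str`. The sketch below uses only the standard library: it reproduces the same message and shows a hypothetical helper, `write_text`, that writes a string to either kind of handle. The helper name and the `isinstance` check are illustrative and may not fit every wrapper class.

```python
import io


def write_text(handle, text, encoding="utf-8"):
    """Write `text` to `handle`, whether it is a text or a binary stream.

    Hypothetical helper: assumes text streams subclass io.TextIOBase.
    """
    if isinstance(handle, io.TextIOBase):
        handle.write(text)
    else:
        handle.write(text.encode(encoding))


# Reproduce the message from the traceback: bytes written to a text stream.
text_handle = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
try:
    text_handle.write(b"path,hash\n")
    message = ""
except TypeError as error:
    message = str(error)

# The helper accepts both kinds of handle.
binary_out = io.BytesIO()
write_text(binary_out, "path,hash\n")

text_out = io.StringIO()
write_text(text_out, "path,hash\n")
```

If this reading is right, a possible workaround would be to render the CSV to a string first (`DataFrame.to_csv()` returns a `str` when called without a target) and then write that string through a helper like the one above, instead of passing the handle to pandas directly.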