benetech / VideoDeduplication

GNU General Public License v3.0

Investigate extract_features error #192

Open johnhbenetech opened 3 years ago

johnhbenetech commented 3 years ago

Searching for Dataset Video Files
Number of files found: 995638
Traceback (most recent call last):
  File "extract_features.py", line 111, in <module>
    main()
  File "/anaconda/envs/winnow/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/anaconda/envs/winnow/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/anaconda/envs/winnow/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/anaconda/envs/winnow/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "extract_features.py", line 63, in main
    remaining_videos_path = [path for path in videos if not reps.frame_level.exists(storepath(path), get_hash(path))]
  File "extract_features.py", line 63, in <listcomp>
    remaining_videos_path = [path for path in videos if not reps.frame_level.exists(storepath(path), get_hash(path))]
  File "/project/winnow/utils/utils.py", line 151, in get_hash
    with open(fp, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/project/data/s3/yt2020-low/-I2zOYQD-8Q/-I2zOYQD-8Q.mp4'

johnhbenetech commented 3 years ago

As we discussed, this may be related to the spottiness of s3fs: a file that is found during the initial dataset scan may no longer be readable by the time it is opened for hashing.

Can we check on this and add error handling everywhere a file is accessed?
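
For reference, here is a minimal sketch of what that error handling could look like around the hashing step. It is not the project's code: `safe_hash` and `split_readable` are hypothetical helper names, and SHA-256 is only an assumption, since the traceback does not show which algorithm `get_hash` uses. The idea is to catch `OSError` (which covers `FileNotFoundError`) per file, log it, and keep going instead of aborting the whole run.

```python
import hashlib
import logging
from typing import Iterable, List, Optional, Tuple

logger = logging.getLogger(__name__)


def safe_hash(path: str, chunk_size: int = 1 << 20) -> Optional[str]:
    """Return the SHA-256 hex digest of a file, or None if it cannot be read.

    Hypothetical stand-in for get_hash; the hash algorithm is an assumption.
    """
    digest = hashlib.sha256()
    try:
        with open(path, "rb") as f:
            # Read in chunks so large videos are not loaded into memory at once.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        logger.warning("Skipping unreadable file %s: %s", path, exc)
        return None
    return digest.hexdigest()


def split_readable(paths: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Hash every path, separating readable files from ones that failed."""
    readable: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for path in paths:
        file_hash = safe_hash(path)
        if file_hash is None:
            skipped.append(path)
        else:
            readable.append((path, file_hash))
    return readable, skipped
```

With something like this, the list comprehension on line 63 of extract_features.py could iterate over the `(path, hash)` pairs in `readable`, and the `skipped` list could be reported at the end of the run or retried later if the s3fs mount recovers.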