Open vpv-csc opened 1 week ago
I see, it's the same issue that someone else had reported... but without a trace and how it got there, it wasn't something I was ready to look at.
So with the man 2 select, does that imply you're running a lot of processes at the same time? Like using many file descriptors across multiple processes on the same machine?
I will consider looking into poll and epoll. I see the Python select module supports them, but it's not a straightforward drop-in replacement, and we'd probably have to run some benchmarks to see if the whole call chain gets slower. select.select is pretty fast AFAIK.
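For illustration, here is a minimal sketch of what a poll-based read loop could look like. It is not pyexiftool's actual code: the function name read_until_suffix and its parameters are hypothetical, and it only assumes a POSIX system where select.poll is available (it is not on Windows). Unlike select.select, poll takes the descriptor number directly, so it is not bound by FD_SETSIZE.

```python
import os
import select


def read_until_suffix(fd, suffix, block_size=4096):
    """Read from fd until the stripped output ends with suffix (hypothetical helper).

    Uses select.poll instead of select.select, so descriptor numbers
    of 1024 and above are accepted.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    chunks = bytearray()
    while not chunks.rstrip().endswith(suffix):
        for ready_fd, event in poller.poll():  # blocks until fd has an event
            if ready_fd != fd:
                continue
            if event & (select.POLLIN | select.POLLHUP):
                data = os.read(fd, block_size)
                if not data:
                    # EOF: the writer closed the pipe before sending the suffix
                    return bytes(chunks)
                chunks += data
            elif event & (select.POLLERR | select.POLLNVAL):
                raise OSError(f"poll reported an error on fd {fd}")
    return bytes(chunks)
```

Whether this is slower than select.select for a single descriptor would still need the benchmarks mentioned above; the sketch only shows that the change in calling convention is small.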
> So with the man 2 select, does that imply you're running a lot of processes at the same time? Like using many file descriptors across multiple processes on the same machine?
Yes. We often have thousands of image files in a single Submission Information Package (it's either a zip or a tar file with metadata in an XML file).
I was not able to reproduce this on my workstation. I created a directory with 4096 JPEG files, but running pyexiftool in a loop so that it takes one file at a time (i.e. not giving it a list of filenames) like we use it in our file-scraper did not cause this issue. It might be that you have to be (un)lucky enough to get an FD > 1024 for this to happen.
Our ability to test in production is somewhat limited but we'll see what we can do. I think we could count FDs and list the largest FD numbers while pyexiftool is running.
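Counting FDs and listing the largest FD numbers could be done along these lines. This is only a sketch under the assumption of a Linux system with /proc mounted; the helper name fd_stats and its return layout are made up for illustration and are not part of pyexiftool or file-scraper.

```python
import os

SELECT_LIMIT = 1024  # assumption: FD_SETSIZE as documented in man 2 select on Linux


def fd_stats(pid="self", top=5):
    """Summarise the open file descriptors of a process via /proc (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    # Note: when pid is "self", listing the directory briefly opens one extra
    # descriptor, which shows up in the result.
    fds = sorted(int(name) for name in os.listdir(fd_dir) if name.isdigit())
    return {
        "count": len(fds),
        "largest": fds[-top:][::-1],  # highest descriptor numbers first
        "at_or_over_select_limit": sum(fd >= SELECT_LIMIT for fd in fds),
    }


if __name__ == "__main__":
    print(fd_stats())
```

Calling fd_stats() from inside the scraping process right before the pyexiftool call, or fd_stats(pid) from outside with the process ID, would show whether any descriptor has reached 1024.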
@vpv-csc if you can reproduce this please try running:
$ strace -fo strace.out python3 test.py
And check that all opened FDs are properly closed. I can't reproduce this locally and all FDs are closed properly.
Still can't reproduce the bug in our specific use case, but I can reproduce it with:
$ ulimit -n 8192
$ mkdir files
$ for i in {0..4096}; do echo $i > files/$i; done
$ python3
>>> import exiftool
>>> exiftool.ExifToolHelper().get_metadata("files/1")
[{'SourceFile': 'files/1', 'ExifTool:ExifToolVersion': 12.7, 'File:FileName': 1, 'File:Directory': 'files', 'File:FileSize': 2, 'File:FileModifyDate': '2024:09:19 10:36:18+00:00', 'File:FileAccessDate': '2024:09:23 10:46:32+00:00', 'File:FileInodeChangeDate': '2024:09:19 10:36:18+00:00', 'File:FilePermissions': 100644, 'File:FileType': 'TXT', 'File:FileTypeExtension': 'TXT', 'File:MIMEType': 'text/plain', 'File:MIMEEncoding': 'us-ascii', 'File:Newlines': '\n', 'File:LineCount': 1, 'File:WordCount': 1}]
>>> files = []
>>> for i in range(4096):
...     files.append(open(f"files/{i}"))
...
>>> exiftool.ExifToolHelper().get_metadata("files/1")
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 812, in run
    self._ver = self._parse_ver()
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 1199, in _parse_ver
    return self.execute("-ver").strip()
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 132, in execute
    result: Union[str, bytes] = super().execute(*str_bytes_params, **kwargs)
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 1009, in execute
    raw_stdout = _read_fd_endswith(fdout, seq_ready.encode(self._encoding), self._block_size)
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 142, in _read_fd_endswith
    inputready, outputready, exceptready = select.select([fd], [], [])
ValueError: filedescriptor out of range in select()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 293, in get_metadata
    return self.get_tags(files, None, params=params)
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 378, in get_tags
    ret = self.execute_json(*exec_params)
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 1127, in execute_json
    result = self.execute("-j", *params) # stdout
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 120, in execute
    self.run()
  File "/usr/lib/python3.9/site-packages/exiftool/helper.py", line 150, in run
    super().run()
  File "/usr/lib/python3.9/site-packages/exiftool/exiftool.py", line 816, in run
    raise ExifToolVersionError(f"Error retrieving Exiftool info. Is your Exiftool version ('exiftool -ver') >= required version ('{constants.EXIFTOOL_MINIMUM_VERSION}')?")
exiftool.exceptions.ExifToolVersionError: Error retrieving Exiftool info. Is your Exiftool version ('exiftool -ver') >= required version ('12.15')?
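The same failure can probably be shown without exiftool or thousands of open files. The sketch below is an assumption-laden stand-in for the session above: instead of opening 4096 files it duplicates one pipe end onto a descriptor number of at least 1024 with fcntl.F_DUPFD, then tries select.select and select.poll on it. The function name high_fd_select_demo is hypothetical, and the value 1024 assumes the usual Linux FD_SETSIZE.

```python
import fcntl
import os
import resource
import select

FD_SETSIZE = 1024  # assumption: the usual limit on Linux


def high_fd_select_demo():
    """Return (high_fd, select_error, poll_ready), or None if the FD limit is too low."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    wanted = FD_SETSIZE + 16
    raised = False
    if soft != resource.RLIM_INFINITY and soft < wanted:
        if hard != resource.RLIM_INFINITY and hard < wanted:
            return None  # cannot get a descriptor >= 1024 in this environment
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        raised = True
    r, w = os.pipe()
    high_fd = None
    try:
        # Duplicate the read end onto the lowest free descriptor >= 1024.
        high_fd = fcntl.fcntl(r, fcntl.F_DUPFD, FD_SETSIZE)
        os.write(w, b"x")  # make the pipe readable
        select_error = None
        try:
            select.select([high_fd], [], [], 0)
        except ValueError as exc:
            select_error = str(exc)
        poller = select.poll()
        poller.register(high_fd, select.POLLIN)
        poll_ready = any(
            fd == high_fd and bool(ev & select.POLLIN) for fd, ev in poller.poll(0)
        )
        return high_fd, select_error, poll_ready
    finally:
        for fd in (r, w, high_fd):
            if fd is not None:
                os.close(fd)
        if raised:
            resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    print(high_fd_select_demo())
```

On a system where the limit can be raised, select_error is expected to carry the same "filedescriptor out of range in select()" message as in the traceback, while poll reports the descriptor as readable.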
Fixed here: https://github.com/sylikc/pyexiftool/pull/98.
We are doing digital preservation. In some cases we are scraping metadata from thousands of image files in the same Python process. As far as I understand, pyexiftool handles multiple files in the -stay_open mode. We are seeing the ValueError: filedescriptor out of range in select() error a lot in production.
If you happen to be interested in the check-sip-digital-objects(-3) command seen in the backtrace, that's here: https://github.com/Digital-Preservation-Finland/dpres-ipt And our scraping tool is here: https://github.com/Digital-Preservation-Finland/file-scraper/
We are running version 0.5.5 that we packaged ourselves. It seems that 0.5.6 does not change anything related to this issue. Exiftool is 12.70.
Someone has reported this same issue here earlier: https://exiftool.org/forum/index.php?topic=11067.0
man 2 select
says