denshoproject / ddr-cmdln

Command-line tools for automating the Densho Digital Repository's various processes.

Access files not made for large files #111

Open gjost opened 5 years ago

gjost commented 5 years ago

Access files are not being made for large files. File ingest does not crash but no access files are created. 768MB seems to be the size at which problems start.

gjost commented 5 years ago

Tried resizing a large file (771MB) from CLI using the ImageMagick command used by DDR.imaging.thumbnail, which returned an error.

$ convert "/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif"[0] -resize '1000x1000' /tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.jpg
convert: unable to extent pixel cache `No such file or directory' @ fatal/cache.c/CacheSignalHandler/3328.

Looks like RAM or disk space issues.

gjost commented 5 years ago

I can't even resize this on my laptop which has 16G RAM and lots of space on the SSD.

$ convert -monitor ./ddr-densho-332-49_Master_rev4_81.tif -resize '1000x1000' ddr-densho-332-49_Master_rev4_81.jpg
convert-im6.q16: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
convert-im6.q16: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244.
convert-im6.q16: cache resources exhausted `./ddr-densho-332-49_Master_rev4_81.tif' @ error/cache.c/OpenPixelCache/3945.
convert-im6.q16: no images defined `ddr-densho-332-49_Master_rev4_81.jpg' @ error/convert.c/ConvertImageCommand/3258.
$ convert -version
Version: ImageMagick 6.9.7-4 Q16 x86_64 20170114 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC Modules OpenMP 
Delegates (built-in): bzlib djvu fftw fontconfig freetype jbig jng jp2 jpeg lcms lqr ltdl lzma openexr pangocairo png tiff wmf x xml zlib

gjost commented 5 years ago

FWIW tried opening the TIFF in Gimp and got this:

TIFF Image Message.
Warning: The image you are loading has 16 bits per channel. GIMP can only handle 8 bit,
so it will be converted for you. Information will be lost because of this conversion.

gjost commented 5 years ago

Ran this in DDR shell, with some print statements in DDR.imaging.thumbnail:

$ python ddrlocal/manage.py shell
>>> from DDR import imaging
>>> src1 = '/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif'
>>> dest1 = '/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.jpg'
>>> geometry = '1000x1000'
>>> imaging.thumbnail(src1, dest1, geometry)
thumbnail(/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif, /tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.jpg, 1000x1000)
analysis {'std_err': 'identify: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag\' @ warning/tiff.c/TIFFWarnings/881.', 'std_out': '/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif TIFF 10840x12414 10840x12414+0+0 16-bit sRGB 807.4MB 0.010u 0:00.010', 'format': 'TIFF', 'image': True, 'can_thumbnail': None, 'frames': 1, 'path': '/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif'}
convert "/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.tif"[0] -resize '1000x1000' /tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.jpg
<Response [convert]>
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    imaging.thumbnail(src1, dest1, geometry)
  File "/opt/ddr-local-develop/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/imaging.py", line 148, in thumbnail
    data['size'] = os.path.getsize(dest)
  File "/opt/ddr-local-develop/venv/ddrlocal/lib/python2.7/genericpath.py", line 49, in getsize
    return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory: '/tmp/ddrshared/ddr-densho-332-49_Master_rev4_81.jpg'

I think this part is significant: 'identify: Incompatible type for "RichTIFFIPTC"; tag ignored. \'TIFFFetchNormalTag\' @ warning/tiff.c/TIFFWarnings/881.'
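The traceback above shows the failure surfacing only when `os.path.getsize(dest)` is called on a file that `convert` never wrote. A minimal sketch of a guard, assuming Python 3 and a hypothetical helper named `run_convert` (not part of `DDR.imaging`), would check the return code and the destination before reading its size:

```python
import os
import subprocess


def run_convert(argv, dest):
    """Run an ImageMagick-style command and verify that it produced `dest`.

    Hypothetical helper for illustration; not the DDR.imaging implementation.
    Returns a dict with the return code, stderr text, and output size (or None).
    """
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    result = {
        "returncode": proc.returncode,
        "stderr": proc.stderr.decode("utf-8", errors="replace"),
        "size": None,
    }
    # Only stat the destination if the command succeeded and the file exists;
    # a failed convert leaves nothing behind to measure.
    if proc.returncode == 0 and os.path.exists(dest):
        result["size"] = os.path.getsize(dest)
    return result
```

A caller could then treat `size is None` as "no access file was made" and report the captured `stderr` instead of raising `OSError`.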

GeoffFroh commented 5 years ago

I think the message about the tag is not the issue -- looks like just a warning message that IM can't handle an optional IPTC tag. See: https://www.imagemagick.org/discourse-server/viewtopic.php?t=33838

gjost commented 5 years ago

Looks like it's possible to tell ImageMagick to use different memory allocations for large files. This fails:

$ convert ./ddr-densho-332-49_Master_rev4_81.tif -resize '1000x1000' ddr-densho-332-49_Master_rev4_81.jpg
convert-im6.q16: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
convert-im6.q16: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244.
convert-im6.q16: cache resources exhausted `./ddr-densho-332-49_Master_rev4_81.tif' @ error/cache.c/OpenPixelCache/3945.
convert-im6.q16: no images defined `ddr-densho-332-49_Master_rev4_81.jpg' @ error/convert.c/ConvertImageCommand/3258.

But this works (added -limit memory 2GB -limit map 4GB):

$ convert -limit memory 2GB -limit map 4GB ./ddr-densho-332-49_Master_rev4_81.tif -resize '1000x1000' ddr-densho-332-49_Master_rev4_81.jpg

Commit 882b9c6 makes it possible to add these commandline options to ddrlocal-local.cfg's [cmdln]access_file_options.
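As a rough illustration of how such a setting could feed into the command line, here is a sketch in Python 3. The exact value format of `access_file_options` is an assumption (a plain string of flags), and `build_convert_argv` is a hypothetical name, not the function introduced by the commit:

```python
import configparser
import shlex

# Illustrative config text; the value format is an assumption.
SAMPLE_CFG = """
[cmdln]
access_file_options = -limit memory 2GB -limit map 4GB
"""


def build_convert_argv(cfg_text, src, dest, geometry):
    """Build a convert argument list, inserting any configured extra options."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # An absent section or option simply means "no extra flags".
    options = parser.get("cmdln", "access_file_options", fallback="")
    # "[0]" selects the first frame, as in the command printed by
    # DDR.imaging.thumbnail earlier in this thread.
    return ["convert"] + shlex.split(options) + [src + "[0]", "-resize", geometry, dest]


if __name__ == "__main__":
    print(build_convert_argv(SAMPLE_CFG, "in.tif", "out.jpg", "1000x1000"))
```

Passing the result as a list to `subprocess` avoids shell quoting problems with file names.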

gjost commented 5 years ago

I'm now thinking it might be better to just add these flags automatically if the image is above a certain size (e.g. 768MB).
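A minimal sketch of that idea, assuming Python 3, that "768MB" means 768 × 1024 × 1024 bytes, and using hypothetical names throughout (this is not the code in any ddr-cmdln commit):

```python
import os

# Threshold taken from this thread; interpreting MB as binary megabytes
# is an assumption.
LARGE_FILE_THRESHOLD = 768 * 1024 * 1024
# Flags that worked in the manual test earlier in this thread.
LARGE_FILE_OPTIONS = ["-limit", "memory", "2GB", "-limit", "map", "4GB"]


def limit_options_for_size(size_bytes, threshold=LARGE_FILE_THRESHOLD):
    """Return extra convert options for files at or above `threshold` bytes."""
    if size_bytes >= threshold:
        return list(LARGE_FILE_OPTIONS)
    return []


def limit_options_for_file(path):
    """Look up the file size on disk and pick options accordingly."""
    return limit_options_for_size(os.path.getsize(path))
```

Keeping the size check separate from the file lookup makes the threshold logic easy to test without a large TIFF on hand.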

gjost commented 5 years ago

GeoffFroh wrote: "I think the message about the tag is not the issue -- looks like just a warning message that IM can't handle an optional IPTC tag. See: https://www.imagemagick.org/discourse-server/viewtopic.php?t=33838"

Makes sense

gjost commented 5 years ago

Should be fixed in commit 259ec9d.

GeoffFroh commented 5 years ago

Need to re-verify issue

gjost commented 5 years ago

The memory limit flags now return the same error as without the flags:

$ convert -limit memory 2GB -limit map 4GB ./ddr-densho-332-49_Master_rev4_81.tif -resize '1000x1000' ddr-densho-332-49_Master_rev4_81.jpg
convert-im6.q16: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
convert-im6.q16: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244.
convert-im6.q16: cache resources exhausted `./ddr-densho-332-49_Master_rev4_81.tif' @ error/cache.c/OpenPixelCache/3945.
convert-im6.q16: no images defined `ddr-densho-332-49_Master_rev4_81.jpg' @ error/convert.c/ConvertImageCommand/3258.

Was my Oct 4 comment written when I was still developing on Deb8?

gjost commented 5 years ago

"ImageMagick/ImageMagick: Memory Issues" explains how settings in /etc/ImageMagick-6/policy.xml restrict the amount of cache that is available. I bet Debian tightened up the limits in response to the recent ImageMagick DDoS bug.

I tried commenting out the policy settings and was able to successfully create a JPG. I'll include a customized policy.xml doc in the Makefile install-ddr-cmdln task and in the packaging task.
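To see which limits a given policy file actually imposes, something like the following Python 3 sketch could help. The sample XML uses placeholder values rather than the real Debian defaults, it omits the DOCTYPE header a real policy.xml carries, and `resource_limits` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Illustrative policy text; values are placeholders, not the Debian defaults.
SAMPLE_POLICY = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="resource" name="map" value="512MiB"/>
  <policy domain="resource" name="disk" value="1GiB"/>
  <!-- <policy domain="resource" name="area" value="128MB"/> -->
</policymap>"""


def resource_limits(policy_xml):
    """Return {name: value} for every active resource policy in the XML text."""
    root = ET.fromstring(policy_xml)
    # Commented-out entries are dropped by the parser, so only limits that
    # ImageMagick would actually apply are returned.
    return {
        p.get("name"): p.get("value")
        for p in root.iter("policy")
        if p.get("domain") == "resource"
    }


if __name__ == "__main__":
    for name, value in sorted(resource_limits(SAMPLE_POLICY).items()):
        print(name, value)
```

Comparing this output between a working and a failing machine is one way to confirm whether the packaged policy is the difference.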

gjost commented 5 years ago

Fixed as of ddr-cmdln commit bd37fa7 and ddr-local commit 85bd660.

pkikawa commented 4 years ago

111reopen.zip

failing with PermissionError(13, 'Permission denied') on large binaries only

pkikawa commented 4 years ago

comparing it to an old "known good" version and i don't see any differences

oldknowngoodPolicy.zip