jwilk / didjvu

DjVu encoder with foreground/background separation
https://jwilk.net/software/didjvu
GNU General Public License v2.0

RuntimeError: std::bad_alloc #6

jwilk closed this issue 10 years ago

jwilk commented 10 years ago

Issue reported by @jsbien:

Cf. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=743677

BTW, my intention is to make a completely lossless conversion; that's why I use --fg-subsample 1 --bg-subsample 1 (I hope I understand the meaning of these options correctly).

jwilk commented 10 years ago

It was reported that when processing multiple pages, didjvu can run out of memory:

$ didjvu bundle --fg-subsample 1 --bg-subsample 1 -o Kazania_1-1.djvu *.jpg
0015.jpg:
- reading image
- converting to DjVu
  BG44=/tmp/didjvu.9mRyH8.iw44 --> "/tmp/didjvu.9mRyH8.iw44" (192953 bytes)
  BG44=/tmp/didjvu.8JrMo7.iw44 --> "/tmp/didjvu.8JrMo7.iw44" (273246 bytes)
- 0.413 bits/pixel; 2.337:1, 57.21% saved, 3179253 bytes in, 1360504 bytes out
[…]
0019.jpg:
- reading image
- converting to DjVu
Traceback (most recent call last):
  File "/usr/bin/didjvu", line 9, in <module>
    didjvu.main()
  File "/usr/share/didjvu/lib/didjvu.py", line 202, in __init__
    parser.parse_args(actions=self)
  File "/usr/share/didjvu/lib/cli.py", line 217, in parse_args
    return action(o)
  File "/usr/share/didjvu/lib/didjvu.py", line 343, in bundle
    self.bundle_simple(o)
  File "/usr/share/didjvu/lib/didjvu.py", line 372, in bundle_simple
    parallel_for(o, self._bundle_simple_page, o.input, o.masks, component_filenames)
  File "/usr/share/didjvu/lib/didjvu.py", line 58, in parallel_for
    f(o, *args)
  File "/usr/share/didjvu/lib/didjvu.py", line 359, in _bundle_simple_page
    self.encode_one(o, input, mask, component, None)
  File "/usr/share/didjvu/lib/didjvu.py", line 286, in encode_one
    djvu_doc = image_to_djvu(width, height, image, mask, options=o)
  File "/usr/share/didjvu/lib/didjvu.py", line 157, in image_to_djvu
    bg_djvu = make_layer(image, mask, subsample_bg, options.bg_options)
  File "/usr/share/didjvu/lib/didjvu.py", line 138, in make_layer
    image, mask = subsampler(image, mask, options)
  File "/usr/share/didjvu/lib/didjvu.py", line 134, in subsample_bg
    image = image.resize(dim, 1)
RuntimeError: std::bad_alloc

This does not happen when processing each page separately, so it looks like a memory leak somewhere.

I'll investigate this further.
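
A rough way to confirm this kind of leak is to record the peak resident set size of the process after each page and see whether it keeps climbing. The following is a minimal sketch, assuming a Unix system (the `resource` module is not available on Windows). `peak_rss`, `track_growth`, and `process_page` are hypothetical names for illustration, not didjvu functions. It uses `getrusage` rather than `tracemalloc` because the latter does not see allocations made inside a C++ extension.

```python
import resource


def peak_rss():
    """Return the peak resident set size of this process.

    The unit is platform-dependent: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def track_growth(process_page, pages):
    """Call process_page(page) for each page and record the peak RSS afterwards.

    process_page is a hypothetical stand-in for whatever encodes one page.
    Returns a list of (page, peak_rss) tuples in processing order.
    """
    samples = []
    for page in pages:
        process_page(page)
        samples.append((page, peak_rss()))
    return samples
```

If the peak grows by roughly one page's worth of image data on every iteration, memory is probably not being released between pages.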

jwilk commented 10 years ago

I've narrowed the memory leak down to the subsample_fg function. This function is only called when you use non-standard quality options, such as --fg-subsample or --bg-subsample.

jwilk commented 10 years ago

So it's a memory leak in Gamera: https://bugs.debian.org/747548

It would be nice to work around it in didjvu, but I don't know how to do it yet.
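
One generic way to contain a leak in a native library is to run the leaky call in a short-lived child process, so that the operating system reclaims the memory when the child exits. The following is a minimal sketch of that idea, not necessarily what didjvu does. It assumes a Unix system (it relies on the `fork` start method) and a result that can be pickled; `run_isolated` and `_child` are hypothetical names.

```python
import multiprocessing


def _child(conn, func, args):
    """Run func(*args) in the child and send the outcome back to the parent."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_isolated(func, *args):
    """Run func(*args) in a forked child process and return its result.

    Memory leaked by func disappears when the child exits.
    Raises RuntimeError if func fails or the child dies without answering.
    """
    ctx = multiprocessing.get_context("fork")  # Unix only
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # the parent keeps only the reading end
    try:
        status, payload = parent_conn.recv()
    except EOFError:
        status, payload = "error", "child exited without a result"
    finally:
        parent_conn.close()
    proc.join()
    if status != "ok":
        raise RuntimeError(payload)
    return payload
```

The cost is one fork per call plus pickling the result, which is small compared with encoding a page.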

jwilk commented 10 years ago

I've implemented a work-around in d9359cf0203e2ff1a117c74b0d15620385bb713a.

jwilk commented 10 years ago

I've just released 0.2.8, which contains the work-around.