jwilk / didjvu

DjVu encoder with foreground/background separation
https://jwilk.net/software/didjvu
GNU General Public License v2.0

Background picture contains a lot of fuzz behind foreground mask #18

Open rmast opened 3 years ago

rmast commented 3 years ago

One of the tricks the commercial DjVu encoder uses to keep files small is to take the foreground mask into account when coding the background: the parts of the background picture that are hidden behind the mask are filled with whatever costs the least space.

However, when I encode a page with text and a logo that mostly end up in the foreground mask, I see a lot of fuzz in the background picture behind or around the text. It would not be missed if it were gone, but it takes more space than needed. When I hide the background by selecting the foreground view in DjView4, I don't miss any relevant details.

The reproduction steps are the same as in https://github.com/jwilk/didjvu/issues/16 (ignore the problem in the bottom line of the picture).

I encoded the attached ScanTailor output image with the standard didjvu (master from jwilk), Python 2.7 and http://nl.archive.ubuntu.com/ubuntu/pool/universe/g/gamera/python-gamera_3.4.2+git20160808.1725654-2_amd64.deb on Mint 20.2, using this command:

./didjvu encode ~/scantailorin/out/outputbase2-000-raar-effect-onderste-regel-didjvu\ zonder\ tekst.tif -o jaarverslagraar.djvu

Input: outputbase2-000-raar-effect-onderste-regel-didjvu zonder tekst.zip

Result: jaarverslagraar.zip

jsbien commented 3 years ago

If I understand correctly, the commercial separation of background and foreground was covered by some patents. I checked some time ago, and it seems all of the patents have expired except one or two that are not relevant to this problem. So you can look up the patents and see how it is done. Unfortunately, the explicit list of patents is no longer available. I can say more about it if anybody is interested.

rmast commented 3 years ago

> Unfortunately, the explicit list of patents is no longer available. I can say more about it if anybody is interested.

Doesn't the Internet Archive's Wayback Machine have a copy? Wouldn't it be handy to collect this knowledge somewhere?

jsbien commented 3 years ago

Please have a look at https://sourceforge.net/p/djvu/discussion/103285/thread/381189861c/. I will post my notes and the list there in a few days. The list used to be at http://djvu.org/forum/phpbb/viewtopic.php?t=727. The right place for it is, in my opinion, the Wikipedia DjVu entry.

rmast commented 3 years ago

This was the content of that page in the Wayback Machine snapshot 20101122094547 (22 November 2010), via http://web.archive.org/web/20101122094547/http://djvu.org/forum/phpbb/viewtopic.php?t=727

erd | Joined: 17 Sep 2008 | Posts: 28 | Posted: Tue Dec 08, 2009 7:27 pm | Post subject: Patents

Patent number: 5900953
Title: Method and apparatus for extracting a foreground image and a background image from a color document image
Inventors: Bottou; Leon (Highlands, NJ), LeCun; Yann Andre (Lincroft, NJ)
Assignee: AT&T Corp (Middletown, NJ)
Filing date: Jun 17, 1997
Issue date: May 4, 1999
Google patents: http://www.google.com/patents/about?id=j2sWAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F5900953

Patent number: 6058214
Title: Compression of partially masked still images
Inventors: Bottou; Leon (Monmouth, NJ), Pigeon; Steven (Blainville, CA)
Assignee: AT&T Corp. (New York, NY)
Filing date: Jan 19, 1999
Issue date: May 2, 2000
Google patents: http://www.google.com/patents/about?id=5j8EAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6058214

Patent number: 6188334
Title: Z-coder: fast adaptive binary arithmetic coder
Inventors: Bengio; Yoshua (Montreal, CA), Bottou; Leon (Highlands, NJ), Howard; Paul G. (Morganville, NJ)
Assignee: AT&T Corp. (New York, NY)
Filing date: May 5, 2000
Issue date: Feb 13, 2001
Google patents: http://www.google.com/patents/about?id=PuEGAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6188334

Patent number: 6281817
Title: Z-coder: a fast adaptive binary arithmetic coder
Inventors: Bengio; Yoshua (Montreal, CA), Bottou; Leon (Highlands, NJ), Howard; Paul G. (Morganville, NJ)
Assignee: AT&T Corp. (New York, NY)
Filing date: Feb 28, 2001
Issue date: Aug 28, 2001
Google patents: http://www.google.com/patents/about?id=aD4IAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6281817

Patent number: 6343154
Title: Compression of partially-masked image data
Inventors: Bottou; Leon (Monmouth, NJ), Pigeon; Steven (Blainville, CA)
Assignee: AT&T Corp. (New York, NY)
Filing date: Mar 24, 2000
Issue date: Jan 29, 2002
Google patents: http://www.google.com/patents/about?id=Ox0LAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6343154

Patent number: 6476740
Title: Z-coder: a fast adaptive binary arithmetic coder
Inventors: Bengio; Yoshua (Montreal, CA), Bottou; Leon (Highlands, NJ), Howard; Paul G. (Morganville, NJ)
Assignee: AT&T Corp. (New York, NY)
Filing date: December 14, 2001
Issue date: November 5, 2002
Google patents: http://www.google.com/patents/about?id=A0ELAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6476740

Patent number: 6587588
Title: Progressive image decoder for wavelet encoded images in compressed files and method of operation
Inventors: Bottou; Leon (Highlands, NJ), Howard; Paul Glor (Morganville, NJ)
Assignee: AT&T Corp. (New York, NY)
Filing date: Dec 16, 1999
Issue date: Jul 1, 2003
Google patents: http://www.google.com/patents/about?id=ANQOAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6587588

Patent number: 6728411
Title: Compression of partially-masked image data
Inventors: Bottou; Leon (Monmouth, NJ), Pigeon; Steven (Blainville, CA)
Assignee: AT&T Corp. (New York, NY)
Filing date: Sep 7, 2001
Issue date: Apr 27, 2004
Google patents: http://www.google.com/patents/about?id=tEASAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6728411

Patent number: 6901169
Title: Method and system for classifying image elements
Inventors: Bottou; Leon (Highlands, NJ), Haffner; Patrick Guy (Atlantic Highlands, NJ)
Assignee: AT & T Corp. (New York, NY)
Filing date: Jan 24, 2002
Issue date: May 31, 2005
Google patents: http://www.google.com/patents/about?id=2NcVAAAAEBAJ
USPTO: http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN%2F6901169

jsbien commented 3 years ago

I've checked and updated URLs, some were obsolete.

rmast commented 3 years ago

The new Google Patents links for those patents look like this: https://patents.google.com/patent/US5900953
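For anyone who wants to regenerate the links for the whole list, here is a minimal Python sketch. It assumes that every patent in the archived post follows the same URL pattern as the US5900953 example; only that one link appears in this thread, so the others are untested. The helper name `google_patents_url` is made up for illustration.

```python
def google_patents_url(patent_number, country="US"):
    """Build a Google Patents URL from a patent number.

    Assumption: the URL pattern is the one seen in the single example
    from this thread (https://patents.google.com/patent/US5900953).
    """
    # Keep digits only, so inputs such as "6,058,214" also work.
    digits = "".join(ch for ch in str(patent_number) if ch.isdigit())
    if not digits:
        raise ValueError("no digits found in patent number: %r" % (patent_number,))
    return "https://patents.google.com/patent/{}{}".format(country, digits)


# Patent numbers copied from the archived forum post quoted above.
PATENT_NUMBERS = [
    5900953, 6058214, 6188334, 6281817, 6343154,
    6476740, 6587588, 6728411, 6901169,
]

for number in PATENT_NUMBERS:
    print(google_patents_url(number))
```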

rmast commented 3 years ago

> I've checked and updated URLs, some were obsolete.

Where did you update them?

jsbien commented 3 years ago

In my note, which I'm unable to find quickly (or at all...). Unfortunately, recoll has some problems on my computer, but I'm still trying. In the meantime I've uploaded the patent documents (for a limited time only) to https://drive.google.com/drive/folders/0ByUlY1K26_lrU2lBal9OZDA1Tlk?resourcekey=0-_Y33s4xxg5_YCmGAIgLehw&usp=sharing.

rmast commented 3 years ago

I found that this issue comes directly from DjVuLibre's djvumake, which is called with an Sjbz mask and a PPM image that the mask splits into foreground and background. So this issue should be forwarded to djvumake or worked around. Would c44 with a mask perform just as poorly as djvumake?

I would expect the encoder to first remove the foreground from the image, including the partially covered pixels around it, and then fill the resulting gaps by interpolating from the remaining surroundings. I wonder what the patents listed above say about this, so I'll look into it. Would the commercial djvumake or c44 perform better?
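The gap-filling idea can be sketched in a few lines of Python. This is only an illustration of the general approach under simplifying assumptions (grayscale pixels, a plain 4-neighbour average); it is not the method from the patents, and it is not what djvumake or c44 actually do. The function name `fill_masked` is hypothetical.

```python
def fill_masked(pixels, mask):
    """Fill pixels hidden by the mask with averages of known neighbours.

    pixels: list of rows of grayscale values (0-255)
    mask:   list of rows of booleans, True where the foreground covers the pixel
    Returns a new image (floats) in which masked pixels are smoothly filled in.
    """
    height, width = len(pixels), len(pixels[0])
    out = [[float(value) for value in row] for row in pixels]
    known = [[not covered for covered in row] for row in mask]
    if not any(any(row) for row in known):
        return out  # nothing visible to propagate from
    # Grow the known region inwards, one ring of pixels per pass.
    while not all(all(row) for row in known):
        updates = []
        for y in range(height):
            for x in range(width):
                if known[y][x]:
                    continue
                neighbours = [
                    out[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width and known[ny][nx]
                ]
                if neighbours:
                    updates.append((y, x, sum(neighbours) / len(neighbours)))
        # Apply after the scan so one pass only uses values known before it.
        for y, x, value in updates:
            out[y][x] = value
            known[y][x] = True
    return out


# A white page with one black "text" pixel that the mask covers.
page = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
print(fill_masked(page, mask)[1][1])  # 255.0
```

A background filled this way has no sharp edges behind the text, which is the kind of content a wavelet coder can represent cheaply.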

jwilk commented 3 years ago

> Would c44 with a mask perform just as poorly as djvumake?

You can try this yourself: if you use any of the --fg-… or --bg-… options, didjvu will use c44 for IW44 encoding (because djvumake doesn't have options for that).

rmast commented 3 years ago

> You can try this yourself: if you use any of the --fg-… or --bg-… options, didjvu will use c44 for IW44 encoding

@jwilk Thanks! --fg-slices 100 (the default value, but passing it routes the encoding through c44) gives a visually much cleaner background, almost white, although it still contains some smudges and the background picture is still large, with more quality slices than it needs. --bg-slices 0 goes further and shrinks the background image, which contains no useful content, to a minimal black-and-white image. With a background this visually clean, it would probably be possible to have a routine detect the lack of content and choose the minimal background automatically.
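Such a routine could be as simple as the following Python sketch. It is a hypothetical heuristic, not something didjvu implements: it looks only at the pixels that are not covered by the foreground mask and calls the background blank when almost all of them are close to the median "paper" value. The name `background_is_blank` and both thresholds are assumptions chosen for illustration.

```python
from statistics import median


def background_is_blank(pixels, mask, tolerance=16, max_outlier_ratio=0.01):
    """Guess whether the visible background carries any real content.

    pixels: list of rows of grayscale values (0-255)
    mask:   list of rows of booleans, True where the foreground covers the pixel
    tolerance and max_outlier_ratio are illustrative defaults, not tuned values.
    """
    visible = [
        value
        for row, mask_row in zip(pixels, mask)
        for value, covered in zip(row, mask_row)
        if not covered
    ]
    if not visible:
        return True  # everything is hidden behind the foreground
    paper = median(visible)
    outliers = sum(1 for value in visible if abs(value - paper) > tolerance)
    return outliers / len(visible) <= max_outlier_ratio


# Near-white paper with one dark pixel hidden by the mask: blank background.
text_page = [[250, 0, 250], [248, 252, 250]]
text_mask = [[False, True, False], [False, False, False]]
print(background_is_blank(text_page, text_mask))  # True
```

When the check returns True, the encoder could be called with --bg-slices 0 as described above; otherwise the normal background settings would be kept.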