sbs20 / scanservjs

SANE scanner nodejs web ui
https://sbs20.github.io/scanservjs/
GNU General Public License v2.0

ADF scan - HP officejet - byte count error - no output file #406

Open hooker900 opened 2 years ago

hooker900 commented 2 years ago

Hi, I use an HP Officejet 2620 MFP scanner. Flatbed scanning works fine, but scanning from the ADF ends with an error and no output file.

  1. After the scan is started, the scanner scans all pages.
  2. After all pages have been scanned, an error message appears in red on the screen and is written to the daemon.log file:

server.js[21456]: DEBUG (Process): /usr/bin/scanimage -d 'hpaio:/usb/Officejet_2620_series?serial=xxx' --mode 'Lineart' --source 'ADF' --resolution 200 -l 0 -t 0 -x 210 -y 297 --format 'tiff' --brightness 1000 --contrast 1000 --batch=data/temp/~tmp-scan-1-%04d.tif, undefined , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false}

server.js[21456]: DEBUG (Scan): Post processing: JPG | @:pipeline.high-quality
server.js[21456]: DEBUG (Process): /usr/bin/convert 'data/temp/~tmp-scan-1-0001.tif' -scale 844 -background '#808080' -extent '868x1194-0-0' 'data/preview/preview.tif', undefined , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false}
server.js[21456]: ERROR (Http): Error: /usr/bin/convert 'data/temp/~tmp-scan-1-0001.tif' -scale 844 -background '#808080' -extent '868x1194-0-0' 'data/preview/preview.tif' exited with code: 1, stderr: convert: Bogus "StripByteCounts" field, ignoring and calculating from imagelength. 'TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/985.
server.js[21456]: convert: Read error on strip 59; got 3933 bytes, expected 7659. 'TIFFFillStrip' @ error/tiff.c/TIFFErrors/606.
server.js[21456]: at ChildProcess.<anonymous> (/var/www/scanservjs/server/process.js:54:18)
server.js[21456]: at ChildProcess.emit (events.js:314:20)
server.js[21456]: at maybeClose (internal/child_process.js:1022:16)
server.js[21456]: at Socket.<anonymous> (internal/child_process.js:444:11)
server.js[21456]: at Socket.emit (events.js:314:20)
server.js[21456]: at Pipe.<anonymous> (net.js:676:12)

I'd appreciate some help or fixing. Thanks a lot.

P.S.: using scanservjs version V2.20.0, platform linux on raspi

sbs20 commented 2 years ago

Really odd. Looks very similar to this issue from years ago: https://alioth-lists.debian.net/pipermail/sane-devel/2015-February/033090.html

My suggestion would be to attempt to recreate what happens at the command line. Start with...

/usr/bin/scanimage -d 'hpaio:/usb/Officejet_2620_series?serial=xxx' --mode 'Lineart' --source 'ADF' --resolution 200 -l 0 -t 0 -x 210 -y 297 --format 'tiff' --brightness 1000 --contrast 1000 --batch=test-scan-1-%04d.tif

# then
/usr/bin/convert 'test-scan-1-0001.tif' -scale 844 -background '#808080' -extent '868x1194-0-0' 'test.tif'

That should reproduce the issue. Assuming it does then that's a good start. If you can create the initial TIF file with benign content you could share here then it might be possible to identify a fix or workaround.

hooker900 commented 2 years ago

Thanks for your help. I did exactly what you suggested:

After scanimage command:

Scanning infinity pages, incrementing by 1, numbering from 1
Scanning page 1
Scanned page 1. (scanner status = 5)
Scanning page 2
scanimage: sane_start: Document feeder out of documents
Batch terminated, 1 page scanned

resulting in the output file "test-scan-1-0001.tif.gz"

After convert command:

convert: Bogus "StripByteCounts" field, ignoring and calculating from imagelength. 'TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/985.
convert: Read error on strip 59; got 3933 bytes, expected 7659. 'TIFFFillStrip' @ error/tiff.c/TIFFErrors/606.

resulting in the output file "test.tif.gz"

Hmmm, does not look like an error of scanservjs but an incorrect tif file from scanimage?!?!

Question is: Since there's an output file at the end with the manual command steps approach even though there's an error message, is it possible to workaround/suppress this error in scanservjs?

Again, thanks a lot for helping!

sbs20 commented 2 years ago

Thanks for sharing. I've managed to get the same error with the files you provided. What I don't really understand is why it's happening.

I googled convert-im6.q16: Bogus "StripByteCounts" field and only got 5 results. The two main ones were:

Given that one of these results is for HP and on a Pi, I wonder if it's very specific to that. I don't know if you have a way of testing on a non-Pi device, but that might be interesting. It may also be worth checking whether there are any newer HP drivers. The imagemagick thread correctly suggests that this is really a problem with whatever is creating the image, which of course isn't exactly helpful to us.

The only other thing I can think of is to ignore the errors. After all - convert does actually manage to convert the file successfully - it just moans about it.

It's a bit of a faff to do but the software just executes a bunch of command lines. So you selected the high quality jpg options which runs this: https://github.com/sbs20/scanservjs/blob/73908006ec149728c88704b28d4ca44c7802e0ec/packages/server/src/config.js#L115

We need to ignore stderr (2>/dev/null) and return a zero error code (; (exit 0)).

So if you were to change that line to:

        'convert @- -quality 92 scan-%04d.jpg 2>/dev/null; (exit 0)'

Then it will just ignore the error. It's horrible and will suppress all errors but would probably work.
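To make the effect of that suffix concrete, here is a small stand-alone Python sketch. It is not part of scanservjs: the helper name run_shell and the stand-in command are assumptions for illustration, chosen so the script runs without a scanner or ImageMagick. It runs a deliberately failing command through sh, once plainly and once with the suffix appended.

```python
import subprocess


def run_shell(command: str) -> tuple:
    """Run a command line through sh and return (exit code, stderr text)."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.returncode, result.stderr


# Hypothetical stand-in for a convert call that complains on stderr
# and exits with a non-zero code.
FAILING = "{ echo 'Bogus StripByteCounts field' >&2; false; }"

# The suffix from the comment above: discard stderr, then force exit status 0.
SUPPRESS = " 2>/dev/null; (exit 0)"

plain_code, plain_err = run_shell(FAILING)
quiet_code, quiet_err = run_shell(FAILING + SUPPRESS)

print("plain:", plain_code, repr(plain_err))
print("quiet:", quiet_code, repr(quiet_err))
```

The second call reports success and an empty stderr even though the command itself failed, which is exactly why this approach also hides every genuine error.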

In order to override the default pipelines you need to add a custom config e.g. this and change all the pipelines you want to use.

That may be enough for you - if not, then I will keep this issue open and consider a slightly less awful way of dealing with it.

hooker900 commented 2 years ago

Thanks again for your help. Actually I tried the same on an x86 Linux Mint machine and the problem is still the same.

I tried your proposal, though with limited success. I figured out that some tags in the TIFF file are wrong. "convert" tries to repair the bogus StripByteCounts tag by recalculating it from the ImageLength tag, but that doesn't help because ImageLength is wrong as well. The result is a TIFF file with no errors but the wrong length. It looks like the HP driver creates a buggy TIFF file when the ADF is used, which is really too bad!

In the end I calculate both bogus tags from the file length and the other tags and correct them. Luckily the TIFF file coming out of the scanner is a non-packed (uncompressed) single-strip TIFF image. Since I couldn't do it with the tiffset tool (it looks like the tool cannot change fundamental tags), I wrote my own Python script for that.
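The script itself was not posted in the thread, so the following is only a minimal sketch of the idea, not the author's code. It assumes a classic (non-BigTIFF), uncompressed, single-strip file whose pixel data runs from StripOffsets to the end of the file; the function name repair_single_strip_tiff is hypothetical. It recomputes ImageLength from the amount of pixel data actually present and then sets StripByteCounts to match.

```python
import struct

# TIFF tag numbers from the TIFF 6.0 specification.
IMAGE_WIDTH, IMAGE_LENGTH, BITS_PER_SAMPLE = 256, 257, 258
COMPRESSION, STRIP_OFFSETS, STRIP_BYTE_COUNTS = 259, 273, 279
TYPE_SIZES = {3: ("H", 2), 4: ("I", 4)}  # field types SHORT and LONG only


def repair_single_strip_tiff(data: bytes) -> bytes:
    """Return a copy of `data` with ImageLength and StripByteCounts recomputed.

    Assumption: uncompressed, single-strip TIFF whose pixel data is the
    last thing in the file. RowsPerStrip is left untouched.
    """
    buf = bytearray(data)
    endian = {b"II": "<", b"MM": ">"}[bytes(buf[:2])]
    magic, ifd_offset = struct.unpack_from(endian + "HI", buf, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack_from(endian + "H", buf, ifd_offset)
    entries = {}  # tag -> (type, count, position of the 4-byte value field)
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i
        tag, typ, count = struct.unpack_from(endian + "HHI", buf, pos)
        entries[tag] = (typ, count, pos + 8)

    def read(tag, default=None):
        if tag not in entries:
            return default
        typ, count, value_pos = entries[tag]
        fmt, size = TYPE_SIZES[typ]
        if count * size > 4:  # values are stored elsewhere in the file
            (value_pos,) = struct.unpack_from(endian + "I", buf, value_pos)
        return list(struct.unpack_from(endian + fmt * count, buf, value_pos))

    def write(tag, value):
        typ, count, value_pos = entries[tag]
        if count != 1:
            raise ValueError(f"tag {tag} does not hold a single value")
        struct.pack_into(endian + TYPE_SIZES[typ][0], buf, value_pos, value)

    if read(COMPRESSION, [1]) != [1] or len(read(STRIP_OFFSETS)) != 1:
        raise ValueError("only uncompressed single-strip files are handled")

    # Row size in bytes, rounded up to a whole byte as uncompressed TIFF requires.
    bits_per_pixel = sum(read(BITS_PER_SAMPLE, [1]))
    bytes_per_row = (read(IMAGE_WIDTH)[0] * bits_per_pixel + 7) // 8
    rows = (len(buf) - read(STRIP_OFFSETS)[0]) // bytes_per_row

    write(IMAGE_LENGTH, rows)
    write(STRIP_BYTE_COUNTS, rows * bytes_per_row)
    return bytes(buf)
```

On a real scan one would read the file as bytes, pass it through the function, and write the result to a new path, keeping the original until the output has been checked. Files that store the image data before the IFD, or that use more than one strip, would need a different calculation.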

I know it's not a good approach, but I implemented the script here:

https://github.com/sbs20/scanservjs/blob/73908006ec149728c88704b28d4ca44c7802e0ec/packages/server/src/scanimage.js#L81-L82

I changed that to:

      const pattern = `${Config.tempDirectory}/${Constants.TEMP_FILESTEM}-${request.index}-%04d.tifx`;
      cmdBuilder.arg(`--batch=${pattern} && ls ${Config.tempDirectory}/${Constants.TEMP_FILESTEM}-${request.index}-*.tifx | ${Config.python} ./server/scanTIFFrepair.py`);

I renamed the extension of the temporary scan files from .tif to .tifx to avoid a collision with:

https://github.com/sbs20/scanservjs/blob/73908006ec149728c88704b28d4ca44c7802e0ec/packages/server/src/scan-controller.js#L73

The modification works fine for me, even though I messed around a bit in the source code and need to take care to carry the modifications over after an update of "scanservjs". Moreover, it's an effort to manually change the TIFF tags. So any better solutions are welcome!

BTW: not sure if the preview usually is updated with each scanned page during an ADF scan?! Now it doesn't (any more). It does not even show the preview picture at the end of the scan. Have I broken that with the change of extension from .tif to .tifx???

Thanks a lot.

bemoons commented 2 years ago

I have exactly the same issue. Could you share the Python script? Thanks a lot!

xenxoblanco commented 2 years ago

I have the same bug with the bogus tags, and it seems the problem is the hplip SANE driver. I have just reported the bug, and it could help if other people comment on this bug here.

I couldn't find a way to fix these bogus tags in the TIFF images, but I found another workaround. When the ADF is used, scanimage now outputs JPEG, and I added an extra step that converts the images back to TIFF so that the rest of the post-processing still works.

The changes are:

https://github.com/sbs20/scanservjs/blob/73908006ec149728c88704b28d4ca44c7802e0ec/packages/server/src/scan-controller.js#L63-L66

  async scan() {
    log.debug('Scanning');
    // Workaround for the broken TIFF output when scanning from the ADF
    if ([Constants.BATCH_AUTO, Constants.BATCH_COLLATE_STANDARD, Constants.BATCH_COLLATE_REVERSE]
      .includes(this.request.batch)) {
      // Scan in JPEG format, then convert the pages to TIFF
      this.request.params.format = 'jpeg';
      await Process.spawn(Scanimage.scan(this.request));
      const files = (await this.listFiles()).filter(f => f.extension === '.jpg');
      const stdin = files.map(f => f.name).join('\n');
      const cmd = 'convert @- -format tif f-%04d.tif';
      await Process.spawn(cmd, stdin, { cwd: Config.tempDirectory });
    } else {
      await Process.spawn(Scanimage.scan(this.request));
    }
  }

https://github.com/sbs20/scanservjs/blob/73908006ec149728c88704b28d4ca44c7802e0ec/packages/server/src/scanimage.js#L79-L83

    if ([Constants.BATCH_AUTO, Constants.BATCH_COLLATE_STANDARD, Constants.BATCH_COLLATE_REVERSE]
      .includes(request.batch)) {
      // Workaround for the ADF: use a .jpg file pattern when scanning as JPEG
      const extension = params.format === 'jpeg' ? 'jpg' : 'tif';
      const pattern = `${Config.tempDirectory}/${Constants.TEMP_FILESTEM}-${request.index}-%04d.${extension}`;
      cmdBuilder.arg(`--batch=${pattern}`);
    } else {
      cmdBuilder.arg(`> ${Scanimage.filename(request.index)}`);
    }

I hope this helps other people.

RomchikL commented 2 years ago

I have the same issue with an HP LaserJet Pro M1132 (used as a network scanner).

But I noticed that the error occurs only if the resolution is 300 or higher. At a resolution of 200, everything is fine.

@xenxoblanco Is it enough to replace your lines in the working container (via bash)? I tried, but failed.

Pfuenzle commented 1 year ago

I got the same problem with an HP OfficeJet 6600 when using the document feeder.

I also tried the patches by @xenxoblanco (by modifying the source and building my own container), but I could not get them to work, as I still get the same error. It seems your code only checks for double sided scans? (Looks like this to me because of the constants check).

When doing a double sided scan (or after removing the checks and doing a normal scan) I get the following error:

Scanning page 1
Application transferred too few scanlines

    at ChildProcess.<anonymous> (/app/server/process.js:65:18)
    at ChildProcess.emit (node:events:513:28)
    at maybeClose (node:internal/child_process:1100:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:304:5)

But the code base has also changed a bit since your comment, so maybe I just failed to implement it correctly.

I also tried the 2>/dev/null method from sbs20, but I could not get it to work with the OCR scan format, as I still received the byte count error.

I tried it with the following formats:

convert @- -quality 92 tmp-%d.jpg 2>/dev/null; (exit 0) && ls tmp-*.jpg

convert @- -quality 92 tmp-%d.jpg 2>/dev/null ; ls tmp-*.jpg

convert @- -quality 92 tmp-%d.jpg ; ls tmp-*.jpg

And none of it worked, although I am a bit confused, as from my limited bash understanding at least the last one should have worked, as ; should not check for return code.
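For reference, the exit status of each of those forms can be checked in isolation. The sketch below is a stand-alone Python script, not scanservjs code; `false` stands in for a failing convert and `echo listed` for the ls step, both assumptions chosen so the script runs without a scanner or ImageMagick. The helper name exit_status is hypothetical.

```python
import subprocess
import tempfile


def exit_status(command: str, cwd: str) -> int:
    """Run a command line through sh in `cwd` and return its exit status."""
    return subprocess.run(
        ["sh", "-c", command],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode


with tempfile.TemporaryDirectory() as workdir:
    statuses = {
        # `;` ends the first list, so `(exit 0) && echo` is evaluated on its own.
        "suppressed-and": exit_status("false 2>/dev/null; (exit 0) && echo listed", workdir),
        # Without `(exit 0)` the status is still that of the last command.
        "suppressed": exit_status("false 2>/dev/null ; echo listed", workdir),
        "plain": exit_status("false ; echo listed", workdir),
        # The last command decides: ls fails when the glob matches no file.
        "empty-glob": exit_status("false ; ls tmp-*.jpg", workdir),
    }

print(statuses)
```

In the first three cases the failing step does not affect the final status, which matches the expectation that `;` does not check the return code. A byte count error that survives these changes may therefore come from a different command, and the last case shows that the listing step can still fail by itself when no output file was written.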

Does anyone have this problem fixed or bypassed on the current version of the project?

sbs20 commented 1 year ago

I think Application transferred too few scanlines is a different issue. You may find it easier to debug by running scanimage manually, then convert manually too. If you can fix it in bash then it'll be easier in the pipeline.

Pfuenzle commented 1 year ago

I tried it out manually and it worked fine. I realized I also had to use this bypass on the preview convert command, after doing this at least the scan part worked, but convert is behaving a bit weirdly.

I used those two commands:

scanimage --device-name=hpaio:/usb/Officejet_6600 --format=tiff --batch="tmp-%d.tif" --batch --source ADF --resolution 300
convert *.tif -quality 50 tmp-%d.jpg 2>/dev/null ; (exit 0)

When scanning two test pages, it creates two .tif files (one per page), which are then converted to two .jpg files, both of which are correct.

When doing the scan in the container (regardless of which method, PDF or JPEG), two .tif files successfully appear in the temp directory (~tmp-000001.tif, ~tmp-000002.tif). Then the convert command gets called, but only the first page gets converted (tmp-0001.jpg). From this single .jpg, a PDF is successfully created, but the second page (and every page after it) gets ignored, even though there is a completely valid .tif file in the directory.

I tried out the following methods in config.js, each of which results in the problem above:

convert @- -quality 90 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg 
(which is the one nearest to your default and should pipe the files into the command, right?)

convert *.tif -quality 50 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg
(Which should pick up every .tif file, but still only gets the first one)

Do you think this behavior has something to do with the error bypass used? I don't have another printer to test it on sadly, but using the command in bash works fine, while using it in your container yields this weird result, so I am a bit confused. The only things changed in the container are the config.js lines and the convert functions in API.js to make the preview not crash the process. If you think this is due to something else, I can also create a new issue.

Also, happy new year to you :)

Edit: When running the scan image command manually, it correctly says that two pages are being scanned and then exits. When using the manual batch scan in scanserv, it pulls both pages instead of only the first through the feeder, then pauses and asks me if I want to scan another page. The popup also only shows me the first page, so it seems to me that scanserv interprets the scan image result as one page, although both pages get scanned, which may be why when using @- only the first page gets piped to the command even though both files are on disk.

sbs20 commented 1 year ago

"When using the manual batch scan in scanserv, it pulls both pages instead of only the first through the feeder, then pauses and asks me if I want to scan another page."

Try using batch mode of auto instead. This is one of the less satisfactory parts of the app.

Pfuenzle commented 1 year ago

I also tried using this; that is what the first part of my post above was about.

When using the auto batch mode, all x pages get fed through the document feeder and x .tif files get created, like it should be.

Then the convert command gets called, but only the first .tif file (~tmp-0001.tif) gets converted to a .jpg (tmp-0001.jpg). All other .tif files are somehow ignored. The single converted .jpg file is then successfully converted to a single-page PDF.

So everything is working fine, except the .tif to .jpg conversion.

Here is the log when scanning 2 pages in the document feeder using the batch auto mode:

(Here I set config.js to use the command convert *.tif -quality 100 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg to ignore the errors and pick up every .tif file, but only the first one is picked up.

I also tried convert @- -quality 90 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg, which results in the same behaviour.) To make sure it is not due to the error redirect, I also tried convert @- -quality 90 tmp-%d.jpg ; ls tmp-*.jpg and got the same result.

Note: The commands above worked completely fine when run on their own in bash.

[2023-01-02T15:25:28.562Z] DEBUG (Http): request:  {
  method: 'POST',
  path: '/scan',
  body: {
    version: '2.24.0',
    params: {
      deviceId: 'hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN',
      resolution: 300,
      width: 210,
      height: 297,
      left: 0.5,
      top: 0,
      mode: 'Lineart',
      source: 'ADF',
      brightness: 1000,
      contrast: 1000
    },
    filters: [],
    pipeline: '@:pipeline.ocr | PDF (JPG | @:pipeline.high-quality)',
    batch: 'auto',
    index: 1
  }
}
[2023-01-02T15:25:28.586Z] DEBUG (Scan): Scanning
[2023-01-02T15:25:28.591Z] DEBUG (Scanimage): {"context":{"devices":[{"id":"hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN","name":"hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN","features":{"--mode":{"text":"--mode Lineart|Gray|Color [Lineart]","name":"--mode","default":"Lineart","parameters":"Lineart|Gray|Color","options":["Lineart","Gray","Color"]},"--resolution":{"text":"--resolution 75|100|200|300|600dpi [75]","name":"--resolution","default":75,"parameters":"75|100|200|300|600dpi","options":[75,100,200,300,600]},"--source":{"text":"--source Flatbed|ADF [Flatbed]","name":"--source","default":"Flatbed","parameters":"Flatbed|ADF","options":["Flatbed","ADF"]},"--brightness":{"text":"--brightness 0..2000 [1000]","name":"--brightness","default":1000,"parameters":"0..2000","limits":[0,2000],"interval":1},"--contrast":{"text":"--contrast 0..2000 [1000]","name":"--contrast","default":1000,"parameters":"0..2000","limits":[0,2000],"interval":1},"--compression":{"text":"--compression None|JPEG [None]","name":"--compression","default":"None","parameters":"None|JPEG"},"-l":{"text":"-l 0..215.9mm [0]","name":"-l","default":0,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-t":{"text":"-t 0..297.011mm [0]","name":"-t","default":0,"parameters":"0..297.011mm","limits":[0,297],"interval":1},"-x":{"text":"-x 0..215.9mm [215.9]","name":"-x","default":215.9,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-y":{"text":"-y 0..297.011mm [297.011]","name":"-y","default":297,"parameters":"0..297.011mm","limits":[0,297],"interval":1}},"string":"\nAll options specific to device `hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN':\n  Scan mode:\n    --mode Lineart|Gray|Color [Lineart]\n        Selects the scan mode (e.g., lineart, monochrome, or color).\n    --resolution 75|100|200|300|600dpi [75]\n        Sets the resolution of the scanned image.\n    --source Flatbed|ADF [Flatbed]\n        Selects the scan source (such as a document-feeder).\n  Advanced:\n    
--brightness 0..2000 [1000]\n        Controls the brightness of the acquired image.\n    --contrast 0..2000 [1000]\n        Controls the contrast of the acquired image.\n    --compression None|JPEG [None]\n        Selects the scanner compression method for faster scans, possibly at\n        the expense of image quality.\n    --jpeg-quality 0..100 [inactive]\n        Sets the scanner JPEG compression factor. Larger numbers mean better\n        compression, and smaller numbers mean better image quality.\n  Geometry:\n    -l 0..215.9mm [0]\n        Top-left x position of scan area.\n    -t 0..297.011mm [0]\n        Top-left y position of scan area.\n    -x 0..215.9mm [215.9]\n        Width of scan-area.\n    -y 0..297.011mm [297.011]\n        Height of scan-area.\n\n"},{"id":"hpaio:/net/officejet_6600?ip=192.168.178.104&queue=false","name":"hpaio:/net/officejet_6600?ip=192.168.178.104&queue=false","features":{"--mode":{"text":"--mode Lineart|Gray|Color [Lineart]","name":"--mode","default":"Lineart","parameters":"Lineart|Gray|Color","options":["Lineart","Gray","Color"]},"--resolution":{"text":"--resolution 75|100|200|300|600dpi [75]","name":"--resolution","default":75,"parameters":"75|100|200|300|600dpi","options":[75,100,200,300,600]},"--source":{"text":"--source Flatbed|ADF [Flatbed]","name":"--source","default":"Flatbed","parameters":"Flatbed|ADF","options":["Flatbed","ADF"]},"--brightness":{"text":"--brightness 0..2000 [1000]","name":"--brightness","default":1000,"parameters":"0..2000","limits":[0,2000],"interval":1},"--contrast":{"text":"--contrast 0..2000 [1000]","name":"--contrast","default":1000,"parameters":"0..2000","limits":[0,2000],"interval":1},"--compression":{"text":"--compression None|JPEG [None]","name":"--compression","default":"None","parameters":"None|JPEG"},"-l":{"text":"-l 0..215.9mm [0]","name":"-l","default":0,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-t":{"text":"-t 0..297.011mm 
[0]","name":"-t","default":0,"parameters":"0..297.011mm","limits":[0,297],"interval":1},"-x":{"text":"-x 0..215.9mm [215.9]","name":"-x","default":215.9,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-y":{"text":"-y 0..297.011mm [297.011]","name":"-y","default":297,"parameters":"0..297.011mm","limits":[0,297],"interval":1}},"string":"\nAll options specific to device `hpaio:/net/officejet_6600?ip=192.168.178.104&queue=false':\n  Scan mode:\n    --mode Lineart|Gray|Color [Lineart]\n        Selects the scan mode (e.g., lineart, monochrome, or color).\n    --resolution 75|100|200|300|600dpi [75]\n        Sets the resolution of the scanned image.\n    --source Flatbed|ADF [Flatbed]\n        Selects the scan source (such as a document-feeder).\n  Advanced:\n    --brightness 0..2000 [1000]\n        Controls the brightness of the acquired image.\n    --contrast 0..2000 [1000]\n        Controls the contrast of the acquired image.\n    --compression None|JPEG [None]\n        Selects the scanner compression method for faster scans, possibly at\n        the expense of image quality.\n    --jpeg-quality 0..100 [inactive]\n        Sets the scanner JPEG compression factor. 
Larger numbers mean better\n        compression, and smaller numbers mean better image quality.\n  Geometry:\n    -l 0..215.9mm [0]\n        Top-left x position of scan area.\n    -t 0..297.011mm [0]\n        Top-left y position of scan area.\n    -x 0..215.9mm [215.9]\n        Width of scan-area.\n    -y 0..297.011mm [297.011]\n        Height of scan-area.\n\n"},{"id":"escl:https://192.168.178.192:443","name":"escl:https://192.168.178.192:443","features":{"--mode":{"text":"--mode Lineart|Gray|Color [Lineart]","name":"--mode","default":"Lineart","parameters":"Lineart|Gray|Color","options":["Lineart","Gray","Color"]},"--resolution":{"text":"--resolution 100|200|300|600|1200dpi [100]","name":"--resolution","default":100,"parameters":"100|200|300|600|1200dpi","options":[100,200,300,600,1200]},"--preview":{"text":"--preview[=(yes|no)] [no]","name":"--preview","default":"no","parameters":"[=(yes|no)]"},"--preview-in-gray":{"text":"--preview-in-gray[=(yes|no)] [no]","name":"--preview-in-gray","default":"no","parameters":"[=(yes|no)]"},"-l":{"text":"-l 0..207.772mm [0]","name":"-l","default":0,"parameters":"0..207.772mm","limits":[0,207.7],"interval":1},"-t":{"text":"-t 0..289.052mm [0]","name":"-t","default":0,"parameters":"0..289.052mm","limits":[0,289],"interval":1},"-x":{"text":"-x 8.12799..215.9mm [215.9]","name":"-x","default":215.9,"parameters":"8.12799..215.9mm","limits":[8.1,215.9],"interval":1},"-y":{"text":"-y 8.12799..297.18mm [297.18]","name":"-y","default":297.1,"parameters":"8.12799..297.18mm","limits":[8.1,297.1],"interval":1},"--source":{"text":"--source Flatbed [Flatbed]","name":"--source","default":"Flatbed","parameters":"Flatbed","options":["Flatbed"]}},"string":"\nAll options specific to device `escl:https://192.168.178.192:443':\n  Scan mode:\n    --mode Lineart|Gray|Color [Lineart]\n        Selects the scan mode (e.g., lineart, monochrome, or color).\n    --resolution 100|200|300|600|1200dpi [100]\n        Sets the resolution of the scanned image.\n 
   --preview[=(yes|no)] [no]\n        Request a preview-quality scan.\n    --preview-in-gray[=(yes|no)] [no]\n        Request that all previews are done in monochrome mode.  On a\n        three-pass scanner this cuts down the number of passes to one and on a\n        one-pass scanner, it reduces the memory requirements and scan-time of\n        the preview.\n  Geometry:\n    -l 0..207.772mm [0]\n        Top-left x position of scan area.\n    -t 0..289.052mm [0]\n        Top-left y position of scan area.\n    -x 8.12799..215.9mm [215.9]\n        Width of scan-area.\n    -y 8.12799..297.18mm [297.18]\n        Height of scan-area.\n    --source Flatbed [Flatbed]\n        Selects the scan source (such as a document-feeder).\n\n"},{"id":"airscan:e0:EPSON ET-2850 Series","name":"airscan:e0:EPSON ET-2850 Series","features":{"--resolution":{"text":"--resolution 100|200|300|600|1200dpi [300]","name":"--resolution","default":300,"parameters":"100|200|300|600|1200dpi","options":[100,200,300,600,1200]},"--mode":{"text":"--mode Color|Gray [Color]","name":"--mode","default":"Color","parameters":"Color|Gray","options":["Color","Gray"]},"--source":{"text":"--source Flatbed [Flatbed]","name":"--source","default":"Flatbed","parameters":"Flatbed","options":["Flatbed"]},"-l":{"text":"-l 0..215.9mm [0]","name":"-l","default":0,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-t":{"text":"-t 0..297.18mm [0]","name":"-t","default":0,"parameters":"0..297.18mm","limits":[0,297.1],"interval":1},"-x":{"text":"-x 0..215.9mm [215.9]","name":"-x","default":215.9,"parameters":"0..215.9mm","limits":[0,215.9],"interval":1},"-y":{"text":"-y 0..297.18mm [297.18]","name":"-y","default":297.1,"parameters":"0..297.18mm","limits":[0,297.1],"interval":1},"--brightness":{"text":"--brightness -100..100% (in steps of 1) [0]","name":"--brightness","default":0,"parameters":"-100..100% (in steps of 1)","limits":[-100,100],"interval":1},"--contrast":{"text":"--contrast -100..100% (in steps of 1) 
[0]","name":"--contrast","default":0,"parameters":"-100..100% (in steps of 1)","limits":[-100,100],"interval":1},"--shadow":{"text":"--shadow 0..100% (in steps of 1) [0]","name":"--shadow","default":"0","parameters":"0..100% (in steps of 1)"},"--highlight":{"text":"--highlight 0..100% (in steps of 1) [100]","name":"--highlight","default":"100","parameters":"0..100% (in steps of 1)"},"--analog-gamma":{"text":"--analog-gamma 0.0999908..4 [1]","name":"--analog-gamma","default":"1","parameters":"0.0999908..4"},"--negative":{"text":"--negative[=(yes|no)] [no]","name":"--negative","default":"no","parameters":"[=(yes|no)]"}},"string":"\nAll options specific to device `airscan:e0:EPSON ET-2850 Series':\n  Standard:\n    --resolution 100|200|300|600|1200dpi [300]\n        Sets the resolution of the scanned image.\n    --mode Color|Gray [Color]\n        Selects the scan mode (e.g., lineart, monochrome, or color).\n    --source Flatbed [Flatbed]\n        Selects the scan source (such as a document-feeder).\n  Geometry:\n    -l 0..215.9mm [0]\n        Top-left x position of scan area.\n    -t 0..297.18mm [0]\n        Top-left y position of scan area.\n    -x 0..215.9mm [215.9]\n        Width of scan-area.\n    -y 0..297.18mm [297.18]\n        Height of scan-area.\n  Enhancement:\n    --brightness -100..100% (in steps of 1) [0]\n        Controls the brightness of the acquired image.\n    --contrast -100..100% (in steps of 1) [0]\n        Controls the contrast of the acquired image.\n    --shadow 0..100% (in steps of 1) [0]\n        Selects what radiance level should be considered \"black\".\n    --highlight 0..100% (in steps of 1) [100]\n        Selects what radiance level should be considered \"white\".\n    --analog-gamma 0.0999908..4 [1]\n        Analog gamma-correction\n    --negative[=(yes|no)] [no]\n        Swap black and white\n\n"}],"version":"2.24.0","diagnostics":[{"success":true,"message":"Found /usr/bin/scanimage"},{"success":true,"message":"Found 
/usr/bin/convert"}],"pipelines":[{"extension":"jpg","description":"JPG | @:pipeline.high-quality","commands":["convert @- -quality 100 scan-%04d.jpg 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"jpg","description":"JPG | @:pipeline.medium-quality","commands":["convert @- -quality 75 scan-%04d.jpg 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"jpg","description":"JPG | @:pipeline.low-quality","commands":["convert @- -quality 50 scan-%04d.jpg 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"png","description":"PNG","commands":["convert @- -quality 100 scan-%04d.png 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"tif","description":"TIF | @:pipeline.uncompressed","commands":["convert @- scan-0000.tif 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"tif","description":"TIF | @:pipeline.lzw-compressed","commands":["convert @- -compress lzw scan-0000.tif 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"pdf","description":"PDF (TIF | @:pipeline.uncompressed)","commands":["convert @- scan-0000.pdf 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"pdf","description":"PDF (TIF | @:pipeline.lzw-compressed)","commands":["convert @- -compress lzw tmp-%04d.tif 2>/dev/null ; (exit 0) && ls tmp-*.tif","convert @- scan-0000.pdf 2>/dev/null ; (exit 0)","ls scan-*.*"]},{"extension":"pdf","description":"PDF (JPG | @:pipeline.high-quality)","commands":["convert *.tif -quality 100 tmp.jpg ; ls tmp-*.jpg","convert tmp*.jpg scan-0000.pdf","ls scan-*.*"]},{"extension":"pdf","description":"PDF (JPG | @:pipeline.medium-quality)","commands":["convert *.tif -quality 75 tmp-%d.jpg 2>/dev/null ; (exit 0) ; ls tmp-*.jpg","convert @- scan-0000.pdf","ls scan-*.*"]},{"extension":"pdf","description":"PDF (JPG | @:pipeline.low-quality)","commands":["convert *.tif -quality 50 tmp.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg","convert tmp*.jpg scan-0000.pdf","ls scan-*.*"]},{"extension":"pdf","description":"@:pipeline.ocr | PDF (JPG | 
@:pipeline.high-quality)","commands":["convert *.tif -quality 100 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg","/usr/bin/tesseract -l deu -c stream_filelist=true - - pdf > scan-0001.pdf","ls scan-*.*"]},{"extension":"txt","description":"@:pipeline.ocr | @:pipeline.text-file","commands":["/usr/bin/tesseract -l deu -c stream_filelist=true - - txt > scan-0001.txt","ls scan-*.*"]}],"filters":[{"description":"filter.auto-level","params":"-auto-level"},{"description":"filter.threshold","params":"-channel RGB -threshold 80%"},{"description":"filter.blur","params":"-blur 1"}],"paperSizes":[{"name":"A3 (@:paper-size.portrait)","dimensions":{"x":297,"y":420}},{"name":"A4 (@:paper-size.portrait)","dimensions":{"x":210,"y":297}},{"name":"A5 (@:paper-size.portrait)","dimensions":{"x":148,"y":210}},{"name":"A5 (@:paper-size.landscape)","dimensions":{"x":210,"y":148}},{"name":"A6 (@:paper-size.portrait)","dimensions":{"x":105,"y":148}},{"name":"A6 (@:paper-size.landscape)","dimensions":{"x":148,"y":105}},{"name":"B3 (@:paper-size.portrait)","dimensions":{"x":353,"y":500}},{"name":"B4 (@:paper-size.portrait)","dimensions":{"x":250,"y":353}},{"name":"B5 (@:paper-size.portrait)","dimensions":{"x":176,"y":250}},{"name":"B5 (@:paper-size.landscape)","dimensions":{"x":250,"y":176}},{"name":"B6 (@:paper-size.portrait)","dimensions":{"x":125,"y":176}},{"name":"B6 (@:paper-size.landscape)","dimensions":{"x":176,"y":125}},{"name":"DIN D3 (@:paper-size.portrait)","dimensions":{"x":272,"y":385}},{"name":"DIN D4 (@:paper-size.portrait)","dimensions":{"x":192,"y":272}},{"name":"DIN D5 (@:paper-size.portrait)","dimensions":{"x":136,"y":192}},{"name":"DIN D5 (@:paper-size.landscape)","dimensions":{"x":192,"y":136}},{"name":"DIN D6 (@:paper-size.portrait)","dimensions":{"x":96,"y":136}},{"name":"DIN D6 (@:paper-size.landscape)","dimensions":{"x":136,"y":96}},{"name":"@:paper-size.letter (@:paper-size.portrait)","dimensions":{"x":216,"y":279}},{"name":"@:paper-size.legal 
(@:paper-size.portrait)","dimensions":{"x":216,"y":356}},{"name":"@:paper-size.tabloid (@:paper-size.portrait)","dimensions":{"x":279,"y":432}},{"name":"@:paper-size.ledger (@:paper-size.portrait)","dimensions":{"x":432,"y":279}},{"name":"@:paper-size.junior-legal (@:paper-size.portrait)","dimensions":{"x":127,"y":203}},{"name":"@:paper-size.half-letter (@:paper-size.portrait)","dimensions":{"x":140,"y":216}}],"batchModes":["none","manual","auto","auto-collate-standard"]},"params":{"deviceId":"hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN","resolution":300,"format":"tiff","isPreview":false,"top":0,"left":0.5,"width":210,"height":297,"mode":"Lineart","source":"ADF","brightness":1000},"filters":[],"pipeline":"@:pipeline.ocr | PDF (JPG | @:pipeline.high-quality)","batch":"auto","index":1}
[2023-01-02T15:25:28.594Z] DEBUG (Process): /usr/bin/scanimage -d 'hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN' --mode 'Lineart' --source 'ADF' --resolution 300 -l 0.5 -t 0 -x 210 -y 297 --format 'tiff' --brightness 1000 --batch=data/temp/~tmp-scan-1-%04d.tif,  undefined , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false}
[2023-01-02T15:26:19.892Z] DEBUG (Scan): Post processing: @:pipeline.ocr | PDF (JPG | @:pipeline.high-quality)
[2023-01-02T15:26:20.284Z] DEBUG (Process): /usr/bin/convert 'data/temp/~tmp-scan-1-0001.tif' -scale 844 -background '#808080' -extent '868x1194-2-0' 'data/preview/preview.tif' 2>/dev/null ; (exit 0),  undefined , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false}
[2023-01-02T15:26:23.267Z] DEBUG (Scan): Executing cmds: [
  'convert *.tif -quality 100 tmp-%d.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg',
  '/usr/bin/tesseract -l deu -c stream_filelist=true - - pdf > scan-0001.pdf',
  'ls scan-*.*'
]
[2023-01-02T15:26:23.271Z] DEBUG (Process): convert *.tif -quality 100 tmp-NaN.jpg 2>/dev/null ; (exit 0) && ls tmp-*.jpg,  , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false,"cwd":"data/temp"}
[2023-01-02T15:26:25.284Z] DEBUG (Process): /usr/bin/tesseract -l deu -c stream_filelist=true - - pdf > scan-0001.pdf,  <Buffer 74 6d 70 2d 30 2e 6a 70 67 0a> , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false,"cwd":"data/temp"}
[2023-01-02T15:27:18.538Z] DEBUG (Process): ls scan-*.*,  <Buffer > , {"encoding":"binary","shell":true,"maxBuffer":16384,"ignoreErrors":false,"cwd":"data/temp"}
[2023-01-02T15:27:18.696Z] DEBUG (Scan): Written data to: data/output/scan_2023-01-02 15.27.18.pdf
Pfuenzle commented 1 year ago

Alright, I tested the stdin of scanserv with some debug options, and it turns out both files are correctly sent there.

When running convert @- -quality 90 tmp-%d.jpg 2>/dev/null ; ls tmp-*.jpg on my host and piping 2 filenames there, both files will be converted.

When running the same in the container, only the first file gets converted, so it seems to have something to do with ImageMagick.

Host version: ImageMagick 6.9.10-23 Q16 arm 20190101
Container version: ImageMagick 6.9.11-60 Q16 arm 2021-01-25
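For anyone who wants to repeat the stdin check, here is a rough sketch of the idea: build the newline-separated file list the same way the filter code later in this thread does, then feed it to a stand-in child process that simply echoes its stdin. The file names are hypothetical, and the echo process is only a substitute for `convert @-`, so this shows what arrives on stdin, not what ImageMagick does with it.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical file names standing in for the scanned pages.
const files = [
  { name: '~tmp-scan-1-0001.tif' },
  { name: '~tmp-scan-1-0002.tif' },
];

// One file name per line, the form that is piped to `convert @-`.
const stdin = files.map((f) => `${f.name}\n`).join('');

// Stand-in for convert: a child Node process that copies stdin to stdout,
// so we can inspect exactly what the real command would receive.
const echo = spawnSync(
  process.execPath,
  ['-e', 'process.stdin.pipe(process.stdout)'],
  { input: stdin, encoding: 'utf8' }
);

const received = echo.stdout.split('\n').filter(Boolean);
console.log(received.length); // 2
```

If both names show up here but only one page is converted, the list itself is fine and the difference is in the command that consumes it.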

sbs20 commented 1 year ago

When running the same in the container, only the first file gets converted, so it seems to have something to do with ImageMagick.

This is annoying!! Thank you for digging deeper.

The real problem here is, I assume, the HP drivers producing bad TIF files. I wonder, does it also create bad pnm files?

scanimage --device-name=hpaio:/usb/Officejet_6600 --format=pnm --batch="tmp-%d.pnm" --batch --source ADF --resolution 300
convert *.pnm -quality 50 tmp-%d.jpg #2>/dev/null ; (exit 0)
Pfuenzle commented 1 year ago

I made a docker image yesterday that built ImageMagick 7 from source, but the same error persisted, so I just decided to run it bare metal, and now all pages get converted into a PDF.

The PNM files also seem to be corrupted somehow.

Here is the command output:

pi@raspberrypi:~/tmp $ sudo scanimage --device-name=hpaio:/usb/Officejet_6600?serial=CN2AN4KHFC05RN --format=pnm --batch="tmp-%d.pnm" --batch --source ADF --resolution 300
Scanning infinity pages, incrementing by 1, numbering from 1
Scanning page 1
Scanned page 1. (scanner status = 5)
Scanning page 2
Scanned page 2. (scanner status = 5)
Scanning page 3
scanimage: sane_start: Document feeder out of documents
Batch terminated, 2 pages scanned
pi@raspberrypi:~/tmp $ ls
out1.pnm  out2.pnm
pi@raspberrypi:~/tmp $ convert *.pnm -quality 50 tmp-%d.jpg
convert-im6.q16: insufficient image data in file `out1.pnm' @ error/pnm.c/ReadPNMImage/443.
convert-im6.q16: insufficient image data in file `out2.pnm' @ error/pnm.c/ReadPNMImage/443.
convert-im6.q16: no images defined `tmp-%d.jpg' @ error/convert.c/ConvertImageCommand/3258.

Somehow there is no image data, even though the files themselves are about 1-2 MB in size and have data in them. Converting them with an online converter fails as well.

With the .tif files, the .jpg files are created and valid (even though an error is thrown), but with PNM no .jpg files are created at all, so the driver is probably even worse for PNM than for TIF.
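One way to look into the "insufficient image data" message is to compare the pixel data that the PNM header promises with what the file actually contains. The sketch below handles only the binary PNM variants (P4, P5, P6); `checkPnm` is a made-up helper, and the sample buffer is synthetic, not data from the scanner.

```javascript
// Read the header of a binary PNM file (P4, P5 or P6) and compare the
// pixel data the header promises with what the buffer really contains.
function checkPnm(buf) {
  const isSpace = (byte) => byte === 0x20 || (byte >= 0x09 && byte <= 0x0d);
  const tokens = [];
  let pos = 0;
  // P4 (bitmap) has no maxval field, so its header is one token shorter.
  const wanted = () => (tokens[0] === 'P4' ? 3 : 4);
  while (tokens.length < wanted() && pos < buf.length) {
    if (buf[pos] === 0x23) {
      // '#' starts a comment that runs to the end of the line.
      while (pos < buf.length && buf[pos] !== 0x0a) pos++;
    } else if (isSpace(buf[pos])) {
      pos++;
    } else {
      const start = pos;
      while (pos < buf.length && !isSpace(buf[pos])) pos++;
      tokens.push(buf.toString('ascii', start, pos));
    }
  }
  pos++; // exactly one whitespace byte separates header and pixel data
  const [magic, width, height, maxval] = [tokens[0], ...tokens.slice(1).map(Number)];
  const bytesPerSample = maxval > 255 ? 2 : 1;
  const expected = {
    P4: Math.ceil(width / 8) * height,
    P5: width * height * bytesPerSample,
    P6: width * height * 3 * bytesPerSample,
  }[magic];
  return { magic, expected, actual: buf.length - pos };
}

// Synthetic 2x2 grayscale image that is one byte short.
const sample = Buffer.concat([
  Buffer.from('P5\n2 2\n255\n', 'ascii'),
  Buffer.from([1, 2, 3]),
]);
console.log(checkPnm(sample)); // { magic: 'P5', expected: 4, actual: 3 }
```

For a real scan, something like `checkPnm(require('node:fs').readFileSync('out1.pnm'))` would show whether `actual` falls short of `expected`, which is the condition the error message describes.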

sbs20 commented 1 year ago

Grr. I'm trying to think about other options. Does it support png or jpg as an output format from scanimage?

xenxoblanco commented 1 year ago

@Pfuenzle As far as I could check, in my case the problem only occurred with double-sided scans. That's why I wrote this logic. First it runs the scanimage command with --format jpg (the only format I checked for which the HP driver produces well-formed image files). Then I appended an additional step to convert these jpg files to tiff images.

The reason for this additional step is that the main image format used in the rest of the code is tiff. An odd job, but it works for me.

I hope you find my comment useful.
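The actual change is not shown in this part of the thread, so the following is only a sketch of the two-step idea described above: scan to JPEG first, then convert each page to TIFF for the rest of the pipeline. `buildJpegFirstCommands` and the `scan-%04d` file pattern are assumptions for illustration. The format value is taken from the comment as written; depending on the scanimage build it may need to be spelled `jpeg`.

```javascript
// Hypothetical helper: build the two shell commands for the workaround.
function buildJpegFirstCommands(device, format = 'jpg') {
  // Step 1: let the driver write JPEG pages instead of TIFF.
  const scan =
    `scanimage -d '${device}' --source 'ADF' --format '${format}' ` +
    `--batch=scan-%04d.${format}`;
  // Step 2: convert every page to TIFF, keeping the base name.
  const toTiff =
    `for f in scan-*.${format}; do convert "$f" "\${f%.${format}}.tif"; done`;
  return [scan, toTiff];
}

const [scan, toTiff] = buildJpegFirstCommands('hpaio:/usb/Officejet_6600');
console.log(scan);
console.log(toTiff);
```

The helper only builds strings; running them still needs scanimage and ImageMagick on the host.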

DJfighter commented 10 months ago

Unfortunately the issue is still present with the M1132 and an Rpi 02W. Lineart and gray work at 75-200 dpi, color only works at 75 and 150 dpi, and the batch setting doesn't matter. The temp.tif file is created every time, but with other settings the conversion stops with a read error on strip xy (height). I copied the tif to my PC and I can open it with GIMP. I tried to implement the changes posted in this thread, but the master code has changed quite a bit since those posts; none of the changes solved the problem, and some prevented the service from starting.

mauveferret commented 7 months ago

@DJfighter, I ran into a problem similar to the one described in this issue, with an HP M1132 USB printer and an Orange Pi Zero 2W running arm64 Debian 12: a bogus StripByteCounts tag in the TIFF file, which causes an error during the convert step. So, as mentioned above, the problem is in scanimage (or the HP drivers), not in scanservjs. Unfortunately, I was not able to apply the tips suggested above. However, they led me to a rather barbaric approach which nevertheless worked, so I can now get scans in any format with any preset using the HP M1132 and scanservjs. Below I describe what I did, which requires editing the JS sources:

  1. sudo nano /usr/lib/scanservjs/server/classes/command-builder.js
    build(ignoreStderr) {
      log.trace('build()', this);
      let cmd = this.cmd;
      for (const arg of this.args) {
        cmd += ' ' + arg;
      }
      ignoreStderr = true;
      if (true) {
        cmd += ' 2>/dev/null; (exit 0)';
      }
      log.trace('build()', cmd);
      return cmd;
    }

Here I forced ignoreStderr to true and appended ; (exit 0) after 2>/dev/null.

  2. sudo nano /usr/lib/scanservjs/server/scan-controller.js
    // Apply filters
    if (this.request.filters.length > 0) {
      const stdin = files.map(f => `${f.name}\n`).join('');
      const cmd = `convert @- ${application.filterBuilder().build(this.request.filters)} f-%04d.tif 2>/dev/null; (exit 0)`;
      await Process.spawn(cmd, stdin, { cwd: config.tempDirectory });
      files = (await this.listFiles()).filter(f => f.name.match(/f-\d{4}\.tif/));
    }

Here I added 2>/dev/null; (exit 0) directly to cmd, because this command does not go through build().

  3. sudo nano /usr/lib/scanservjs/server/classes/config.js
          {
            extension: 'jpg',
            description: 'JPG | @:pipeline.high-quality',
            commands: [
              'while read filename ; do convert -quality 92 $filename scan-$(date +%s.%N).jpg 2>/dev/null; (exit 0); done',
              'ls scan-*.*'
            ]
          },

Here you should edit the commands of every pipeline you plan to use in the same way as before: by adding 2>/dev/null; (exit 0).

  4. Finally, run sudo systemctl restart scanservjs.service.

This solution was suggested by @sbs20 in the first reply on this issue. However, in my case creating a custom pipeline alone did not work, probably because convert is run several times (for preview.tif and for the output directory). The approach proposed here is nasty, as it suppresses all error messages, but it works with the HP M1132 and a Raspbian-like single-board computer. Keep in mind that these changes will need to be repeated whenever scanservjs is updated.
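The effect of the 2>/dev/null; (exit 0) suffix used in the steps above can be checked in isolation. This is a small sketch that assumes a POSIX shell; the failing command is a stand-in that writes a warning to stderr and exits with code 1, roughly what convert does on the bad TIFF, and `run` is a made-up helper.

```javascript
const { spawnSync } = require('node:child_process');

// Run a command line through the shell; report exit status and stderr.
function run(cmd) {
  const result = spawnSync(cmd, { shell: true, encoding: 'utf8' });
  return { status: result.status, stderr: result.stderr };
}

// Stand-in for a convert call that warns on stderr and fails with code 1.
const failing = '(echo "Read error on strip" >&2; exit 1)';

const plain = run(failing);
const suppressed = run(`${failing} 2>/dev/null; (exit 0)`);

console.log(plain.status);                      // 1
console.log(suppressed.status);                 // 0
console.log(JSON.stringify(suppressed.stderr)); // ""
```

The subshell (exit 0) runs last, so its status becomes the status of the whole command line. That is why the caller no longer sees the failure, and also why real errors are hidden along with the harmless ones.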

It seems to me that a much more elegant option would be to have scanservjs accept not only TIFF from scanimage but also PNG files, which in my experience scanimage and hplip can generate without errors and which can then be converted to other formats successfully (PDF and JPEG files are also corrupted). However, being a newbie in JS, I didn't attempt it. It probably requires some major changes to the source code. Anyway, I hope my recommendations will help someone!