klep / scanline

Command line scanning utility for OSX
MIT License

Scan fails with -mono #46

Open ed2050 opened 2 weeks ago

ed2050 commented 2 weeks ago

Hi, this tool is exactly what I need to scan loads of old papers. Thank you for making this. 🙏 😄 At the moment it doesn't work quite right though...

Installed latest scanline binary from pkg. It finds the scanner ok:

> scanline -list
Available scanners:
* HP Officejet 4630 series [6A18B8]

Scanning seems to mostly work. If I do scanline -resolution 200 -verbose then all is fine. It scans from the doc feeder and saves to ~/Documents/Archive.

However sometimes there are errors. Notably:

  1. The -mono flag gives errors. Verbose output shows it finding the device, but something fails when it tries to scan:
    > scanline -scanner 'HP Officejet 4630 series [6A18B8]' -mono -resolution 200 -verbose
    ... lots of device messages ...
    Configuring scanner
    Configuring Document Feeder
    Starting scan...
    didEncounterError: An error occurred during scanning.
    Failed to scan document.
    Done

    And that's it. No docs are created in ~/Documents/Archive from above command.

Scanner is HP OfficeJet 4630 accessed on local wifi network. Two different macs recognize the scanner. Scanning from document feeder works fine with Image Capture app (b&w or color).

Any idea what's going on? Color files are twice as large as b&w, so I'd really like to get -mono working.

  2. Another minor issue: running scanline without any docs in the feeder produces a blank 1-page file in ~/Documents/Archive. I discovered this accidentally when trying to display help with scanline -h. As far as I can tell, the OfficeJet scanner never makes any noise and is not accessed; scanline just creates an empty file. Wouldn't it be preferable not to create any files in this case?

Complete output

> scanline -scanner 'HP Officejet 4630 series [6A18B8]' -mono -resolution 200 -verbose
Browsing for scanners.
Waiting up to 10.0 seconds to find scanners
Added device: 
ICScannerDevice <0x7fa70ee11a60>:
  delegate                      : <0x0>
  deviceRef                     : 0x10000125
  connectionID                  : 0x00000000
  deviceID                      : 0x00000000
  name                          : HP Officejet 4630 series [6A18B8]
  locationDescription           : 6A18B8
  iconPath                      : /System/Library/Image Capture/Devices/AirScanScanner.app/Contents/Resources/GenericAirScanScanner.icns
  softwareInstallPercentDone    : 100.000000
  modulePath                    : /System/Library/Image Capture/Devices/AirScanScanner.app
  moduleVersion                 : (null)
  moduleExecutableArchitecture  : 0
  type                          : 0x00000402
  UUIDString                    : 1C852A4D-B800-1F08-ABCD-8CDCD46A18B8
  persistentIDString            : 1C852A4D-B800-1F08-ABCD-8CDCD46A18B8
  autolaunchApplicationPath     : 
  capabilities                  : 
  shared                        : NO
  transportType                 : ICTransportTypeTCPIP
    bonjourServiceType          : _uscan._tcp.
    bonjourServiceName          : HP Officejet 4630 series [6A18B8]
    bonjourTXTRecord            : {
    UUID = {length = 36, bytes = 0x31633835 32613464 2d623830 302d3166 ... 64343661 31386238 };
    adminurl = {length = 28, bytes = 0x68747470 3a2f2f48 50384344 43443436 ... 382e6c6f 63616c2e };
    cs = {length = 22, bytes = 0x62696e6172792c636f6c6f722c677261797363616c65};
    duplex = {length = 1, bytes = 0x46};
    is = {length = 10, bytes = 0x706c6174656e2c616466};
    pdl = {length = 51, bytes = 0x6170706c 69636174 696f6e2f 6f637465 ... 6167652f 6a706567 };
    rs = {length = 5, bytes = 0x2f6553434c};
    txtvers = {length = 1, bytes = 0x31};
    ty = {length = 24, bytes = 0x4850204f66666963656a6574203436333020736572696573};
    vers = {length = 3, bytes = 0x322e30};
}
    ipAddress                   : (null)
    ipPort                      : 0
  availableFunctionalUnitTypes  : 
  selectedFunctionalUnit        : (null) <0x0>
  transferMode                  : ICScannerTransferModeFileBased
  downloadsDirectory            : file:///Users/mark/Pictures/
  documentName                  : Scan
  documentUTI                   : public.tiff

Done searching for scanners
Found scanner: HP Officejet 4630 series [6A18B8]
Opening session with scanner
didOpenSessionWithError: [no error]
deviceDidBecomeReady
didSelectFunctionalUnit:  error: [no error]
didSelectFunctionalUnit: ICScannerFunctionalUnitDocumentFeeder <0x7fa70ef09f30>:
  supportedBitDepths      : <NSMutableIndexSet: 0x7fa70ef0a880>[number of indexes: 1 (in 1 ranges), indexes: (8)]
  bitDepth                : 8
  supportedDocumentTypes  : <NSMutableIndexSet: 0x7fa70ef0d2a0>[number of indexes: 51 (in 10 ranges), indexes: (1-5 7-8 10 13 15-16 22-25 29-33 39-43 48-66 72-78)]
  documentType            : 3
  measurementUnit         : 0
  supportedResolutions    : <NSMutableIndexSet: 0x7fa70ef0a4b0>[number of indexes: 5 (in 5 ranges), indexes: (75 100 200 300 600)]
  preferredResolutions    : <NSMutableIndexSet: 0x7fa70ef0a4b0>[number of indexes: 5 (in 5 ranges), indexes: (75 100 200 300 600)]
  resolution              : 200
  supportedScaleFactors   : <NSMutableIndexSet: 0x7fa70ef0a300>[number of indexes: 1 (in 1 ranges), indexes: (100)]
  preferredScaleFactors   : <NSMutableIndexSet: 0x7fa70ef0a300>[number of indexes: 1 (in 1 ranges), indexes: (100)]
  scaleFactor             : 100
  supportsDuplexScanning  : NO
  duplexScanningEnabled   : NO
  vendorFeatures          : (
    "ICScannerFeatureBoolean <0x7fa70ef0a660>:\n  type                    : ICScannerFeatureTypeBoolean\n  internalName            : EdgeDetection\n  humanReadableName       : Enable edge detection\n  tooltip                 : (null)\n  value                   : NO\n"
)
  state                   : 0x00000001
 error: [no error]
Configuring scanner
Configuring Document Feeder
Starting scan...
didEncounterError: An error occurred during scanning.
Failed to scan document.
Done

klep commented 2 weeks ago

Ed - Thanks for the feedback!

At a glance, I believe the issue is that scanline implements black and white scanning by telling your scanner to use a bitDepth of 1 and a pixelData type of bw. I'm guessing your scanner doesn't have a mode that matches this. I don't see such a mode in the output above (thanks for including that!) but it's not immediately clear to me if that output includes all of the modes (it appears not, actually).
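One extra data point may help here: the bonjourTXTRecord values in the verbose output are hex-encoded ASCII, and they can be decoded with a few lines of Python. The sketch below is illustrative only. It covers just the keys whose bytes were printed in full (UUID, adminurl and pdl are truncated with "..." in the log and are skipped). Reading `cs` as the advertised color modes and `is` as the input sources is an assumption based on the key names, not something the log confirms.

```python
# Hex payloads copied from the bonjourTXTRecord dump in the verbose output.
# Keys whose bytes are elided in the log (UUID, adminurl, pdl) are left out.
TXT_RECORD_HEX = {
    "cs": "62696e6172792c636f6c6f722c677261797363616c65",
    "duplex": "46",
    "is": "706c6174656e2c616466",
    "rs": "2f6553434c",
    "txtvers": "31",
    "ty": "4850204f66666963656a6574203436333020736572696573",
    "vers": "322e30",
}


def decode_txt_record(record_hex):
    """Turn each hex-encoded TXT value into readable ASCII text."""
    return {key: bytes.fromhex(value).decode("ascii")
            for key, value in record_hex.items()}


decoded = decode_txt_record(TXT_RECORD_HEX)
for key, value in sorted(decoded.items()):
    print(f"{key:8} {value}")
```

The `cs` entry decodes to `binary,color,grayscale` and `is` to `platen,adf`. If the assumption about `cs` holds, the scanner may advertise a binary mode over the network even though supportedBitDepths lists only 8.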

Two possible explanations for why it works with Image Capture:

1 -- Image Capture may be more savvy about the different ways that scanners report their capabilities. Maybe your scanner scans in black and white with a bitDepth of 2 for some bizarre reason.

2 -- Image Capture might have a fallback mode where if a scanner doesn't support black and white natively, it's done in software.
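The second explanation can be sketched in a few lines of Python. This is not how Image Capture or scanline is known to behave; it only illustrates the idea of falling back to an 8-bit scan and thresholding in software. The function names and the cutoff of 128 are hypothetical.

```python
def pick_bit_depth(supported_depths, wanted=1, fallback=8):
    """Return (depth, needs_software_threshold) for a black-and-white scan."""
    if wanted in supported_depths:
        return wanted, False
    if fallback in supported_depths:
        return fallback, True
    raise ValueError(
        f"neither {wanted}-bit nor {fallback}-bit scanning is supported")


def threshold_row(gray_row, cutoff=128):
    """Map 8-bit gray samples to 1-bit values: 1 = white, 0 = black."""
    return [1 if sample >= cutoff else 0 for sample in gray_row]


# supportedBitDepths in the document feeder dump above contains only 8.
depth, needs_threshold = pick_bit_depth({8})
row = [0, 90, 127, 128, 200, 255]
mono_row = threshold_row(row) if needs_threshold else row
print(depth, needs_threshold, mono_row)
```

With the single supported depth from the log, the sketch picks 8 bits and flags the scan for software thresholding instead of failing.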

It's hard to tell without having access to the scanner to try some things. Are you able to do post-processing (maybe with ImageMagick?) to convert the scanned images to black and white?

ed2050 commented 1 week ago

Thanks @klep. Postprocessing works I guess. Scanning to pdf, so I need ghostscript not ImageMagick. I'd like to get it working with scanline though if possible. I also found that flatbed scanning doesn't work with scanline. Similar result as above.
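For anyone following along, a Ghostscript post-processing step could look roughly like the sketch below. It only builds the command line, so it can be inspected before anything is run. The file names are placeholders, and the result is a grayscale PDF, not a 1-bit black-and-white one, so the size savings may be smaller than a true -mono scan would give.

```python
import subprocess


def gray_pdf_command(src, dst):
    """Build a Ghostscript command that rewrites src as a grayscale PDF."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=Gray",
        "-dProcessColorModel=/DeviceGray",
        "-dNOPAUSE",
        "-dBATCH",
        f"-sOutputFile={dst}",
        src,
    ]


def convert_to_gray(src, dst):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(gray_pdf_command(src, dst), check=True)


# Placeholder file names; nothing is executed here, the command is only shown.
print(" ".join(gray_pdf_command("scan.pdf", "scan-gray.pdf")))
```

Calling `convert_to_gray("scan.pdf", "scan-gray.pdf")` would run the conversion, assuming `gs` is installed and on the PATH.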

I did some digging and found the SANE library works with the Officejet 4630. It seems their code uses 8-bit depth for grayscale, as noted by some of the comments. For instance see this driver:

/* Returns the data width that is send to the scanner, depending */
/* on the scanmode. (b/w: 1, gray: 8..12, color: 24..36 */
...
switch (sanei_hp_optset_scanmode(this, data)) {
  case HP_SCANMODE_LINEART: /* Lineart */
  case HP_SCANMODE_HALFTONE: /* Halftone */
      p->format = SANE_FRAME_GRAY;
      p->depth  = 1;
      p->bytes_per_line  = (p->pixels_per_line + 7) / 8;
      break;
  case HP_SCANMODE_GRAYSCALE: /* Grayscale */
      p->format = SANE_FRAME_GRAY;
      p->depth  = 8;  // <-- COULD BE BIT DEPTH??
      p->bytes_per_line  = p->pixels_per_line;
      if ( !sanei_hp_optset_output_8bit (this, data) )
      {
        data_width = sanei_hp_optset_data_width (this, data);
        if ( data_width > 8 )
        {
          p->depth *= 2;
          p->bytes_per_line *= 2;
        }
      }
      break;
  case HP_SCANMODE_COLOR: /* RGB */
      p->format = SANE_FRAME_RGB;
      p->depth = 8;
      p->bytes_per_line  = 3 * p->pixels_per_line;
      if ( !sanei_hp_optset_output_8bit (this, data) )
      {
        data_width = sanei_hp_optset_data_width (this, data);
        if ( data_width > 24 )
        {
          p->depth *= 2;
          p->bytes_per_line *= 2;
        }
      }
      break;

Can't tell if that's bit depth or data width they're setting. There's also BIT_DEPTH var (or macro) in there but I haven't tracked down the source yet.
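As far as I know, the depth field of SANE's parameter struct is the number of bits per sample, so p->depth = 8 would be the bit depth, while data_width is the total width the scanner sends (per the comment at the top of the snippet). The arithmetic in the switch can be checked with a small Python sketch that mirrors only the quoted lines. The mode strings and the 1700-pixel line (an 8.5 inch page at 200 dpi) are assumptions for illustration.

```python
def frame_params(mode, pixels_per_line, data_width=8, output_8bit=True):
    """Return (depth, bytes_per_line) following the quoted switch statement."""
    if mode in ("lineart", "halftone"):
        # 1 bit per pixel, rounded up to whole bytes.
        return 1, (pixels_per_line + 7) // 8
    if mode == "grayscale":
        depth, bytes_per_line, wide = 8, pixels_per_line, 8
    elif mode == "color":
        depth, bytes_per_line, wide = 8, 3 * pixels_per_line, 24
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Wider-than-8-bit samples double both the depth and the line size.
    if not output_8bit and data_width > wide:
        depth *= 2
        bytes_per_line *= 2
    return depth, bytes_per_line


PIXELS = 1700  # assumed: 8.5 in wide page at 200 dpi
for mode in ("lineart", "grayscale", "color"):
    print(mode, frame_params(mode, PIXELS))
```

Under these assumptions a lineart line needs 213 bytes, a grayscale line 1700 and a color line 5100, which is the rough size ratio to expect between 1-bit, 8-bit gray and RGB scans before compression.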

I also tried connecting to my scanner via mac ImageCaptureCore using the python-to-objc lib. I managed to make an ICDeviceBrowser object, set the search mask for Bonjour scanners, and start a device search, but so far it doesn't return any devices. I think the problem might be that it's expecting some kind of ICDeviceBrowserDelegate, but I don't know what delegates are yet (never used objc or mac APIs before).

Will report if I make more progress on this. Hoping to query the device myself to see what flags it expects.

klep commented 1 week ago

Hey @ed2050 -- Thanks for following up.

In the Apple world, a delegate is an object that adheres to a defined protocol (what some languages would call an "interface"). So when the device browser finds a scanner, it'll call a function in the delegate. That said, I have no idea how to make a delegate work with a python-to-objc library.
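The delegate idea can be shown in plain Python without any Apple frameworks. In the sketch below, ToyBrowser is a hypothetical stand-in, not the real ICDeviceBrowser. The callback name follows what I understand to be the PyObjC convention of replacing the colons in the Objective-C selector deviceBrowser:didAddDevice:moreComing: with underscores; treat that as an assumption.

```python
class PrintingDelegate:
    """Plays the role of an ICDeviceBrowserDelegate: it receives callbacks."""

    def __init__(self):
        self.devices = []

    def deviceBrowser_didAddDevice_moreComing_(self, browser, device, more_coming):
        self.devices.append(device)
        print(f"found {device} (more coming: {more_coming})")


class ToyBrowser:
    """Hypothetical stand-in for ICDeviceBrowser, not the real class."""

    def __init__(self):
        self.delegate = None

    def start(self, discovered):
        # Without a delegate there is nobody to notify, so results are dropped.
        if self.delegate is None:
            return
        for index, name in enumerate(discovered):
            more = index < len(discovered) - 1
            self.delegate.deviceBrowser_didAddDevice_moreComing_(self, name, more)


browser = ToyBrowser()
browser.delegate = PrintingDelegate()
browser.start(["HP Officejet 4630 series [6A18B8]"])
```

The toy shows why a search with no delegate appears to return nothing: the browser reports devices only through callbacks. The real API likely also needs a running event loop to deliver those callbacks, which this sketch ignores.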

I can try to create a branch where I use a bit depth of 8 for grayscale. And actually, the -mono flag is for black and white, not grayscale, so it wouldn't be what you wanted anyway. Not sure when I'll be able to get to it though.

If you have some time, you should try downloading and building scanline in Xcode. You could experimentally change the -mono flag to look for a device mode with a bit depth of 8 and see if that works. Look at configureScanner in ScannerController.swift. Then, to scan from within Xcode, uncomment and edit the appropriate line in ScanlineAppController.swift's init and run it from there.

Not sure what's up with the flatbed. In the fullness of time, I'd also add a way to print out all the functional units of your scanner. 'Til then, a similar approach to the above might be doable -- look at configureFlatbed()