openpaperwork / pyinsane

Python library to access and use image scanners (Linux/Windows/etc) (Sane/WIA) -- Moved to Gnome's Gitlab
https://gitlab.gnome.org/World/OpenPaperwork/pyinsane

Pyinsane doesn't complete scanning #29

Open Feulo opened 7 years ago

Feulo commented 7 years ago

Hi there,

I am trying to automate scanning a batch of documents with a Python script using Pyinsane2. But when I call scan_session = device.scan(multiple=True), I get the message "Pyinsane: Stream::QueryInterface(): Unknown interface requested" several times in the Python shell, and if I press Enter it returns an empty scan_session.images and doesn't proceed to the next sheet.

I am using Python 3.6 on Windows 10 with WIA2. The scanner is a Lexmark all-in-one MX410de, connected over LAN.

Any guesses on how to fix this? Thanks

jflesch commented 7 years ago

Hm, weird. This message means your WIA driver requested an object type (C++) that Pyinsane is unable to provide ( https://github.com/jflesch/pyinsane/blob/stable/src/pyinsane2/wia/transfer.cpp#L159 ). The problem here is figuring out which one. I'm going to push a patch later to see if we can get the name of the interface requested (note to self: https://msdn.microsoft.com/en-us/library/windows/desktop/ms688692%28v=vs.85%29.aspx ). Depending on your scanner driver, it may or may not be the root of your problem.

Regarding the empty scan_session.images, would it be possible to have an extract of your code please ?

Feulo commented 7 years ago

Hi,

The problem with the empty scan_session.images stopped when I started running the script from the command prompt instead of the Python shell. I finally get the image of the scanned document, but it only scans 1 sheet from the ADF and then stops and exits. The code is basically the multiple scan example in the README.md (I uploaded it as a .txt because I can't upload a .py here). I'm also including the output from the command prompt, which is different from the output in the Python shell. I guess the problem is related to the large number of warnings I get when running the script, but I don't know how to fix them.

output.txt teste.txt
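For readers without access to the attachments, here is a minimal sketch of the multi-page read loop that the README example is built around. It assumes, as the log output later in this thread suggests, that scan_session.scan.read() raises EOFError at the end of each page and StopIteration once the feeder is empty. The helper name read_all_pages and the output file names are illustrative and not part of pyinsane2; the option values are the ones discussed in this thread and may need adjusting for a given driver.

```python
def read_all_pages(scan_session):
    """Drain a multi-page scan session and return the scanned images.

    Assumption: read() raises EOFError when a page is complete and
    StopIteration when no page is left in the feeder.
    """
    try:
        while True:
            try:
                scan_session.scan.read()
            except EOFError:
                print("Got a page ! (current number of pages read: %d)"
                      % len(scan_session.images))
    except StopIteration:
        print("Document feeder is now empty. Got %d pages"
              % len(scan_session.images))
    return list(scan_session.images)


def main():
    # Imported here so that read_all_pages() can be exercised without
    # pyinsane2 or a scanner being available.
    import pyinsane2

    pyinsane2.init()
    try:
        devices = pyinsane2.get_devices()
        if not devices:
            raise SystemExit("No scanner found")
        device = devices[0]
        # Each call tries the listed values until the driver accepts one.
        pyinsane2.set_scanner_opt(device, 'source', ['ADF', 'Feeder'])
        pyinsane2.set_scanner_opt(device, 'mode', ['Color'])
        pyinsane2.set_scanner_opt(device, 'resolution', [300])
        scan_session = device.scan(multiple=True)
        for idx, image in enumerate(read_all_pages(scan_session)):
            image.save("page_%d.png" % idx)  # illustrative file name
    finally:
        pyinsane2.exit()
```

Calling main() on a machine with the scanner attached runs the whole sequence; the try/finally makes sure pyinsane2.exit() is called even if the driver fails midway.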

jflesch commented 7 years ago

Another thing that could help is the output of the script examples/list_all.py. It returns all the options supported by your scanner. Maybe something is not set right.

jflesch commented 7 years ago

BTW, I assume you installed Pyinsane using python wheels ? Can you tell me which version it has installed please ?

jflesch commented 7 years ago

There is another thing you can try: Paperwork. If it doesn't work with Paperwork, we will know for sure it's a bug in Pyinsane.

Feulo commented 7 years ago

Right, as soon as I return home I will do this and post the results here (probably by Sunday).

Feulo commented 7 years ago

Hi there,

My pyinsane version is 'pyinsane2==2.0.9' and the Pillow version is 'Pillow==4.0.0'. I'm uploading the output of the list_all example. There are a lot of options that don't match the constraints; I guess this could be the problem, as you said.

output_list.txt

Feulo commented 7 years ago

Hi, I tried with Paperwork and it worked. When I run teste.py, it scans 2 pages and then stops. I guess the problem is that Pyinsane doesn't get the scanner's signal indicating whether there are more pages in the ADF. In Paperwork it scanned all the pages, but I had to tell Paperwork the number of pages beforehand.

When I run teste.py, it scans the first 2 pages and stops. I noticed that when it stops, the last page scanned is still in the ADF, and the ADF cylinder that pulls the next page is in position to pull the next page (and not in its rest position).

After that, if I try to scan with Paperwork, the option "Scan from Feeder" is disabled. And if I run teste.py again I get the error:

Got a page ! (current number of pages read: 1)
Document feeder is now empty. Got 1 pages
C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\pyinsane2\wia\rawapi.py:191: RuntimeWarning: Pyinsane: WARNING: source->transfer->Download() failed
out.buffer,
Pyinsane: WARNING: source->transfer->Download() failed
Pyinsane: WARNING: source->transfer->Download() failed: -2145320954 ; Unknown error 0x80210006

Well, that's it. I hope it is just a scanner configuration problem. Thanks again for helping

jflesch commented 7 years ago

In Paperwork, the number of pages is purely informational. It's just so that, at the end, the user can be sure that the expected number of pages has been scanned (no pages going through the feeder together as a single one, etc). In other words, if it works in Paperwork, it should work from your script too :/

How many pages are you trying to scan in total ?

BTW, did you try the script examples/scan_adf.py ? It is very similar to your script, but has very very minor differences. If it works, it could give us a good hint about the problem here.

"Scan from Feeder" is disabled.

Well, that's a bug. I'm going to have a look. If you want to re-enable it manually, in your home directory (C:\Users\[login]), look for a hidden directory ".config", containing a file "paperwork.conf". Edit this file, and just replace the line has_feeder = False with has_feeder = True.
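The manual edit described above could also be scripted. The following is a rough sketch that only rewrites a line of the form has_feeder = False and leaves everything else untouched. The function names are made up for illustration, and the file path is simply the one given in the comment above; nothing here is part of Paperwork itself.

```python
import os


def enable_feeder(conf_text):
    """Return conf_text with 'has_feeder = False' changed to 'has_feeder = True'."""
    lines = []
    for line in conf_text.splitlines(keepends=True):
        key, sep, value = line.partition("=")
        if sep and key.strip() == "has_feeder" and value.strip() == "False":
            # Keep the original spacing and line ending, change only the value.
            line = key + sep + value.replace("False", "True")
        lines.append(line)
    return "".join(lines)


def patch_paperwork_conf(path=None):
    """Apply enable_feeder() to the configuration file in place."""
    if path is None:
        # Location taken from the comment above: <home>/.config/paperwork.conf
        path = os.path.join(os.path.expanduser("~"), ".config", "paperwork.conf")
    with open(path, "r") as fd:
        original = fd.read()
    with open(path, "w") as fd:
        fd.write(enable_feeder(original))
```

It is probably wise to close Paperwork and keep a copy of the file before running patch_paperwork_conf().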

jflesch commented 7 years ago

BTW, I forgot to specify: I've seen that drivers (Linux or Windows ..) tend to be really buggy and have really weird and annoying behaviors when something unexpected happens. When the behavior is becoming really odd (like in the last output you provided), you .. may have to restart your computer before retrying ... :/

Feulo commented 7 years ago

I restarted my computer and tried with Paperwork without setting the number of pages. The result was the same as with the teste.py script: it stopped on the second page, with the bottom of the second sheet still in the ADF and the ADF cylinder in contact with the next sheet. I then tried again, this time setting the number of pages in Paperwork to 4, and it scanned all 4 sheets with no problem.

I will restart everything and try the scan_adf.py and post what happened here

Feulo commented 7 years ago

Well, I tried scan_adf.py and it scanned all four pages (yay!), but it returned an unknown error at the end =/. I'm uploading the output here. (I don't know if this will be an issue; I will try to convert the images to a PDF inside the script, and if that works I'm happy lol.) BTW, I did another test with Paperwork with the number of pages set to 2. It scanned the first 2 pages, then started the third and stopped the same way as before, with the bottom of the page stuck in the ADF.

output_scan_adf.txt

jflesch commented 7 years ago

I have a feeling it's not really a problem of indicating the number of pages to Paperwork, using your script or using scan_adf.py. With your scanner driver, it seems just random :-(

Just a quick note : when I said the number of pages doesn't matter to Paperwork, it's not exactly true actually .. if you ask for fewer pages than what is in the scanner feeder, it will stop as soon as it has what is wanted. (I should have specified that from the start I guess :/).

The error you got at the end of scan_adf.py is 0x80210003 = WIA_ERROR_PAPER_EMPTY. That's the first driver I see using this code ... I guess I will have to add it to Pyinsane.
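As a side note on reading these codes: the negative number printed in the warnings earlier in the thread (-2145320954) is the same 32-bit HRESULT as 0x80210006, just shown as a signed integer. Below is a small sketch that converts such values and looks them up in a partial table. The table is a subset copied from Microsoft's published WIA error codes and is not taken from Pyinsane's source; the function name is illustrative.

```python
# Partial table of WIA error codes (subset of Microsoft's documented list).
WIA_ERRORS = {
    0x80210001: "WIA_ERROR_GENERAL_ERROR",
    0x80210002: "WIA_ERROR_PAPER_JAM",
    0x80210003: "WIA_ERROR_PAPER_EMPTY",
    0x80210004: "WIA_ERROR_PAPER_PROBLEM",
    0x80210005: "WIA_ERROR_OFFLINE",
    0x80210006: "WIA_ERROR_BUSY",
    0x80210007: "WIA_ERROR_WARMING_UP",
    0x80210008: "WIA_ERROR_USER_INTERVENTION",
}


def describe_hresult(code):
    """Format a signed or unsigned 32-bit HRESULT as hex plus a known name."""
    unsigned = code & 0xFFFFFFFF  # map negative values to their unsigned form
    name = WIA_ERRORS.get(unsigned, "unknown")
    return "0x%08X (%s)" % (unsigned, name)


print(describe_hresult(-2145320954))
print(describe_hresult(0x80210003))
```

Under this table, the "Unknown error 0x80210006" seen in the earlier output corresponds to WIA_ERROR_BUSY, which would be consistent with the driver still being occupied by the previous, stuck scan.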

jflesch commented 7 years ago

36af5fdb89f8a4b57c8e5bd27f4cfd29a1de6814 << I've added support for the error code returned by your driver.

BTW, just to be sure, did you try your feeder with any other scanning application supporting it ?

Feulo commented 7 years ago

Hi, I did a few more tests. I still haven't figured out what's causing the problem, but I guess I found a solution. I'm uploading 2 Python scripts.

The "scan" script scans only one page and then crashes with an error, with the second page to be scanned stuck in the ADF.

The "teste" script is working just fine; it scans all the pages and saves them. It still gives the error message:

C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\pyinsane2\wia\rawapi.py:191: RuntimeWarning: Pyinsane: WARNING: source->transfer->Download() failed
out.buffer,
Pyinsane: WARNING: source->transfer->Download() failed
Pyinsane: WARNING: source->transfer->Download() failed: -2145320954 ; Unknown error 0x80210006

but it scans all the images and I've been doing the work I need without any problem.

The funny thing is that, as far as I can tell, the 2 scripts are exactly the same; I copied scan_adf.py to make both. So I have no idea what is wrong.

Thanks for fixing the error code. I installed pyinsane via pip; if I clone your repo and compile it, will it work?

Thanks for all the help!!

teste.txt scan.txt

jflesch commented 7 years ago

The "scan" script scans only one page and then crashes with an error, with the second page to be scanned stuck in the ADF.
The "teste" script is working just fine; it scans all the pages and saves them. It still gives the error message:

Except for the incorrect indentation in scan.txt (mix of tabs and 4-spaces), I can't see any difference between both scripts. I have to ask, are you sure that it's not just a problem with your driver / scanner / feeder / paper thickness ? In other words, have you already tried before with another application many times ? (other than Paperwork I mean)
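For anyone wanting to check a script for the tab/space mix mentioned above, here is a small sketch that reports which lines are indented with tabs, with spaces, or with both. The function names are illustrative; Python's standard library also ships a tabnanny module (python -m tabnanny file.py) that performs a related check.

```python
def indentation_report(source):
    """Classify each indented, non-blank line by the characters in its indent."""
    report = {"tabs": [], "spaces": [], "mixed": []}
    for lineno, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue  # ignore lines without indentation
        if "\t" in indent and " " in indent:
            report["mixed"].append(lineno)
        elif "\t" in indent:
            report["tabs"].append(lineno)
        else:
            report["spaces"].append(lineno)
    return report


def has_mixed_indentation(source):
    """True if the file uses both tabs and spaces for indentation."""
    report = indentation_report(source)
    return bool(report["mixed"]) or (bool(report["tabs"]) and bool(report["spaces"]))
```

Running has_mixed_indentation(open("scan.txt").read()) on the attached script would be one way to confirm the observation; the file name is taken from the attachment list above.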

if I clone your repo and compile it, will it work?

Yes, but it won't be easy. You need Visual C++ and WinDDK.

I'll release a new version in the coming days and publish some Python wheels (usually Python 3.4 + 64 bits at least, but I guess I can make some more). It will probably be easier for you to install.

jflesch commented 6 years ago

Sorry, I forgot to tell you: I've released a version: 2.0.12 :/

Feulo commented 6 years ago

Hi Jerome,

Thanks for reminding me :)

I've just updated to the new version. It's working almost the same way as the previous version. Here are the problems I've found so far:

1) During the initialization I get lots of warnings about the properties and constraints (the warnings are basically the same as in the previous version). I'm uploading the code and outputs here so you can take a look.
2) Of the session options I needed (color, 300 dpi and ADF), only the ADF is really working; I still get the images in grayscale and 96 dpi (it was the same in the previous version).
3) If I try to use the maximize_scan_area() method, I get an error and it crashes (output_max.txt).

This is what I've got so far, but it's fulfilling my needs. Thanks again for the help!

output.txt output_max.txt scan-relat-code.txt

jflesch commented 6 years ago

1) during the initialization I get lots of warnings about the properties and constraints

On Windows, this is expected. Pyinsane API doesn't really fit WIA API .. it's kind of putting a square peg into a round hole. So there are warnings .. I'm working on a new C library to replace Pyinsane. It shouldn't have these issues anymore.

2) Of the session options I needed (color, 300 dpi and ADF), only the ADF is really working; I still get the images in grayscale and 96 dpi (it was the same in the previous version).
3) If I try to use the maximize_scan_area() method, I get an error and it crashes (output_max.txt).

Hm, weird. Can you try with IronScanner please ? It is always built with the latest version of Pyinsane from Git. It will also get more traces, making debugging easier.

Feulo commented 6 years ago

Just made 3 tests with the IronScanner

https://openpaper.work/en-us/scanner_db/report/99/
https://openpaper.work/en-us/scanner_db/report/100/
https://openpaper.work/en-us/scanner_db/report/101/

The first and second ones were with the flatbed and the third one with the feeder. It seems to me that it is scanning in color (based on the green circle in the image). In the last test I also corrected the manufacturer and the model to Lexmark MX410de.

jflesch commented 6 years ago

@Feulo: I just want to be really clear on one thing: those reports are published publicly, scan included. Are you sure the documents you used are fine ? Looks like there is some info on one of them that could be private ? (I can delete them if you want me to)

jflesch commented 6 years ago

Hmm, when using the Flatbed, the scan was in color (as you said, there is a green circle correctly scanned). The resolution of 150dpi appears to be correctly set and correctly taken into account (the image has the expected size). And obviously maximize_scan_area() didn't crash.
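The same kind of check (does the image size match the requested resolution, and is it really in color) can be done from a script. The sketch below assumes the object returned by the scan behaves like a Pillow image (a size tuple in pixels and a mode string such as "RGB" or "L") and that the physical width of the scanned area in inches is known; the function names and the 5 % tolerance are arbitrary choices for illustration.

```python
def effective_dpi(width_px, scan_width_in):
    """Estimate the resolution actually used, from the image width in pixels."""
    return width_px / scan_width_in


def check_scan(image, requested_dpi, scan_width_in, tolerance=0.05):
    """Compare a scanned image against the requested resolution and mode.

    `image` only needs `.size` (width, height) and `.mode`, as Pillow images have.
    """
    dpi = effective_dpi(image.size[0], scan_width_in)
    return {
        "effective_dpi": dpi,
        "dpi_ok": abs(dpi - requested_dpi) <= tolerance * requested_dpi,
        "is_color": image.mode in ("RGB", "RGBA"),
    }
```

For example, an 8.5 inch wide page that comes back 816 pixels wide in mode "L" works out to about 96 dpi grayscale, which is the symptom reported for the script, whereas a 300 dpi color scan of the same page would be about 2550 pixels wide in mode "RGB".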

For reference, the code for scanning in IronScanner: https://github.com/openpaperwork/ironscanner/blob/eb4c48be2efa7d044e21ff0a03c58d52d2005eb4/src/ironscanner/main.py#L710 (trace.trace() is just a wrapper that logs the call and adds a timeout --> it calls the function passed as argument almost directly).
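To illustrate what "a wrapper that logs the call and adds a timeout" can look like, here is a hypothetical stand-in. It is not IronScanner's actual trace.trace() implementation; the name traced_call, its signature, and the use of a worker thread are assumptions made for this sketch.

```python
import concurrent.futures
import logging

logger = logging.getLogger(__name__)


def traced_call(func, *args, timeout=60.0, **kwargs):
    """Log a call, run it in a worker thread, and give up after `timeout` seconds.

    Raises concurrent.futures.TimeoutError if the call does not finish in time.
    Note: the worker thread itself is not killed; it keeps running in the
    background until the call returns.
    """
    name = getattr(func, "__name__", repr(func))
    logger.info("Calling %s(args=%r, kwargs=%r)", name, args, kwargs)
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(func, *args, **kwargs)
        result = future.result(timeout=timeout)
        logger.info("%s returned %r", name, result)
        return result
    finally:
        # Do not block here waiting for a call that may be stuck in the driver.
        executor.shutdown(wait=False)
```

With such a helper, a call like traced_call(pyinsane2.maximize_scan_area, device, timeout=30.0) would leave a log line before the driver is touched, which helps locate where a hang or crash happens.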

I'm going to take another look at your scripts later.

Feulo commented 6 years ago

@Feulo: I just want to be really clear on one thing: those reports are published publicly, scan included. Are you sure the documents you used are fine ? Looks like there is some info on one of them that could be private ? (I can delete them if you want me to)

The only info on them is my name, email and birthday, which are easily found on Facebook, so no problem heheheh, but thanks for the heads up.

I'll take a look at the reference code and do some tests here too.

jflesch commented 6 years ago

BTW, which version of Pyinsane is installed on your system ?