lmmx / page-dewarp

Document image dewarping library using a cubic sheet model
MIT License

ValueError: Number of samples must be non-negative, in function_base.py #9

Closed kylefoley76 closed 2 years ago

kylefoley76 commented 2 years ago

Here is the traceback when I use the command line:

Traceback (most recent call last):
  File "/Users/kylefoley/Documents/pcode/venv4/bin/page-dewarp", line 8, in <module>
    sys.exit(main())
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/__main__.py", line 22, in main
    processed_img = WarpedImage(imgfile)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/image.py", line 58, in __init__
    if not kind:
NameError: name 'kind' is not defined
(venv4) Admins-MacBook-Pro-4:downloads kylefoley$ page-dewarp 0252.jpg
got it
Traceback (most recent call last):
  File "/Users/kylefoley/Documents/pcode/venv4/bin/page-dewarp", line 8, in <module>
    sys.exit(main())
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/__main__.py", line 22, in main
    processed_img = WarpedImage(imgfile)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/image.py", line 90, in __init__
    self.threshold(page_dims, params)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/image.py", line 98, in threshold
    remap = RemappedImage(self.stem, self.cv2_img, self.small, page_dims, params)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/dewarp.py", line 41, in __init__
    page_x_range = np.linspace(0, page_dims[0], width_small)
  File "<__array_function__ internals>", line 5, in linspace
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/numpy/core/function_base.py", line 130, in linspace
    raise ValueError("Number of samples, %s, must be non-negative." % num)
ValueError: Number of samples, -62447745571, must be non-negative.

However, I'm using the module within another module that I created, and I had to make some adjustments. The result is basically the same, except that the number of samples is different. I have dewarped 244 images, so this error doesn't happen often. Here is the traceback for when I used the module within another module:

Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/kylefoley/Documents/pcode/other/ocr.py", line 1779, in <module>
    ins.begin(args[2],args[3:])
  File "/Users/kylefoley/Documents/pcode/other/ocr.py", line 180, in begin
    self.main_loop()
  File "/Users/kylefoley/Documents/pcode/other/ocr.py", line 276, in main_loop
    self.dewarp()
  File "/Users/kylefoley/Documents/pcode/other/ocr.py", line 324, in dewarp
    ins = dewarp.WarpedImage(self.pt, self.im1,self.dest4)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/image.py", line 89, in __init__
    self.threshold(page_dims, params)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/image.py", line 94, in threshold
    remap = RemappedImage(self.dest, self.cv2_img, self.small, page_dims, params)
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/page_dewarp/dewarp.py", line 41, in __init__
    page_x_range = np.linspace(0, page_dims[0], width_small)
  File "<__array_function__ internals>", line 5, in linspace
  File "/Users/kylefoley/Documents/pcode/venv4/lib/python3.8/site-packages/numpy/core/function_base.py", line 130, in linspace
    raise ValueError("Number of samples, %s, must be non-negative." % num)
ValueError: Number of samples, -82235987185, must be non-negative.

0252

lmmx commented 2 years ago

That error is from NumPy and is saying you can't call np.linspace with a negative number of samples:

>>> np.linspace(0,1,-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 5, in linspace
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/numpy/core/function_base.py", line 122, in linspace
    raise ValueError("Number of samples, %s, must be non-negative." % num)
ValueError: Number of samples, -1, must be non-negative.

The full output when I run the program on your image is:

Loaded 0252.jpg at size='2151x3265' --> resized='430x653'
  got 20 spans with 126 points.
  initial objective is 0.005445525413610505
  optimizing 154 parameters...
  optimization took 16.96 sec.
  final objective is 0.0008200064104791682
  got page dims -203076762.8361408 x 35.684414436800765
  output will be -331529607040x58256
Traceback (most recent call last):
  File "/home/louis/miniconda3/envs/zucker_dewarp/bin/page-dewarp", line 8, in <module>
    sys.exit(main())
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/page_dewarp/__main__.py", line 22, in main
    processed_img = WarpedImage(imgfile)
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/page_dewarp/image.py", line 80, in __init__
    self.threshold(page_dims, params)
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/page_dewarp/image.py", line 84, in threshold
    remap = RemappedImage(self.stem, self.cv2_img, self.small, page_dims, params)
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/page_dewarp/dewarp.py", line 41, in __init__
    page_x_range = np.linspace(0, page_dims[0], width_small)
  File "<__array_function__ internals>", line 5, in linspace
  File "/home/louis/miniconda3/envs/zucker_dewarp/lib/python3.10/site-packages/numpy/core/function_base.py", line 122, in linspace
    raise ValueError("Number of samples, %s, must be non-negative." % num)
ValueError: Number of samples, -20720600440, must be non-negative.

The negative number clearly slips into the page dimension (-203076762.8361408). A length is a non-negative quantity, so a negative page width is meaningless.
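To make the link between the negative page width and the negative sample count concrete, here is a minimal sketch of the output-size arithmetic. It is not the library's code: the helper names are hypothetical, and the constants (a zoom of 1.0 and a decimation factor of 16) are assumptions in the style of Matt Zucker's original script, chosen because they appear consistent with the log above.

```python
# Hypothetical constants: assumed defaults, not read from page-dewarp itself.
OUTPUT_ZOOM = 1.0
REMAP_DECIMATE = 16


def round_up_to_multiple(value, factor):
    """Truncate value to an int, then round up to the next multiple of factor."""
    value = int(value)
    remainder = value % factor  # non-negative in Python, even for negative value
    return value if remainder == 0 else value + factor - remainder


def output_size(page_dims, img_height):
    """Return (width, height, width_small) derived from the page dimensions."""
    page_w, page_h = page_dims
    height = round_up_to_multiple(
        0.5 * page_h * OUTPUT_ZOOM * img_height, REMAP_DECIMATE
    )
    # The width is scaled from the height by the page aspect ratio,
    # so a negative page width gives a negative output width.
    width = round_up_to_multiple(height * page_w / page_h, REMAP_DECIMATE)
    # width is a multiple of REMAP_DECIMATE, so this division is exact.
    return width, height, width // REMAP_DECIMATE


width, height, width_small = output_size(
    (-203076762.8361408, 35.684414436800765), img_height=3265
)
print(f"output would be {width}x{height}; np.linspace would get num={width_small}")
```

Under these assumptions the sign of the page width carries straight through to the sample count passed to np.linspace, which is why the error only surfaces in dewarp.py even though the bad value originates earlier.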

Specifically, it slips into the function get_page_dims in the image.py module:

def get_page_dims(corners, rough_dims, params):
    dst_br = corners[2].flatten()
    dims = np.array(rough_dims)

    def objective(dims):
        proj_br = project_xy(dims, params)
        return np.sum((dst_br - proj_br.flatten()) ** 2)

    res = minimize(objective, dims, method="Powell")
    dims = res.x
    print("  got page dims", dims[0], "x", dims[1])
    return dims

Stepping through this, you can see that the value computed for dims is way off from the initial estimate rough_dims:

> /home/louis/dev/page-dewarp/src/page_dewarp/image.py(34)get_page_dims()
-> res = minimize(objective, dims, method="Powell")
(Pdb) p objective
<function get_page_dims.<locals>.objective at 0x7f094dd1d480>
(Pdb) p dims
array([1.12698885, 1.93404471])
(Pdb) n
> /home/louis/dev/page-dewarp/src/page_dewarp/image.py(35)get_page_dims()
-> dims = res.x
(Pdb) p dims
array([1.12698885, 1.93404471])
(Pdb) n
> /home/louis/dev/page-dewarp/src/page_dewarp/image.py(36)get_page_dims()
-> print("  got page dims", dims[0], "x", dims[1])
(Pdb) p dims
array([-2.03076763e+08,  3.56844144e+01])

Without looking into this further, my immediate thought is that this flaw could be superficially smoothed over (if not quite fixed) by falling back to rough_dims. That is not what rough_dims is meant for: it is meant to be the starting point for the minimisation. I expect the root cause is that the parameters are off somehow, so the params value being passed to the optimisation is spoiling it.
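A fallback of that kind could look roughly like the sketch below. This is an illustration only, not the patch that was actually shipped: the function name choose_page_dims and the exact validity check (every dimension finite and strictly positive) are assumptions.

```python
import math


def choose_page_dims(optimised, rough_dims):
    """Return the optimised dims if they are usable, otherwise rough_dims.

    "Usable" here means every entry is finite and strictly positive;
    that criterion is an assumption made for this sketch.
    """
    usable = all(math.isfinite(d) and d > 0 for d in optimised)
    if not usable:
        print("  got an implausible page dimension, falling back to rough estimate")
        return list(rough_dims)
    return list(optimised)


# Values taken from the pdb session above.
rough = [1.12698885, 1.93404471]
bad = [-2.03076763e08, 3.56844144e01]
print(choose_page_dims(bad, rough))
```

This only masks the symptom: the dewarp proceeds with the unrefined estimate, so the output for such an image may be less accurate than for images where the minimisation converges sensibly.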

Recall that the params are set in the __init__ method of the WarpedImage class, immediately followed by the get_page_dims call where the negative length shows up.

With that 'hotfix', a result is produced. I'll ship it, as I don't have time to look into this further right now; reviewing this code, I'd need to improve its quality in various ways to make it more debuggable (this library is first and foremost a Python 3 rewrite of Matt Zucker's unmaintained Python 2 implementation).

lmmx commented 2 years ago

Patched in d22869e; it will be packaged in the next release.