joaofig / discrete-frechet

Implementation of the discrete Fréchet distance
MIT License
55 stars 12 forks source link

Problem computing the diagonal for a 3 x 9 matrix #15

Open estebanzimanyi opened 3 years ago

estebanzimanyi commented 3 years ago

Dear Joao Paulo Many thanks for this amazing work ! I have found a problem when computing the _bresenham_pairs for a 3 x 9 matrix. I simply used these values

    p = np.array([[80.0644976552576, 50.6552672944963],
                  [71.4585771784186, 63.2156178820878],
                  [19.9234400875866, 12.8415436018258]])

    q = np.array([[5.88378887623549, 11.4293440245092],
                  [84.2895035166293, 67.4984930083156],
                  [90.9000392071903, 36.4088270813227],
                  [34.2789062298834, 0.568102905526757],
                  [43.9584670122713, 75.5553565453738],
                  [24.4398877490312, 30.7297872845083],
                  [35.2576361969113, 39.8860249202698],
                  [62.438058713451, 44.4697478786111],
                  [38.4228205773979, 66.4192265830934]])

and I obtained the following error

$ python discrete.py
Slow : 9.959999988495838e-05
83.9133518061399
Linear : 7.949999962875154e-05
83.9133518061399
Traceback (most recent call last):
  File "discrete.py", line 667, in <module>
    main()
  File "discrete.py", line 625, in main
    distance = sparse_frechet.distance(p, q)
  File "discrete.py", line 484, in distance
    ca = _fdfd_sparse(p, q, self.dist_func)
  File "discrete.py", line 436, in _fdfd_sparse
    ca = _fast_distance_sparse(p, q, diagonal, dist_func)
  File "discrete.py", line 283, in _fast_distance_sparse
    d = dist_func(p[i0], q[j0])
IndexError: index 3 is out of bounds for axis 0 with size 3
$
joaofig commented 3 years ago

Hi, thanks for letting me know about this. I tried on my latest version of the code and it worked well. I pushed it to master for you to try on your end with the same data. Please make sure that you update numba, numpy and scipy to their latest versions.

estebanzimanyi commented 3 years ago

Alas there is still a problem

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
$ pip install numpy
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (1.17.4)
$ pip install numba
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: numba in /home/esteban/.local/lib/python3.8/site-packages (0.54.0)
Requirement already satisfied: numpy<1.21,>=1.17 in /usr/lib/python3/dist-packages (from numba) (1.17.4)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from numba) (45.2.0)
Requirement already satisfied: llvmlite<0.38,>=0.37.0rc1 in /home/esteban/.local/lib/python3.8/site-packages (from numba) (0.37.0)
$ pip install scipy
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: scipy in /home/esteban/.local/lib/python3.8/site-packages (1.7.1)
Requirement already satisfied: numpy<1.23.0,>=1.16.5 in /usr/lib/python3/dist-packages (from scipy) (1.17.4)
$ python --version
Python 3.8.5
$ python discrete.py
Slow : 0.00011839999933727086
83.9133518061399
Linear : 2.8100002964492887e-05
83.9133518061399
Sparse : 5.280000186758116e-05
83.9133518061399
Fast : 2.609999864944257e-05
83.9133518061399

4.536398676779226 times faster than slow
1.0766285217832008 times faster than linear
---------
[2.8000031306874007e-06, 2.2000000171829015e-05, 1.1299998732283711e-05]
---------
[1.2999989849049598e-06, 1.0799998563015833e-05, 1.5000005078036338e-06]
free(): invalid next size (normal)
Aborted
$

When I commented out all the @jit directives then I have

$ python discrete.py
Slow : 9.850000060396269e-05
83.9133518061399
Linear : 8.689999958733097e-05
83.9133518061399
Traceback (most recent call last):
  File "discrete.py", line 668, in <module>
    main()
  File "discrete.py", line 626, in main
    distance = sparse_frechet.distance(p, q)
  File "discrete.py", line 484, in distance
    ca = _fdfd_sparse(p, q, self.dist_func)
  File "discrete.py", line 436, in _fdfd_sparse
    ca = _fast_distance_sparse(p, q, diagonal, dist_func)
  File "discrete.py", line 283, in _fast_distance_sparse
    d = dist_func(p[i0], q[j0])
IndexError: index 3 is out of bounds for axis 0 with size 3
$
joaofig commented 3 years ago

Ok, now I got it. Running from the command line instead of within PyCharm makes all the difference. I can now replicate the error. (Rolling up my sleeves...)

joaofig commented 3 years ago

The diag matrix contais this: [[0 0], [0 1], [1 2], [1 3], [1 4], [2 5], [2 6], [2 7], [3 8]]. The last entry should read [2 8] instead. As you said, there is a bug on the Bresenham line drawing algorithm.

estebanzimanyi commented 3 years ago

As you may have seen, I am reusing your results in MobilityDB https://mobilitydb.com/ and thus I need to translate the algorithms in C. The algorithm in https://gist.github.com/bert/1085538#file-plot_line-c solved the problem for me.

joaofig commented 3 years ago

Interesting bug... By adapting the code you mentioned to Python, I get exactly the same result.

    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    n = max(dx, -dy)
    pairs = np.zeros((n, 2), dtype=np.int64)
    x, y = x0, y0
    sx = -1 if x0 >= x1 else 1
    sy = -1 if y0 >= y1 else 1

    err = dx + dy
    e2 = 0

    for i in range(n):
        pairs[i, 0] = x
        pairs[i, 1] = y
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x += sx
        if e2 <= dx:
            err += dx
            y += sy
estebanzimanyi commented 3 years ago

Weird indeed. My C implementation of the mentioned algorithm worked perfectly ...

$ ./frechet
First array:
80.064498 50.655267
71.458577 63.215618
19.923440 12.841544
Second array:
5.883789 11.429344
84.289504 67.498493
90.900039 36.408827
34.278906 0.568103
43.958467 75.555357
24.439888 30.729787
35.257636 39.886025
62.438059 44.469748
38.422821 66.419227
Number of elements in the diagonal: 9
(0,0)
(0,1)
(1,2)
(1,3)
(1,4)
(1,5)
(2,6)
(2,7)
(2,8)
$
joaofig commented 3 years ago

I think it is also not right. Although you don't get the nasty indexing issue, your line should read this: (0, 0) (0, 1) (0, 2) (1, 3) (1, 4) (1, 5) (2, 6) (2, 7) (2, 8) This is what we should expect for a 3 x 9 array, right?

estebanzimanyi commented 3 years ago

Fully agree :-)

joaofig commented 3 years ago

This is going to take a bit of time to sort out... :(

souvikmaji commented 1 year ago

Facing the same issue. @joaofig can you please share your c implementation? maybe the community can debug a solution.

estebanzimanyi commented 1 year ago

Maybe you can take a look at our implementation in MobilityDB https://docs.mobilitydb.com/MobilityDB/develop/ch05s13.html https://github.com/MobilityDB/MobilityDB/blob/develop/meos/src/general/temporal_similarity.c It covers temporal points but also temporal numbers.