lukehb / 137-stopmove

Algorithms to automatically discover stops and moves in GPS trajectories.
MIT License
23 stars 9 forks source link

How to use Kneedle.java for 2-D points? #1

Closed jagandecapri closed 6 years ago

jagandecapri commented 6 years ago

Hi @lukehb,

Referring to Kneedle.java in https://github.com/lukehb/137-stopmove/blob/94da86d39d3d8d2a946602fcdf6d8ccac5205ff1/src/main/java/onethreeseven/stopmove/algorithm/Kneedle.java;

How can I pass a set of 2-D points, i.e:

- Point 1:
-- x = 1.0
-- y = 1.0
- Point 2:
-- x = 2.0
-- y = 2.0

to the run method to find the elbow/knee in the data?

lukehb commented 6 years ago

The initial implementation of Kneedle that I wrote only had support for 1d data because that is all I needed. But seeing as 2d is what you are after I have added this functionality in a new commit 9d3c18ebee83aa00746b2feb7ce6fc07b4badce3.

You can see an example of finding the knee points in the data in the new test I wrote.

The javadoc for the run method should be give some hints about parameters:

/**
     * This algorithm finds the so-called elbow/knee in the data.
     * See paper: "Finding a Kneedle in a Haystack: Detecting Knee Points in System Behavior"
     * for more details.
     * @param data The 2d data to find an elbow in.
     * @param s How many "flat" points to require before we consider it a knee/elbow.
     * @param smoothingWindow The data is smoothed using Gaussian kernel average smoother, this parameter is the window used for averaging
     *                        (higher values mean more smoothing, try 3 to begin with).
     * @param findElbows Whether to find elbows or knees.
     * @return The elbow or knee values.
     */
    public ArrayList<double[]> run(double[][] data, double s, int smoothingWindow, boolean findElbows)
jagandecapri commented 6 years ago

Hi @lukehb,

Thanks a lot for the implementation of 2d points! It's very much helpful.

Regarding the original implementation using 1D points, where the documentation in the method run here says

     * It does this by sorting the data, then making a line between the start
     * and end data points in the sorted data. Each point in the data is the projected
     * onto this line, and the point with the biggest euclidean distance is considered
     * the most likely elbow.

Is it that given a set of points {x1: 1, x2: 5, x3: 4.5, x4: 3.5, x5: 4.9}, the algorithm sorts it as {x1: 1, x4: 3.5, x3: 4.5, x5: 4.9, x2: 5} and then plots the points as the following image where a straight line is drawn from x1 to x2?

p_20180204_223239 - copy

lukehb commented 6 years ago

In short, yes, that was the idea I was going for with the old 1d method. However, that old documentation was quite misleading as I was pre-sorting the data before passing it to run(), despite what the documentation said the method was doing. That is why I have removed that documentation in the current version. The 2d implementation in the repo now does not sort the data, it looks for the elbow/knee using the data in the order it is passed in. In general the current implementation is quite faithful to the Kneedle paper.

jagandecapri commented 6 years ago

Hei @lukehb ,

Sorry for the late reply. Yeap, got it. Your new implementation also seems fine for my use case. Thanks a lot for your help! :)