parkertomatoes / lttb-cpp

C++ implementation of the Largest Triangle Three Buckets (LTTB) downsampling algorithm
MIT License
31 stars 4 forks source link

[BUG]: Incorrect data window calculation (i.e., `every`) due to cast #1

Closed jonasvdd closed 9 months ago

jonasvdd commented 10 months ago

Hi @parkertomatoes,

Thank you for sharing your well-written and high-quality open-source code! 👏🏼

I think that there is a (small) bug in your code; specifically this line: https://github.com/parkertomatoes/lttb-cpp/blob/000046203c2da52b47085893771ee94423b06381/include/lttb.hpp#L32

Further explanation Consider a scenario where you define your data template as int, for example:

LargestTriangleThreeBuckets<TPoint, int, &TPoint::x, &TPoint::y> pointLttb =
    LargestTriangleThreeBuckets<TPoint, int, &TPoint::x, &TPoint::y>();

In this case, the variable every in the formula will be typecast to int, leading to the generation of floored windows. This behavior will persist throughout the for-loop and consequently might result in the omission of a significant data window at the end of the TPoint vector.


To overcome this issue; only one step needs to be taken:

  1. The bucket size needs to be cast to a double.
    // Bucket size. Leave room for start and end data points
    double every = static_cast<double>(sourceSize - 2) /
               static_cast<double>(destinationSize - 2);

Cheers, Jonas Van Der Donckt

parkertomatoes commented 10 months ago

Thank you for your very detailed bug report! I hadn't considered that anyone would use a non-floating-point data type for TData, and I'm surprised that's the only issue you encountered.

Your proposed fix seems reasonable. If you want to make a pull request for authorship sake, please feel free. If not, let me know and I'll go ahead and implement it.

jonasvdd commented 10 months ago

I made a PR @parkertomatoes! :)