GoogleCloudPlatform / bigquery-utils

Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
https://cloud.google.com/bigquery/
Apache License 2.0
1.07k stars, 269 forks

fixes jaccard bug #417

Open afleisc opened 1 month ago

afleisc commented 1 month ago

Closes #383

afleisc commented 1 month ago
const la = (sa.length > sb.length) ? sa.length : sb.length
const lb = (sa.length > sb.length) ? sb.length : sa.length

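These two lines were added as an optimization to the double for loop, but they introduce a bug: la always holds the larger of the two lengths and lb the smaller, regardless of which string each limit is applied to.

With these incorrect loop limits, not all possible character pairs are compared, so the intersection count (intersectSize) can be inaccurate.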
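
To make the failure mode concrete, here is a minimal sketch of how swapped limits can miss character pairs. It is not the UDF's actual source: the function names `intersectSizeSwapped` and `intersectSizeFixed`, the `used` array, and the matching strategy are all assumptions for illustration. Only the two `la`/`lb` lines are taken from the comment above.

```javascript
// Hypothetical reconstruction for illustration only; the real UDF body may differ.

// Counts matching characters using the la/lb limits quoted above.
function intersectSizeSwapped(sa, sb) {
  const la = (sa.length > sb.length) ? sa.length : sb.length; // always the longer length
  const lb = (sa.length > sb.length) ? sb.length : sa.length; // always the shorter length
  const used = new Array(sb.length).fill(false); // assumed: each sb character matches once
  let intersectSize = 0;
  for (let i = 0; i < la; i++) {      // indexes sa, but la may be sb's length
    for (let j = 0; j < lb; j++) {    // indexes sb, but lb may be sa's length
      if (!used[j] && sa[i] === sb[j]) {
        used[j] = true;
        intersectSize++;
        break;
      }
    }
  }
  return intersectSize;
}

// Same counting logic, but each loop is bounded by the string it indexes.
function intersectSizeFixed(sa, sb) {
  const used = new Array(sb.length).fill(false);
  let intersectSize = 0;
  for (let i = 0; i < sa.length; i++) {
    for (let j = 0; j < sb.length; j++) {
      if (!used[j] && sa[i] === sb[j]) {
        used[j] = true;
        intersectSize++;
        break;
      }
    }
  }
  return intersectSize;
}

// When sb is the longer string, the swapped limits never reach its tail.
console.log(intersectSizeSwapped("ab", "xyab")); // 0
console.log(intersectSizeFixed("ab", "xyab"));   // 2
```

In this sketch the two versions agree whenever sa is at least as long as sb, which may be why a bug of this kind can go unnoticed with inputs where the first argument happens to be the longer one.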

afleisc commented 3 weeks ago

/gcbrun