datacommonsorg / website

Code for the Data Commons website
https://datacommons.org
Apache License 2.0
24 stars 88 forks

Avoid a buggy regex for comparison heuristic #4477

Closed pradh closed 4 months ago

pradh commented 4 months ago

This regex was added very early on, in https://github.com/datacommonsorg/website/pull/2155, to detect a comparison between two variables.

In practice, "compar.." or "vs" are the trigger words we use. The "...er$" pattern has too many potential unintended matches. Even comparative words like "better" or "greater" do not necessarily mean a comparison across vars (e.g., "which california county has a greater chance of ...?"). So let's just drop it!
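
To illustrate the over-matching, here is a minimal sketch. The two patterns below are hypothetical stand-ins, not the actual regexes in the codebase: `ER_SUFFIX` approximates an "...er$"-style heuristic as "any word ending in er", and `EXPLICIT_TRIGGER` approximates the "compar.." / "vs" trigger words.

```python
import re

# Hypothetical approximation of the dropped heuristic: any word ending in "er".
# The real pattern in the codebase may differ.
ER_SUFFIX = re.compile(r"\b\w+er\b", re.IGNORECASE)

# Hypothetical approximation of the explicit trigger words ("compar..", "vs").
EXPLICIT_TRIGGER = re.compile(r"\b(compar\w*|vs)\b", re.IGNORECASE)


def er_matches(query: str) -> list[str]:
    """Return every word in the query that the suffix heuristic would flag."""
    return ER_SUFFIX.findall(query)


def has_explicit_trigger(query: str) -> bool:
    """Return True if the query contains an explicit comparison trigger word."""
    return EXPLICIT_TRIGGER.search(query) is not None


if __name__ == "__main__":
    queries = [
        "what is the number of poor people in california",
        "which california county has a greater chance of ...",
        "compare poverty vs obesity in california",
    ]
    for q in queries:
        print(f"{q!r}: er-words={er_matches(q)}, explicit={has_explicit_trigger(q)}")
```

Under these assumptions, the suffix pattern flags "number" and "greater" in queries that are not comparisons across variables, while the explicit triggers fire only on the third query.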

There are no diffs in integration-tests. TODO: I hope to check the screenshot diffs after submitting this PR...


pradh commented 4 months ago

> There are no diffs in integration-tests. TODO: I hope to check the screenshot diffs after submitting this PR...

Generally the diffs look reasonable to me: https://autopush.datacommons.org/screenshot/compare/2024_07_15_09_00_01...2024_07_15_17_00_05?domain=autopush.datacommons.org

One unexpected win: https://screenshot.googleplex.com/3kM9GcVHAX4wQq8. This one improved because we were previously suppressing "number" in "what is the number ...".