vipulnaik / donations

Donations list website (DLW): a repository for keeping track of public donations by some people I (arbitrarily) decide to track
https://donations.vipulnaik.com
Creative Commons Zero v1.0 Universal

Fix regex for donor and donee document list matching #126

Closed: vipulnaik closed this issue 4 years ago

vipulnaik commented 4 years ago

The current SQL queries in https://github.com/vipulnaik/donations/blob/master/access-portal/backend/donorDocumentList.inc, https://github.com/vipulnaik/donations/blob/master/access-portal/backend/doneeDocumentList.inc, and https://github.com/vipulnaik/donations/blob/master/access-portal/backend/donorDoneeDocumentList.inc use conditions of the form affected_donors REGEXP <donor string>, affected_donees REGEXP <donee string>, etc.

The problem is that occasionally, the name of one donee can be a substring of the name of another donee, so this can lead to false matches. An example is how "Ought" is a substring of "Forethought Foundation for Global Priorities Research"; see https://donations.vipulnaik.com/donee.php?donee=Ought

Ideally, we want to use a regex match against <start of string or pipe sign><donor or donee string><end of string or pipe sign> and update all SQL queries accordingly. I think we should be able to get a single regex that works for this; if not, in the worst case, we'll need to handle a few cases separately (name at the start, in the middle, or at the end of the list).
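To make the proposed pattern concrete, here is a rough sketch in Python rather than in the repository's PHP/SQL. It assumes the affected_donors and affected_donees fields are pipe-separated lists and that the existing match is case-insensitive (which would explain why "Ought" matches inside "Forethought"). The function names are hypothetical and not part of the codebase, and the escaping rules of the SQL regex engine may differ from Python's.

```python
import re


def naive_match(name, field):
    # Mimics the current behaviour: the name may match anywhere in the field.
    # IGNORECASE is an assumption about how the SQL comparison behaves.
    return re.search(re.escape(name), field, re.IGNORECASE) is not None


def exact_entry_match(name, field):
    # Proposed behaviour: the name must be a whole entry, bounded on each side
    # by either the start/end of the string or a pipe sign.
    pattern = r"(^|\|)" + re.escape(name) + r"($|\|)"
    return re.search(pattern, field, re.IGNORECASE) is not None


if __name__ == "__main__":
    forethought = "Forethought Foundation for Global Priorities Research"
    print(naive_match("Ought", forethought))        # false positive today
    print(exact_entry_match("Ought", forethought))  # rejected by the anchored pattern
    print(exact_entry_match("Ought", "Ought|" + forethought))
```

If a single pattern like this carries over to the SQL side, the separate cases would not be needed; that still has to be checked against the database's regex dialect.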

riceissa commented 4 years ago

Here are some test cases I played around with: https://gist.github.com/riceissa/09638576b30132c6c7a3df83ec95ca14