pelias / model

Pelias data models
6 stars 17 forks source link

fix(addParent): support any string value for abbreviations, use `null` for all others #109

Closed orangejulius closed 6 years ago

orangejulius commented 6 years ago

For some time, we have used null values to ensure Elasticsearch properly handles the optional abbreviation value in parent records.

As a result, we have been very strict in what we allow for values of the abbreviation field. Currently, the Javascript value undefined is converted to null (Elasticsearch has null but not undefined), and passing null in is forbidden.

However, this means that if we wish to take a value that has been already added with addParent and "re-add" it (perhaps after some processing, as in pelias/wof-admin-lookup#232), the model code will throw a validation error.

This PR changes addParent to accept any value with typeof string. All others are stored as nulls. After testing of importing around 5 million OSM records from around the world, no numeric abbreviation values was found, so this should cover all cases well.

As it relates to pelias/wof-admin-lookup#232, this fixes a problem where many many otherwise valid records will be skipped with postal cities admin lookup is enabled.

missinglink commented 6 years ago

how would you feel about making it even looser? like 'string' !== typeof loose?

orangejulius commented 6 years ago

I'm running a test now to see if we ever use anything except string values for abbreviation. If we don't see any errors, I'm happy to make that change.

orangejulius commented 6 years ago

Okay, after some testing of importing OSM records from California, Spain, and Mexico, I feel confident accepting any value with type string, and converting all others to null. The code has been changed and I've updated the PR title and description. @missinglink please review :)