globalgov / manypkgs

Support for creating new manyverse packages
https://globalgov.github.io/manypkgs/
GNU Affero General Public License v3.0

Better standardise treaty titles #91

Closed: jaeltan closed this issue 1 year ago

jaeltan commented 1 year ago

Add matches for EC and EFTA to countryregex and improve standardise_titles() so that state names in treaty titles are standardised and matched consistently, reducing errors in manyID generation.
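
To make the idea concrete, here is a minimal sketch of regex-based title standardisation. It is written in Python purely for illustration and is not the package's implementation: the `ENTITY_PATTERNS` table and the `standardise_title` helper are hypothetical stand-ins, and the real countryregex entries may look quite different.

```python
import re

# Hypothetical (label, pattern) pairs for illustration only;
# these are NOT the actual entries in countryregex.
ENTITY_PATTERNS = [
    ("EFTA", r"\bEFTA\b|european free trade association"),
    ("EC", r"\bEC\b|european communit(?:y|ies)"),
]


def standardise_title(title):
    """Replace known name variants with one label and tidy whitespace."""
    out = title
    for label, pattern in ENTITY_PATTERNS:
        # Case-insensitive so "european community" and "European Community"
        # are treated as the same variant.
        out = re.sub(pattern, label, out, flags=re.IGNORECASE)
    # Collapse runs of whitespace so spacing differences do not matter.
    return re.sub(r"\s+", " ", out).strip()


long_form = standardise_title("Agreement between the European Community and  Norway")
short_form = standardise_title("Agreement between the EC and Norway")
```

Under these assumptions, both spellings reduce to the same string, so any ID derived from the standardised title would also match.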

henriquesposito commented 1 year ago

Maybe we should create a function that standardises titles and updates treatyIDs and manyIDs at the database level, to avoid running into the issues we have had so far. That would also make it easier to keep all datasets in a database (usually added at different times) consistent.

@jaeltan what do you think?

henriquesposito commented 1 year ago

We could also implement this at the export data level? So any time a new dataset is added to a database, we verify whether IDs and titles are up to date and, if not, we update them in each dataset.

jaeltan commented 1 year ago

I think implementing it at the export_data step is a good idea. So if we add this step to export_data, we can leave the treaty titles as they are in the dataset and only improve how they are handled when creating the IDs in the final step?
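
A rough sketch of what such an export-time check could look like, again in Python for illustration only. Everything here is an assumption: the `title_key` field, the `make_key` and `refresh_keys` helpers, and the dict-of-lists database layout are hypothetical, and the key is a simple normalised title rather than the real treatyID or manyID logic.

```python
import re


def make_key(title):
    """Build a comparison key from a title without changing the title itself."""
    # Lowercase, turn punctuation into spaces, then collapse whitespace.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def refresh_keys(database):
    """Recompute the key for every record in every dataset.

    `database` maps dataset names to lists of records (dicts with a "Title").
    Returns the number of records whose key was missing or out of date.
    """
    updated = 0
    for records in database.values():
        for record in records:
            new_key = make_key(record["Title"])
            if record.get("title_key") != new_key:
                record["title_key"] = new_key  # stored title stays untouched
                updated += 1
    return updated


db = {
    "A": [{"Title": "Treaty of  Paris, 1951"}],
    "B": [{"Title": "treaty of paris 1951", "title_key": "treaty of paris 1951"}],
}
first_pass = refresh_keys(db)
second_pass = refresh_keys(db)
```

Running the check on every export means datasets added at different times end up with keys computed by the same rules, while the titles stored in each dataset are left as they are.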