Using a canonical list of Latin names (https://github.com/edwardcrompton/appnatur-rhan-cefn/blob/main/web/modules/custom/termau_migrate/data/species.csv) makes sense most of the time, but there are still occasional disambiguation errors, and sometimes the English article is not titled with the Latin name. For example, the closest English article for the apple tree is just wiki/Apple.
So perhaps we need a mechanism for migrating the base data to a yml file, so that it can be edited to fix any anomalies at this stage.
The canonical key would be the Latin name since, as far as I'm aware, every known species has one. We'd then ensure that the yml file also contains the English label for finding the article on English Wikipedia (where the title is likely to be either the Latin or the English name) and the Welsh label for finding the article on Welsh Wikipedia.
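To make the shape of an entry concrete, here is a minimal Python sketch of one record keyed by Latin name, with helpers that turn a label into a Wikipedia URL. The `SpeciesEntry` class, the helper names and the `Malus domestica` example are assumptions for illustration only; the real migration would live in Drupal and may look quite different.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class SpeciesEntry:
    """One hypothetical record in the yml file, keyed by its Latin name."""

    latin: str
    en: Optional[str] = None  # English Wikipedia title; falls back to the Latin name
    cy: Optional[str] = None  # Welsh Wikipedia title; None until it is known


def wikipedia_url(lang: str, title: str) -> str:
    """Build an article URL; Wikipedia URLs use underscores in place of spaces."""
    return f"https://{lang}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}"


def english_url(entry: SpeciesEntry) -> str:
    """Use the English label when one has been set, otherwise the Latin name."""
    return wikipedia_url("en", entry.en or entry.latin)


def welsh_url(entry: SpeciesEntry) -> Optional[str]:
    """Return None rather than guessing when no Welsh label has been set."""
    return wikipedia_url("cy", entry.cy) if entry.cy else None
```

With this shape, the apple case becomes a one-field correction: `SpeciesEntry(latin="Malus domestica", en="Apple")` points the English lookup at wiki/Apple while the Latin key stays unchanged.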
Workflow for the migration would then be:
1. Run the base migration to populate a table of English and Welsh Wikipedia links (the base table).
2. Run the English migration to create nodes in Drupal.
3. Identify incorrect nodes and fix them by editing the base table.
4. Run the English migration again and repeat until it looks right.
5. Run the Welsh migration.
6. Repeat the fixes to the base table for the Welsh migration.
7. Run the Welsh migration again and repeat until it looks right.
Once everything is correct, we'd need to export the base table to code in order to preserve it. The manual process could be made easier by replacing the base table with a yaml file that can be edited in the IDE and saved to the repository. The file would have to be created automatically by the base migration process.
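As a rough sketch of how such a file might be generated, the Python below reads Latin names from CSV text and renders a yaml document keyed by Latin name. It makes several assumptions that are not confirmed above: that species.csv holds one Latin name in the first column with no header row, that the English label defaults to the Latin name, and that the Welsh label starts out empty. The function names and the `overrides` argument are hypothetical. Values are quoted with `json.dumps`, which relies on JSON strings being valid yaml double-quoted scalars and avoids a third-party yaml library.

```python
import csv
import io
import json
from typing import Dict, Iterable, List, Optional


def read_latin_names(csv_text: str) -> List[str]:
    """Return the first column of every non-empty row (assumed to be the Latin name)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in reader if row and row[0].strip()]


def render_species_yaml(
    latin_names: Iterable[str],
    overrides: Optional[Dict[str, Dict[str, Optional[str]]]] = None,
) -> str:
    """Render one yaml mapping per species, sorted by Latin name.

    `overrides` carries manual corrections, e.g. {"Malus domestica": {"en": "Apple"}}.
    """
    overrides = overrides or {}
    lines: List[str] = []
    for latin in sorted(set(latin_names)):
        # Assumed defaults: English label is the Latin name, Welsh label is unknown.
        labels: Dict[str, Optional[str]] = {"en": latin, "cy": None}
        labels.update(overrides.get(latin, {}))
        lines.append(f"{json.dumps(latin, ensure_ascii=False)}:")
        for lang in ("en", "cy"):
            # json.dumps quotes strings safely and renders None as null.
            lines.append(f"  {lang}: {json.dumps(labels[lang], ensure_ascii=False)}")
    return "\n".join(lines) + "\n"
```

Sorting the keys keeps the generated file stable between runs, so a manual fix shows up as a small diff in the repository rather than a reshuffle of the whole file.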