In summary, the current implementation of this project splits the dataset randomly. However, it is much better to follow the philosophy of SPAIC: Each Raspberry Pi is a device connected to the Internet and each user can add their local names to the dataset. In that way, the privacy of such local contributions is preserved and at the same time, the machine learning model is enriched with new names from other parts of the world.
In summary, the current implementation of this project splits the dataset randomly. However, it is much better to follow the philosophy of SPAIC: Each Raspberry Pi is a device connected to the Internet and each user can add their local names to the dataset. In that way, the privacy of such local contributions is preserved and at the same time, the machine learning model is enriched with new names from other parts of the world.