NaNoGenMo / 2016

National Novel Generation Month, 2016 edition.
https://nanogenmo.github.io

Bicycle Book #91

Open greyrainyskies opened 7 years ago

greyrainyskies commented 7 years ago

I'm trying to make a book generator that uses open data about the use of bicycle routes in Helsinki. Hoping to make something that lists the names of everyone passing the measuring point during a single day. So I'm also playing with this list of first names in Finland.

This is my first time participating in NaNoGenMo and I'm using this as an opportunity to learn the basics of Python.
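As a rough illustration of the idea (not the author's actual generator), here is a minimal Python sketch that draws one name per cyclist counted at a measuring point. The hourly counts, the names, and the weights are all made-up placeholders; the real open data and name list would need their own loading code.

```python
import random


def names_for_day(hourly_counts, names, weights, seed=None):
    """Return one list of names per hour, with one name per counted cyclist."""
    rng = random.Random(seed)
    # random.choices samples with replacement, so common names can repeat.
    return [rng.choices(names, weights=weights, k=count) for count in hourly_counts]


# Hypothetical inputs: three hours of counts and a tiny weighted name list.
hourly_counts = [3, 0, 5]
names = ["Tuula", "Juha", "Ikraam"]
weights = [50, 45, 5]

day = names_for_day(hourly_counts, names, weights, seed=2016)
for hour, passers in enumerate(day):
    print(f"{hour:02d}:00", ", ".join(passers))
```

Passing a seed makes the "book" reproducible; leaving it out gives a different day each run.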

hugovk commented 7 years ago

About 14% of Helsinki's population are of foreign origin, so perhaps some Estonian, Russian, English, Somali etc. names could also be included proportionally.

http://www.hel.fi/www/uutiset/en/tietokeskus/helsinki-is-home-to-over-one-fourth -> http://www.hel.fi/hel2/tietokeskus/julkaisut/pdf/16_01_15_Tilastoja_2_Hiekkavuo_Haapamaki_Ranto_Salorinne.pdf
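One possible way to include such names proportionally is a two-stage draw: first decide which pool to use, then pick a name from it. The sketch below assumes a single "foreign origin" pool with a 14% share; the pool contents and the function name `pick_name` are placeholders for illustration, and a fuller version could split the share by language group.

```python
import random

FOREIGN_SHARE = 0.14  # share of foreign-origin residents quoted above


def pick_name(rng, finnish_names, foreign_names, foreign_share=FOREIGN_SHARE):
    """Pick a pool with probability foreign_share, then a name from that pool."""
    pool = foreign_names if rng.random() < foreign_share else finnish_names
    return rng.choice(pool)


# Placeholder pools; real lists would come from the name statistics.
finnish_names = ["Tuula", "Juha", "Anneli"]
foreign_names = ["Ikraam", "Olga", "Kristjan"]

rng = random.Random(91)
sample = [pick_name(rng, finnish_names, foreign_names) for _ in range(10_000)]
foreign_fraction = sum(name in foreign_names for name in sample) / len(sample)
print(f"foreign-origin share in sample: {foreign_fraction:.1%}")
```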

greyrainyskies commented 7 years ago

Great note. I initially thought that the list would make up for that a bit, as it does contain names of foreign-born Finnish citizens. My algorithm also boosts the probability of the less common names a tad, but even then the probability of, for example, Tuula is 0.59% whereas Ikraam is just 0.002%.
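The thread does not say how the boost works. One common approach, shown here purely as an assumption, is to raise each name's count to a power below 1 before normalising, which flattens the distribution without changing the ranking. The counts and the exponent are invented for the example.

```python
def boosted_probabilities(counts, exponent=0.5):
    """Dampen raw counts with a power below 1 so rare names gain probability."""
    damped = {name: count ** exponent for name, count in counts.items()}
    total = sum(damped.values())
    return {name: value / total for name, value in damped.items()}


# Made-up counts, not the real figures from the Finnish name list.
counts = {"Tuula": 5900, "Ikraam": 20}

raw_total = sum(counts.values())
raw = {name: count / raw_total for name, count in counts.items()}
boosted = boosted_probabilities(counts)

for name in counts:
    print(f"{name}: raw {raw[name]:.4f}, boosted {boosted[name]:.4f}")
```

An exponent of 1 leaves the probabilities unchanged, and values closer to 0 push the mix toward uniform.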

Adding names based on mother tongues would probably be the way to get a more realistic mix. Just wondering where to find good lists of names for different languages...

hugovk commented 7 years ago

I guess you won't need as big or definitive lists for those.

If going by country, and looking at figure 9 in the PDF, https://en.wikipedia.org/wiki/List_of_most_popular_given_names has top tens for most of them; follow the cites for longer lists. Chinese: http://www.chinawhisper.com/top-50-most-common-chinese-names/ Somali: http://www.somalinames.com/index.php/popular