CarlKCarlK opened this issue 2 years ago (status: Open)
Good news -- I've found three sources of free information that I think will give us what we need:
@luillo1, I know little about databases but a lot about processing text CSV files, so how about I write a utility program that merges these hundreds of files into one reasonable CSV? It will have a few hundred thousand rows and fewer than a dozen columns. In the short term, MemberMatch can use this one CSV. In the medium term, you can move it into a database if you want.
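A merge utility like this could be sketched roughly as follows. It is only a sketch: the thread does not say how the source files are laid out, so the `name` and `count` column names, the folder layout, and the function name `merge_name_counts` are all hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def merge_name_counts(input_dir, output_path):
    """Sum per-name counts across every CSV in input_dir and write one CSV."""
    totals = Counter()
    # sorted() lists the input files up front, before any output is written.
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                # Hypothetical column names: "name" and "count".
                key = row["name"].strip().upper()
                totals[key] += int(row["count"])

    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "count"])
        for name, count in sorted(totals.items()):
            writer.writerow([name, count])
    return totals
```

Names are upper-cased so that the same name spelled with different capitalization in different files is counted once. Writing the merged file outside `input_dir` keeps a second run from reading its own output.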
I saw @satvu yesterday at ESR Track and she said she found these sources, too.
I've created a tab-separated file that associates 250K names with their (approximate) probability.
@luillo1, @MutatedGamer, and @satvu: the file is 5 MB. Is this OK to check in for now, or is it too big?
I imagine that you'll eventually want it converted to a serialized binary C# dictionary or a database table, but for now it would be useful to have it checked in. It would let me update the MemberMatch functions using this data.
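For illustration, here is roughly how the tab-separated file could be loaded into an in-memory dictionary and queried. This is a sketch in Python rather than C#, and it assumes a two-column layout (name, then probability). The function names and the fallback probability for unseen names are hypothetical, not something MemberMatch defines.

```python
import csv


def load_name_probabilities(tsv_path):
    """Read a two-column TSV (name, probability) into a dict."""
    probabilities = {}
    with open(tsv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            try:
                probabilities[row[0].strip().upper()] = float(row[1])
            except ValueError:
                continue  # skip a header row or a non-numeric value
    return probabilities


def name_probability(probabilities, name, default=1e-6):
    """Look up a name; fall back to an assumed small default if unseen."""
    return probabilities.get(name.strip().upper(), default)
```

A dictionary lookup like this keeps each query constant-time, which is the same property a serialized C# dictionary or an indexed database table would provide later.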
The MemberMatch feature needs, as input, the (approximate) frequency of each member's first name(s) and last name(s). Some alternatives: