Open bmschmidt opened 6 years ago
Note that there are a variety of labels for the founding date. I wrote an alternative Python script that scrapes Wikipedia (can share it if useful) and some of the labels that I found include:
'founded', 'incorporated', 'settled', 'established', 'platted', 'chartered', ...
Yeah, I also noticed 'platted' and saw this would require a bit of text mining to build up a list.
I imagine this would have to be a hierarchy of regular expressions. A "founded" date is generally better than an "incorporated" one. But I don't know how to resolve some of them: 'settled' is vague (it also may include populations not counted by the census, like native settlements and pre-cession Mexican towns/missions, which are treated a little ambiguously by this repository.)
Would love to see your Python script.
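For what it's worth, here is a minimal sketch of what a hierarchy of regular expressions could look like. Only the "founded" before "incorporated" ordering comes from this thread; the rest of the priority list, the 40-character window, the year range, and the names `LABEL_PRIORITY` and `best_date` are assumptions made for illustration. Where 'settled' belongs is still an open question, so its position here is arbitrary.

```python
import re

# Labels in descending priority. ASSUMPTION: only "founded" ranking above
# "incorporated" is stated in the thread; the other positions are guesses.
LABEL_PRIORITY = [
    "founded",
    "established",
    "platted",
    "chartered",
    "incorporated",
    "settled",  # ambiguous label, placed last arbitrarily
]

# A four-digit year between 1500 and 2099 (assumed range).
YEAR = r"(1[5-9]\d{2}|20\d{2})"

# One pattern per label: the label, then up to 40 characters within the
# same sentence, then a year.
PATTERNS = [
    (label, re.compile(rf"\b{label}\b[^.\n]{{0,40}}?\b{YEAR}\b", re.IGNORECASE))
    for label in LABEL_PRIORITY
]


def best_date(text):
    """Return (label, year) for the highest-priority label found, else None."""
    for label, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return label, int(match.group(1))
    return None


if __name__ == "__main__":
    sample = "Incorporated in 1854. The town was founded by settlers in 1836."
    print(best_date(sample))
```

Trying the labels in order means a lower-priority date is only used when no higher-priority one is present, so reordering the list is all it takes to change the policy.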
Wikipedia frequently includes the establishment date for a city, either as structured text or in the first few paragraphs. This would be useful for thematic mapping. If you make a map of Georgia in 1836, you would like to use some of the 1840 cities, but others may not have been established until the end of the decade.
This is especially important when trying to compare city locations against other variables, e.g., railroads or Indian land cessions.