nmontalva / ccaa-surnames

Code developed to analyse surnames data as part of Project Fondecyt N°11160402

Failing to parse some lists #1

Open nmontalva opened 6 years ago

nmontalva commented 6 years ago

While comparing my list of coordinates against commoners.csv, I noticed that some communities are missing. Some of these have no list of commoners available in OTCA, but others do have a list and should be parsed:

No list available:

Not parsing, although a list is available:
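For reference, the comparison between the coordinates list and commoners.csv can be sketched as a set difference on normalised community names. This is only a minimal Python sketch, not the project's code: the `community` column name, the `normalize` helper and the inline sample rows are assumptions made for illustration, and the real files may use different headers.

```python
import csv
import io


def normalize(name):
    """Lowercase and collapse whitespace so spelling variants compare equal."""
    return " ".join(name.lower().split())


def missing_communities(coords_file, commoners_file, column="community"):
    """Return names found in the coordinates list but absent from the parsed CSV.

    `column` is a hypothetical header name; adjust it to the real files.
    """
    expected = {normalize(row[column]) for row in csv.DictReader(coords_file)}
    parsed = {normalize(row[column]) for row in csv.DictReader(commoners_file)}
    return sorted(expected - parsed)


# Illustrative in-memory data only; real use would pass open file handles.
coords = io.StringIO("community\nCerro Blanco y Gigante\nPotrero Alto\n")
commoners = io.StringIO("community,surname\nPotrero Alto,Example\n")
print(missing_communities(coords, commoners))
```

Normalising before comparing avoids false positives caused only by case or spacing differences, such as "El potrero" versus "El Potrero".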

nmontalva commented 6 years ago

I can confirm that "Cerro Blanco y Gigante" and "El Espinal de San Pedro" are correctly downloaded by download-pdfs.R. I do not yet know why they are not parsed.

There seems to be an issue with "El Potrero". There are two communities with that name (one in Choapa and one in Elqui) and I think we discussed that back then with @rhz. download-pdfs.R does not download any file "El potrero", but it does download two files "El Potrero Alto": one from Elqui and one from Choapa.

According to Vergara 2005, which we treat as the official source by convention, "Potrero alto" is the "potrero" in Elqui, and "Potrero" is in Choapa.

Although both files have the same name, the path created by download-pdfs.R should be sufficient to distinguish "Potrero Alto" (Elqui) from "Potrero" (Choapa). However, the generated commoners.csv contains only "Potrero Alto" from Elqui. The Choapa community is missing entirely, not even under the name "Potrero Alto".
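One way to check whether the two files collapse into a single entry is to key each community on (province, name) instead of on the name alone. The following is a minimal Python sketch under stated assumptions: the `pdfs/<province>/<community>.pdf` layout and the helper names are hypothetical and do not describe what download-pdfs.R or the parser actually do.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def key_from_path(path):
    """Derive a (province, community) key from an assumed path layout
    such as 'pdfs/Elqui/Potrero Alto.pdf'."""
    p = PurePosixPath(path)
    return (p.parent.name, p.stem)


def ambiguous_names(paths):
    """Map each community name found under more than one province to its provinces."""
    provinces = defaultdict(set)
    for path in paths:
        province, community = key_from_path(path)
        provinces[community].add(province)
    return {name: sorted(p) for name, p in provinces.items() if len(p) > 1}


def unparsed_keys(paths, parsed_keys):
    """Return (province, community) keys that were downloaded but never parsed."""
    return sorted(set(map(key_from_path, paths)) - set(parsed_keys))


# Illustrative values taken from the situation described above.
paths = ["pdfs/Elqui/Potrero Alto.pdf", "pdfs/Choapa/Potrero Alto.pdf"]
parsed = [("Elqui", "Potrero Alto")]
print(ambiguous_names(paths))
print(unparsed_keys(paths, parsed))
```

If the parser indexes its output by community name only, the second file would overwrite or be skipped in favour of the first, which would match the symptom seen in commoners.csv. That is a hypothesis to test, not a confirmed cause.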