kbrbe / beltrans-data-integration

Creating a FAIR Linked Data corpus for the BELTRANS research project about Belgian book translations NL-FR and FR-NL between 1970 and 2020
https://www.kbr.be/en/projects/beltrans/
MIT License

Geonames enrichment removes country if no match is found #252

Closed SvenLieber closed 4 months ago

SvenLieber commented 7 months ago

While working on #251 I noticed that the geonames enrichment script removes a country if no match is found. See example below:

Countries are removed in the file on the right, so when that file is used to create RDF representations, the country is missing. If the file is only used to add geonames identifiers and coordinates to already existing RDF (that is, if the file on the left is the result of a SPARQL query), nothing breaks, because the country URIs and labels are already part of the RDF.

[image: side-by-side comparison of the two files]

SvenLieber commented 7 months ago

The parameter --column-country specifies the name of the column in the output file in which the country information that was found is stored. However, if I provide the name of an existing country column that already contains values, those values are overwritten.

So technically I should provide a new column name such as found-country, but this is apparently not intuitive. If the specified country column already exists, the script should keep the value it already has.
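The intended behaviour could look roughly like the sketch below. This is not the actual enrichment script: the function `fill_country`, the `lookup_country` callback, and the `KNOWN_PLACES` table are hypothetical names used only for illustration. It assumes rows are dictionaries (as produced by `csv.DictReader`) and that a found match still replaces an existing value, which is a design choice this thread does not settle.

```python
def fill_country(row, country_column, lookup_country):
    """Return a copy of `row` whose country column survives a failed lookup.

    `lookup_country` is a hypothetical callback that returns a country
    name, or None when no geonames match is found.
    """
    updated = dict(row)  # copy, so the caller's row is not mutated
    existing = (row.get(country_column) or "").strip()
    found = lookup_country(row)
    # Use the match when there is one; otherwise keep the existing value
    # instead of overwriting it with an empty string.
    updated[country_column] = found if found else existing
    return updated


# Hypothetical lookup table standing in for a real geonames match.
KNOWN_PLACES = {"Brussels": "Belgium"}


def lookup_country(row):
    return KNOWN_PLACES.get(row.get("place", ""))


rows = [
    {"place": "Brussels", "country": ""},            # match found, column filled
    {"place": "Unknownville", "country": "France"},  # no match, value kept
]
result = [fill_country(r, "country", lookup_country) for r in rows]
```

With this approach, passing the name of an existing column to --column-country would no longer blank out values for rows without a match, and a separate column such as found-country would only be needed to keep both the original and the matched value.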