dbpedia / extraction-framework

The software used to extract structured data from Wikipedia

Support different calendar systems #761

Open jimkont opened 3 months ago

jimkont commented 3 months ago

DBpedia should support multiple calendars besides the Gregorian, which is currently the default.

On the mapping side, this could be done with special mapping directives in the mappings wiki. On the representation side, we would need to define a way to differentiate between different calendar systems (with different properties or datatypes), or perform a conversion on the fly, similar to how units of measurement are parsed.
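As a rough illustration of the two representation options, here is a minimal Python sketch. It is not part of the extraction framework: `CalendarDate`, `julian_to_jdn`, and `to_gregorian` are hypothetical names, and only the Julian calendar is handled. The original value keeps a calendar tag (the "different datatypes" option), and a Gregorian value is derived on the fly through the Julian Day Number, much like a unit conversion.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CalendarDate:
    """Hypothetical container: a date plus the calendar it was written in."""
    calendar: str  # e.g. "gregorian" or "julian" (labels are assumptions)
    year: int
    month: int
    day: int


# Julian Day Number of proleptic Gregorian 0001-01-01 is 1721426, and
# date.toordinal() returns 1 for that day, so JDN = ordinal + 1721425.
JDN_OFFSET = 1721425


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number for a date in the Julian calendar (integer arithmetic)."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def to_gregorian(d: CalendarDate) -> date:
    """Convert a tagged date to a proleptic Gregorian datetime.date."""
    if d.calendar == "gregorian":
        return date(d.year, d.month, d.day)
    if d.calendar == "julian":
        return date.fromordinal(julian_to_jdn(d.year, d.month, d.day) - JDN_OFFSET)
    # Other calendars (Hebrew, Ethiopian, ...) would need their own rules.
    raise ValueError(f"unsupported calendar: {d.calendar}")


# The original value is preserved alongside the derived Gregorian one.
source = CalendarDate("julian", 1917, 10, 25)
print(source, "->", to_gregorian(source).isoformat())  # -> 1917-11-07
```

Keeping the tagged source value next to the converted one means no information is lost when a conversion is unavailable or ambiguous.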

_Originally posted by @TallTed in https://github.com/dbpedia/extraction-framework/pull/759#discussion_r1715888840_

It seems to me that this PR reveals a significant flaw in a number of systems. There ought to be a way to express a date in any calendar, which some systems might offer to translate to one or more other calendars, including Gregorian, Amharic, Hebrew, etc. Is translation among some of these easy? Lots of systems will support them. Is translation among some of these hard? Fewer systems will support them, or there will be a few libraries produced to handle such translations.

Net of all this -- I think there ought to be an issue about non-Gregorian calendar ingestion and preservation. This is not too different from the (ongoing) efforts to losslessly handle data using multiple geocoordinate systems for Earth plus other celestial bodies (Moon, Mars, etc.).