EEXCESS / PartnerWizard

The PartnerWizard is used to build new PartnerRecommenders for the EEXCESS Framework. It consists of the archetype, the webapp, and a part for improving query generation for the PartnerRecommender.

[Data Mapping] Complex Date Time field mapped strangely in Europeana #1

Open mgrani opened 9 years ago

mgrani commented 9 years ago

Some date time objects in Europeana seem to be rather complex, e.g. http://europeana.eu/portal/record/2022041/10848_C5081F31_FD9A_4E63_8213_2DA9CA1578FD.html

The result on the client is

"date":"1911191219131914"

This is quite unparseable. Is there a way to parse the data in the data mapping and harmonize it somehow?

Also, dates/times come in different formats, which will be hard to convert client-side. For example, this ZBW Resource uses yyyy-MM, whereas most others use only yyyy. So is there a way to tell which date formats are provided? Do you check this for quality control?
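To make the two cases above concrete, here is a minimal sketch of what such a harmonization step might look like. It is written in Python for brevity and is not the PartnerWizard mapping code. The function names `split_concatenated_years` and `normalize_date` are hypothetical, and it assumes the Europeana value is a run of four-digit years glued together, which is only a guess based on the single example above.

```python
import re

def split_concatenated_years(value):
    """Split a digit-only string such as "1911191219131914" into 4-digit years.

    Returns an empty list if the value is not a run of plausible years.
    The 1000..2100 range is an arbitrary assumption for illustration.
    """
    if not re.fullmatch(r"(?:[0-9]{4})+", value):
        return []
    years = [int(value[i:i + 4]) for i in range(0, len(value), 4)]
    if all(1000 <= year <= 2100 for year in years):
        return years
    return []

def normalize_date(value):
    """Return (year, month) for "yyyy" or "yyyy-MM" input, else None.

    month is None when only a year is given.
    """
    match = re.fullmatch(r"([0-9]{4})(?:-([0-9]{2}))?", value.strip())
    if match is None:
        return None
    year = int(match.group(1))
    month = int(match.group(2)) if match.group(2) else None
    if month is not None and not 1 <= month <= 12:
        return None
    return (year, month)

print(split_concatenated_years("1911191219131914"))
print(normalize_date("1967-03"))
```

Returning a list of years (or a year/month pair) rather than a single string would let the client decide whether to show a range or a single date.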

mgrani commented 9 years ago

In addition, dates on KIM are not mapped properly, mostly because they do not follow a particular date format. Examples are "Frühling 1967" and "18906" (?). Would it be possible to do some fuzzy date matching?

mgrani commented 9 years ago

It seems that with the KIM Portal there is no date mapping at all: the object Zeichnung, Napoleon am Lagerfeuer has a correct date, judging from the web portal, but EEXCESS returns unknown.

jr-dig-orgel commented 9 years ago

@KIMportal: I had a look at this. The problem is that the field datierung is a free-text field, so it can also contain values like "1855 ca." https://www.kgportal.bl.ch/sammlungen#44a001e9-12dc-47f3-9f1f-ce43ab901651

mgrani commented 9 years ago

Maybe use HeidelTime to parse the string field as some kind of fuzzy date mapping? Alternatively, you could try to strip away some of the most frequently occurring non-date text parts. However, HeidelTime would be a better solution.

P.S.: HeidelTime was recommended by Christopher Manning, one of the most renowned researchers in NLP.
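For comparison, the simpler alternative (stripping frequent non-date text) could be sketched roughly as follows. This is not HeidelTime and not the actual mapping code. The function name `parse_fuzzy_year` and the `NOISE_WORDS` list are hypothetical, and the list is derived only from the examples in this thread ("Frühling 1967", "1855 ca."), so a real list would have to come from the actual data.

```python
import re
import unicodedata

# Hypothetical noise list: qualifiers and German season names seen or implied
# by the examples in this thread. A real list would be built from the data.
NOISE_WORDS = {"ca", "circa", "um", "frühling", "sommer", "herbst", "winter"}

def parse_fuzzy_year(text):
    """Strip known non-date words and return the remaining 4-digit year.

    Returns None if anything other than exactly one 4-digit year is left,
    so unexpected values stay unmapped instead of being guessed.
    """
    # Normalize so that "ü" is one code point and matches the noise list.
    normalized = unicodedata.normalize("NFC", text).lower()
    tokens = re.findall(r"\w+", normalized)
    remaining = [token for token in tokens if token not in NOISE_WORDS]
    if len(remaining) == 1 and re.fullmatch(r"[0-9]{4}", remaining[0]):
        return int(remaining[0])
    return None

for sample in ["Frühling 1967", "1855 ca.", "18906"]:
    print(sample, "->", parse_fuzzy_year(sample))
```

The sketch deliberately refuses values such as "18906" rather than guessing, which illustrates why a temporal tagger would likely cover more cases than a hand-maintained word list.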