Closed by fako 1 year ago
Line of code to check the counts for successful L4L date extraction:
Document.objects.filter(dataset_version=dv, collection__name="l4l").exclude(properties__publisher_date=None).count()
The number of documents with a publisher_date in the L4L set went from 1 to 720.
Pivotal
The L4L XML data does not contain a "publisher" field in the "contribute" section, so the harvester cannot determine a publication date (https://github.com/surfedushare/search-portal/blob/acceptance/harvester/edurep/extraction.py#L231-L242).
The solution is to first check for a publisher field and, if none exists, fall back to looking for a date elsewhere in the contribute field.
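A minimal sketch of that fallback, not the harvester's actual code. It assumes a simplified, namespace-free record where each `contribute` element holds a `role/value` and a `date/dateTime` child; the function name `get_publisher_date` and these element paths are hypothetical and only illustrate the order of the checks.

```python
import xml.etree.ElementTree as ET


def _contribute_date(contribute):
    """Return the stripped date text of one contribute element, or ''."""
    # Assumed path: <contribute><date><dateTime>...</dateTime></date></contribute>
    return (contribute.findtext("date/dateTime") or "").strip()


def get_publisher_date(record_xml):
    """Return a publication date string, or None when no date is found."""
    root = ET.fromstring(record_xml)
    contributes = root.findall(".//contribute")

    # First preference: the date of a contribute element with role "publisher".
    for contribute in contributes:
        role = (contribute.findtext("role/value") or "").strip()
        if role == "publisher":
            date = _contribute_date(contribute)
            if date:
                return date

    # Fallback: no usable publisher entry, so take the first date
    # found in any contribute element.
    for contribute in contributes:
        date = _contribute_date(contribute)
        if date:
            return date

    return None
```

With a record that only has an author contribution, the function returns that contribution's date instead of `None`, which is the behaviour the count query above is meant to confirm.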