Open gothub opened 5 years ago
Analysis of the title should cover topic (what), location (where), and dates (when). The "who"/creator would not be included in the title.
The current check in ESS-DIVE and Arctic Data Center is for between 7 and 20 words. To be clear, the description here should state the exact number of words to look for. Will it be the same here?
Agreed @JEDamerow -- I added the word count to the description, and let's discuss here what the appropriate checks are. We've had some feedback that 20 words is too low for the max. In general, the title should be conveniently citable and recognizable as a title. More extended details should go in the abstract. A hard limit is of course arbitrary. We know that short titles are generally inadequate to describe the content and context of datasets. Maybe longer titles are acceptable? Would a 400 word title be acceptable? 200? 100? 50? 30? 20?
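For discussion, here is a minimal sketch of the word-count part of the check. The function name `title_word_count_ok` is hypothetical, and the defaults of 7 and 20 are only the ESS-DIVE / Arctic Data Center values mentioned above, not a decision; the bounds are parameters so a higher maximum can be tried.

```python
def title_word_count_ok(title: str, min_words: int = 7, max_words: int = 20) -> bool:
    """Return True if the title has between min_words and max_words words, inclusive.

    Words are counted by splitting on whitespace, which is an assumption;
    the existing checks may tokenize differently.
    """
    word_count = len(title.split())
    return min_words <= word_count <= max_words


# A three-word title fails the lower bound; raising max_words relaxes the upper bound.
short_title = "Soil temperature data"
long_title = " ".join(["word"] * 25)
print(title_word_count_ok(short_title))
print(title_word_count_ok(long_title, max_words=30))
```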
To check the title for "what", "when", and "where", this check could extract other fields from the metadata and check whether they appear in the title. For example, in EML:
- <temporalCoverage>, check if any elements are in the title
- <abstract>, remove prepositions ("of", "at", ...), articles ("the"), ... and check for matches in the title
- <geographicDescription>, clean up and check for matches

Use similar fields for the other metadata dialects.
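A rough sketch of the field-matching idea described above, assuming the temporal coverage values, abstract, and geographic description have already been extracted from the metadata as plain strings. The names `content_words`, `years`, and `title_matches`, the stopword list, and the choice to compare only four-digit years for "when" are all assumptions for illustration, not part of any existing check.

```python
import re

# Illustrative stopword list (prepositions, articles, conjunctions); not exhaustive.
STOPWORDS = {"a", "an", "the", "of", "at", "in", "on", "for", "from", "to", "and", "by", "with"}


def content_words(text: str) -> set[str]:
    """Lowercase the text, split it into alphanumeric words, and drop stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def years(text: str) -> set[str]:
    """Return the standalone four-digit numbers in the text, treated as years."""
    return set(re.findall(r"\b\d{4}\b", text))


def title_matches(title: str, temporal_values: list[str], abstract: str,
                  geographic_description: str) -> dict[str, set[str]]:
    """Report which title words overlap with each metadata field.

    An empty set for a key means no evidence of that aspect was found in the title.
    """
    title_words = content_words(title)
    temporal_years: set[str] = set()
    for value in temporal_values:
        temporal_years |= years(value)
    return {
        "what": title_words & content_words(abstract),
        "when": years(title) & temporal_years,
        "where": title_words & content_words(geographic_description),
    }


result = title_matches(
    title="Soil temperature at Barrow, Alaska, 2015-2018",
    temporal_values=["2015-06-01", "2018-09-30"],
    abstract="Measurements of the soil temperature collected at several depths.",
    geographic_description="Barrow Environmental Observatory, Alaska",
)
print(result)
```

Simple word overlap with the abstract will match almost any reasonable title, so a real check would probably need a threshold or a better notion of topic words.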
The target for this check exists in the DataCite Dialect: data[*].attributes.titles[*].title. It is currently implemented as dataset.title.present.
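As a sketch of how that path could be walked, assuming the DataCite record has been parsed from JSON into Python dicts and lists shaped like data[*].attributes.titles[*].title. The function names `datacite_titles` and `title_present` are hypothetical, and accepting a single record object under `data` is an added assumption.

```python
def datacite_titles(document: dict) -> list[str]:
    """Collect every non-empty title found at data[*].attributes.titles[*].title."""
    data = document.get("data", [])
    if isinstance(data, dict):
        # Assumption: tolerate a single record object as well as a list of records.
        data = [data]
    titles = []
    for record in data:
        for entry in record.get("attributes", {}).get("titles", []):
            title = entry.get("title")
            if isinstance(title, str) and title.strip():
                titles.append(title.strip())
    return titles


def title_present(document: dict) -> bool:
    """Return True if at least one non-empty title exists in the document."""
    return bool(datacite_titles(document))


example = {
    "data": [
        {"attributes": {"titles": [{"title": "Soil temperature at Barrow, Alaska, 2015-2018"}]}},
        {"attributes": {"titles": [{"title": "   "}]}},
    ]
}
print(datacite_titles(example))
print(title_present(example))
```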
Description
Check if the dataset title has an appropriate length and contains needed information to understand the dataset content and context
Priority
Choose a priority for the FAIR suite (Required or Optional)
Issues
- what
- where
- when
Procedure