ec-melodies / melodies-all

Community Central for the Melodies project partners
https://github.com/ec-melodies/melodies-all/wiki

Dataset catalogue metadata (DCAT etc.) #6

Open letmaik opened 9 years ago

letmaik commented 9 years ago

This discussion is about which ontologies/standards to use for creating and publishing dataset catalogue metadata. (Subdiscussion of #3)

Likely ontology candidates are:

A logical step for publishing these metadata and making them discoverable would be to host them somewhere and also ingest them into some catalogue which understands DCAT etc. @HerveCaumont @p3dr0 Is this something you're already working on? I have the impression somehow but don't have any concrete information or resources to look at.

This discussion should also track any issues on that topic, e.g. if some WP has metadata which cannot easily be expressed by the ontologies above but which they would like to have anyway. Note that longer discussions should be handled in separate GitHub issues and linked to this one via the hash syntax (#number).

jonblower commented 9 years ago

A few thoughts/questions on this:

  1. I assume that GeoDCAT-AP is backward-compatible with DCAT (i.e. we don't lose anything by adopting GeoDCAT) - is that right?
  2. I think that Andrea Perego of the JRC (he is on our Project Advisory Board) was involved somehow in GeoDCAT - I can ask him.
  3. It seems to me that DCAT and VoID address slightly different things - DCAT for general datasets, VoID for RDF datasets and links between them. I think we probably need both, but can anyone find successful examples of them being used together?
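
To make the question in point 3 concrete, one way the two vocabularies could sit side by side is to type a single resource as both `dcat:Dataset` and `void:Dataset`. The sketch below builds such a description as JSON-LD using only the Python standard library. The function name and all URLs are hypothetical placeholders, and this is an illustration of the idea rather than a pattern confirmed by any existing catalogue.

```python
import json

# Namespace IRIs of the vocabularies under discussion.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "void": "http://rdfs.org/ns/void#",
}


def describe_rdf_dataset(uri, title, sparql_endpoint, dump_url):
    """Return a JSON-LD dict typing one resource as both a DCAT and a VoID dataset."""
    return {
        "@context": CONTEXT,
        "@id": uri,
        "@type": ["dcat:Dataset", "void:Dataset"],
        "dct:title": title,
        # VoID side: how to reach the RDF itself.
        "void:sparqlEndpoint": {"@id": sparql_endpoint},
        "void:dataDump": {"@id": dump_url},
        # DCAT side: the same dump exposed as a catalogue distribution.
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": dump_url},
        },
    }


# All URLs below are hypothetical placeholders.
record = describe_rdf_dataset(
    uri="http://example.org/dataset/land-cover",
    title="Example land cover dataset",
    sparql_endpoint="http://example.org/sparql",
    dump_url="http://example.org/dumps/land-cover.ttl",
)
print(json.dumps(record, indent=2))
```

A DCAT-only catalogue would read the `dcat:` and `dct:` properties and ignore the rest, while a VoID-aware client could follow `void:sparqlEndpoint`.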

conradbielski commented 9 years ago

This is partly a test to see whether my GitHub account works but also to get into the discussion. What about INSPIRE? They have also done quite a bit of work on such things, I believe. This might be a good starting point: http://inspire.ec.europa.eu/index.cfm/pageid/2/list/datamodels I will also ask the same question here: https://themes.jrc.ec.europa.eu and see if anyone has had more experience with this issue. Unfortunately, I haven't had the time to sift through the INSPIRE conference presentations (http://geospatialworldforum.org/proceedings.html), but something might pop up there as well.

jonblower commented 9 years ago

Hi Conrad, your account is working! Yes, I noted the INSPIRE question under #7. I think it's relevant to both dataset-level modelling and data-level modelling. Andrea Perego would be a good person to involve in this discussion too, what do you think?

conradbielski commented 9 years ago

Yes, I do think we should get Andrea involved at this point.

Sorry, I haven't had a chance to catch up with all the information/posts you and Maik have added since last week. But definitely appreciate the discussion.

ninopace commented 9 years ago

If I may summarise what I understand from these (and previous) conversations, I see that:

  1. DCAT/GeoDCAT/VoID are suitable standards for metadata publishing. They provide resources to eventually access the data (e.g. download, web services, etc.) and are well integrated with search engines.
  2. No standard exists to date for modelling domain-specific data in RDF, but the SmartOpenData project mentioned here (https://github.com/ec-melodies/melodies-all/issues/7#) seems a good candidate for a starting point. I haven't checked to what extent this builds on top of O&M but, being INSPIRE-bound, I guess it does. We should check how it applies to different data types (e.g. well measurements as opposed to protection sites), but I would expect it does.

So, it looks like (1) is suitable for all MELODIES datasets and lets us link all the data we produce in a generalised (and, I would guess, quite straightforward) manner, while (2) is an additional RDF modelling level which goes deep into the data structure (e.g. allowing individual measurements to be queried directly using GeoSPARQL).
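
As a rough idea of what querying at level (2) could look like, here is a minimal sketch that builds a GeoSPARQL query string in Python. It assumes the features are modelled with `geo:hasGeometry` and `geo:asWKT`, which is only one possible modelling choice and not something decided in this thread; the function name and the polygon are hypothetical.

```python
def features_within(wkt_polygon, limit=10):
    """Build a GeoSPARQL query for features whose geometry lies within a polygon."""
    return f"""
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?feature ?wkt WHERE {{
  ?feature geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt, "{wkt_polygon}"^^geo:wktLiteral))
}}
LIMIT {limit}
""".strip()


# Hypothetical search area given as WKT.
query = features_within("POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))", limit=5)
print(query)
```

The resulting string could be sent to any endpoint that supports GeoSPARQL; nothing here depends on a particular triple store.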

Maybe for raster-oriented data (like those produced in WP6) (1) is sufficient, as long as it provides the resources to directly access the data (e.g. WCS), while for vector data we can add the further RDF description using an approach similar to SmartOpenData.
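
For the raster case, a sketch of what "providing the resources to access the data" might mean in DCAT terms: a dataset with one distribution whose `dcat:accessURL` points at a WCS GetCapabilities request. The function name, dataset URI and WCS endpoint are hypothetical, and whether a GetCapabilities URL is the best thing to put in `dcat:accessURL` is an assumption of this example.

```python
import json
from urllib.parse import urlencode


def wcs_dataset(dataset_uri, title, wcs_endpoint):
    """Return a JSON-LD dict for a dataset reachable through a WCS endpoint."""
    # Standard key-value parameters for a WCS capabilities request.
    params = urlencode({"service": "WCS", "request": "GetCapabilities"})
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dct:title": "WCS access",
            "dcat:accessURL": {"@id": f"{wcs_endpoint}?{params}"},
        },
    }


# Hypothetical URIs, for illustration only.
raster = wcs_dataset(
    "http://example.org/dataset/raster-product",
    "Example raster product",
    "http://example.org/wcs",
)
print(json.dumps(raster, indent=2))
```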

I would agree to try and contact Andrea Perego, to hear about the existing links and overlaps between the two models.

jonblower commented 9 years ago

Hi Nino - yes, I agree with your points. I'm not sure to what extent the INSPIRE model builds upon O&M but I'll ask Andrea. Perhaps he can join this discussion. In my experience the INSPIRE models are quite GIS-oriented and might need to be extended for time-varying and other "scientific" data types. But I guess the nice thing about RDF is that we can describe the same data in multiple ways at the same time if we want!

mattfry-ceh commented 9 years ago

Hi Jon. The CEH Lancaster team (Pete Vodden, John Watkins and others) have been implementing a catalogue based on the INSPIRE Environmental Monitoring Facilities (EF) schema for the UK Environmental Observation Framework: http://www.ukeof.org.uk/catalogue/about

The EF schema uses the Observing Capability to describe what is being monitored. From the INSPIRE spec: “The class Observing Capability is modelled to serve the need that a measurement regime can be described without providing the observed or measured value itself”. But it also provides means of linking to observations themselves described in O&M.

This might also be of interest: http://dx.doi.org/10.1080/17538947.2015.1033483

Matt