Open mbjones opened 9 years ago
Cool, any reason why not to do an Elasticsearch schema too?
Probably not -- seems like they could share most of the fields anyways, and the main difference would be in representation and syntax. What do you think @vdave ?
No reason at all. It would be great to come up with a technology-agnostic set of fields that could augment existing simple cores, such as Dublin Core and Darwin Core, with properties generally useful to the earth sciences. The "Earth Core"?
Okay, cool
@mbjones I am actively developing Apache OODT [0], which we (the Jet Propulsion Laboratory) use for cataloging scientific data, including metadata. We've got a fairly good idea of how and what we wish to catalog; however, there is always room for improvement. I would therefore be interested in contributing to this workshop.
@lewismc Glad to hear you are interested. It would be great to discuss the overlaps between the OODT approach to cataloging metadata and the DataONE metadata index which @vdave and I contributed to and that inspired this activity. I think there's a lot of room for this kind of cross-standard agreement on metadata terms. Looking forward to it at the Codefest.
Yep, agreed. I think we could probably have a lot to look at in terms of the overlaps. Through OODT we've already built a Solr science data schema via the OODT File Manager component, so we could draw from that.
Initial work on this topic is in this Google Sheet:
https://docs.google.com/spreadsheets/d/1VnEF0oezHlP2U98mmbZR9iNSSJjIMsgambTMOinnvSA/edit#gid=0
Also some notes in etherpad at:
Organizational Page: SOLRMeta
Category: Data science
Title: Develop a Common SOLR Index Schema for Cataloging Science Metadata
Proposed by: Dave Vieglais
Participants:
Summary: SOLR is an open source, scalable, high performance search engine that can be used for searching broad categories of information. The goal of this session is to develop a SOLR Schema that enables effective search against common scientific metadata formats, with emphasis on the earth sciences. Such a common schema could be leveraged by many data repositories to provide consistent discovery semantics against their repositories while not precluding more specialized capabilities appropriate for specific repositories.
Technologies: XML, SOLR, Lucene, EML, FGDC, ISO19115, Dublin Core, Darwin Core
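
As a rough illustration of the "technology-agnostic fields, different syntax" idea discussed above, here is a minimal Python sketch that keeps one neutral field list and renders it either as Solr `<field>` declarations or as an Elasticsearch mapping. Everything specific in it is an assumption made for illustration: the field names are hypothetical and are not taken from the spreadsheet, and the Solr type names (`string`, `text_general`, `pdate`) assume field types like those in Solr's default configset are defined in the target schema.

```python
import json
from xml.etree import ElementTree as ET

# Hypothetical neutral field list (illustrative names only, not the agreed schema).
FIELDS = [
    {"name": "identifier", "kind": "keyword", "multi": False},
    {"name": "title", "kind": "text", "multi": False},
    {"name": "abstract", "kind": "text", "multi": False},
    {"name": "keywords", "kind": "keyword", "multi": True},
    {"name": "beginDate", "kind": "date", "multi": False},
    {"name": "endDate", "kind": "date", "multi": False},
]

# Assumed type names; a real Solr schema must define these field types.
SOLR_TYPES = {"keyword": "string", "text": "text_general", "date": "pdate"}
ES_TYPES = {"keyword": "keyword", "text": "text", "date": "date"}


def to_solr_fields(fields):
    """Render each neutral field as a Solr <field .../> declaration string."""
    rendered = []
    for f in fields:
        element = ET.Element("field", {
            "name": f["name"],
            "type": SOLR_TYPES[f["kind"]],
            "indexed": "true",
            "stored": "true",
            "multiValued": "true" if f["multi"] else "false",
        })
        rendered.append(ET.tostring(element, encoding="unicode"))
    return rendered


def to_es_mapping(fields):
    """Render the same fields as an Elasticsearch mapping body.

    Elasticsearch has no multiValued flag: any field may hold an array,
    so the "multi" hint is simply not emitted here.
    """
    properties = {f["name"]: {"type": ES_TYPES[f["kind"]]} for f in fields}
    return {"mappings": {"properties": properties}}


if __name__ == "__main__":
    print("\n".join(to_solr_fields(FIELDS)))
    print(json.dumps(to_es_mapping(FIELDS), indent=2))
```

Keeping the field list separate from both renderers means the agreed terms live in one place, and a repository that needs more specialized fields can append to the list without touching either output format.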