Closed laurenwalker closed 5 years ago
TODO: define format types for Collections and Project pages.
Several commits have been made in the past week with updates to the project and collection schemas. Here is a summary of today's commits:
Broke the filter element out into specific filter types (e.g. booleanFilter, textFilter, etc.) - https://github.com/NCEAS/project-papers/commit/71b59e5e81e959a093981dbbfdefeb22bfed8036
Removed the id element from the collection definition - https://github.com/NCEAS/project-papers/commit/cd66aeb619fb65ace157e8bffc435cbd2592b427
Added branding colors - https://github.com/NCEAS/project-papers/commit/af20121e24cdf640b62bf17ba1d7b955dad47211
Added map options - https://github.com/NCEAS/project-papers/commit/a786f1dca67b6a9c107ba8e9e9a7870097a9ef35
Added ability to hide the metrics section - https://github.com/NCEAS/project-papers/commit/41460986deba341518dd5c311550b638ff23268e
The one part of the schema I haven't figured out quite yet -- logos.
How do we want to reference images in the project documents?
Some options:
A. (my preference) A simple element with a string value of an identifier of a logo image in the data repository
B. A complex element of EMLEntity type. The image itself will be an object in the data repository, and will use the EMLEntity identifier field to reference the image. It looks like the project schema was originally designed this way, but I think it is overkill. If someone knows why it was originally designed this way, let me know.
C. A simple element with a string value of a URL to a logo image anywhere on the web. The downside to this is that if an image URL is ever invalidated (host name changes, image is removed from the web server, etc.), the image won't show up in MetacatUI. Also subject to CORS issues.
Heya @laurenwalker - I could see both (A) and (B) working, and I agree that for a configuration schema, (A) is easier. @mbjones - thoughts on why we have a full eml-entity tree here?
I'd avoid (C), yes, because of the issues you raise. If the logo image is stored in the repo, I'd suggest that the identifier reference be a seriesId so the icon can change without a required project configuration change.
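As a rough sketch of option (A) combined with the seriesId suggestion, the logo reference could be a single string element. The identifier value below is a made-up placeholder, and the element name simply follows the <logo> element mentioned in this thread; neither is taken from the committed schema.

```xml
<!-- Hypothetical sketch: a simple logo element holding a seriesId. -->
<!-- The identifier is a placeholder, not a real object in any repository. -->
<logo>example-project-logo-series-id</logo>
```

Since a seriesId resolves to the most recent object in its revision chain, uploading a new version of the logo should not require editing the project document.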
D. Another possibility is to store the images as inline, Base64-encoded strings directly as the <logo> element value. It keeps the configuration together, but is a little more verbose because of the Base64 text. I don't have a strong opinion on this, but it's an option. Rendering inline images is about 10% slower than setting an <img src="https://somewhere...">, but I'm not sure if that figure includes the time for the HTTP GET call or not, and it may be a moot point for small logo images (inline may be close to the same rendering time). See https://css-tricks.com/data-uris/.
A few schema comments:
In the schema files, I would change import statements like:
<xs:import namespace="eml://ecoinformatics.org/project-2.2.0" schemaLocation="/Users/datateam/local_repos/eml/xsd/eml-project.xsd"/>
to
<xs:import namespace="eml://ecoinformatics.org/project-2.2.0" schemaLocation="eml-project.xsd"/>
If we are no longer extending the current eml-project module, we can drop that import.
I'm wondering why we are calling the main xs:complexTypes in the project and collection schemas DatasetCollectionType and DatasetProjectType vs CollectionType and ProjectType?
The two schema files need an assigned namespace using the xmlns and targetNamespace attributes on the <xs:schema> root element.
The current EML project module has:
<xs:schema xmlns="eml://ecoinformatics.org/project-2.2.0"
targetNamespace="eml://ecoinformatics.org/project-2.2.0">
We might consider eml://ecoinformatics.org/project-2.2.0beta1 and eml://ecoinformatics.org/collection-2.2.0beta1, or something completely different so we don't have collisions with the current project module.
I'm not fully understanding the operator (AND/OR) being applied to a single Filter instance. I would expect that we would be applying an operator to a group of filters (to emulate a Solr parentheses block such as (keyword:Coho+AND+keyword:SASAP+AND+title:*McKenzie*)). To me, a single Filter operator would be like contains or ends-with or begins-with or matches (which cues us to use an asterisk in the value), whereas a FilterGroup operator would be AND or OR and would apply across all of the filters in the group. Do we need to define a FilterGroup? Am I misunderstanding something here?
In the project schema, ent:ImageListType is referenced, but that type is not defined in the entity schema. Was this a proposed addition that never got in there? Lauren's proposal above would nix this anyway.
I'll leave it there for now, but am still reviewing. Looking good though Lauren!
I'm wondering why we are calling the main xs:complexTypes in the project and collection schemas DatasetCollectionType and DatasetProjectType vs CollectionType and ProjectType?
We could change the name. I just thought ProjectType could be confused with the EML Project schema, so I added the dataset qualifier. Not sold on it, though.
We might consider eml://ecoinformatics.org/project-2.2.0beta1 and eml://ecoinformatics.org/collection-2.2.0beta1, or something completely different so we don't have collisions with the current project module.
I was thinking we should keep these schemas outside of the eml namespace, since they are pretty Metacat and MetacatUI-specific and won't be used inside EML documents. Up for discussion, though.
I'm not fully understanding the operator (AND/OR) being applied to a single Filter instance.
The operator is set on a filter because each filter can have more than one value. The filter can use the operator field to specify whether those values are ANDed or ORed together. Example:
<filter>
  <field>origin</field>
  <value>Chris Jones</value>
  <value>Christopher Jones</value>
  <operator>OR</operator>
</filter>
I think we decided not to include operators in filterGroups because we have consistently chosen not to offer advanced filtering options like that in MetacatUI. We could add them to the schema, though, if we decide we want to support that in the future.
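If group-level operators were ever added, one possible shape is sketched below. This is purely illustrative: the operator child on filterGroup is an assumption and does not exist in the current schema, and the field names and values are borrowed from the Solr example earlier in this thread.

```xml
<!-- Hypothetical sketch only: filterGroup has no operator in the current schema. -->
<filterGroup>
  <!-- Assumed group-level operator, combining the filters below. -->
  <operator>AND</operator>
  <filter>
    <field>keyword</field>
    <value>Coho</value>
    <value>SASAP</value>
    <!-- Existing per-filter operator, combining this filter's values. -->
    <operator>AND</operator>
  </filter>
  <filter>
    <field>title</field>
    <value>McKenzie</value>
  </filter>
</filterGroup>
```

The per-filter operator would keep its current meaning (combining the values of one filter), while the group-level one would join the filters themselves, roughly like a parenthesized block in a Solr query.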
I've pushed some more changes to the schema based on Chris's feedback. I think we're getting close to finalizing it.
I added a ToggleFilter type to the project schema and rewrote the BooleanFilter type.
Commit: https://github.com/NCEAS/project-papers/commit/8935c4d67e3507c0fa891057666b3b142379ddc2
Summary of changes:
The BooleanFilterType will have exactly the same fields as a text filter (field, value, label, etc.), except that the value will be restricted to booleans.
The ToggleFilterType has four additional fields: trueLabel, trueValue, falseLabel, and falseValue.
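To make the four ToggleFilterType fields concrete, an instance might look something like the sketch below. The toggleFilter element name, the field being queried, the labels, and the element order are all assumptions for illustration; the linked commit is the authoritative definition.

```xml
<!-- Hypothetical instance: element name, field, and values are assumptions. -->
<toggleFilter>
  <field>isPublic</field>
  <label>Access</label>
  <!-- Label and query value used when the toggle is on. -->
  <trueLabel>Public</trueLabel>
  <trueValue>true</trueValue>
  <!-- Label and query value used when the toggle is off. -->
  <falseLabel>Private</falseLabel>
  <falseValue>false</falseValue>
</toggleFilter>
```

Keeping a separate label and value for each state would let the UI show friendly text while sending arbitrary strings to the search index.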
At this point, the schema for Collections and Projects is starting to get set in stone since we have code in Metacat and MetacatUI that depends on this schema. If anyone has any suggestions for schema changes at this point, let's address them within the next couple of weeks.
Schema documents: https://github.com/NCEAS/project-papers/tree/master/schemas
I just sent out a last call for feedback via email to DataONE, NCEAS, and ESS-DIVE developers. After a week or so, I will tag a release for the schemas.
We need to develop a schema that will represent a collection and a project in our system.
Collections are first-class objects in Metacat. A collection is an aggregation of datasets created by a user for whatever purpose. They will probably have a minimal set of metadata such as:
A Project is a subclass of Collection. Projects have additional metadata such as: