USEPA / EPA_Environmental_Dataset_Gateway

U.S. EPA’s Metadata Catalog
https://edg.epa.gov

Embed guidance in standalone JSON file #65

Closed torrin47 closed 5 years ago

torrin47 commented 5 years ago

Guidance will be available on the page for each element – not necessarily visible at first, but accessible without navigating away from the page. The content for this guidance should be loaded from a separate JSON file, so that the EDG team can maintain and update a single file of guidance that can be loaded into multiple editors (for example the metadata technical spec page, Geospatial EME, ScienceHub, or the non-geo editor page).

Adopt standard structure.
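
A minimal sketch of the "one file, many editors" idea, assuming the guidance file is plain JSON. The function names and the sample content are hypothetical; the thread does not fix the file's schema at this point.

```python
import json
from pathlib import Path


def parse_guidance(text):
    """Parse guidance JSON text and return it as Python data."""
    return json.loads(text)


def load_guidance(path):
    """Read the shared guidance file from disk (the path is supplied by the caller)."""
    return parse_guidance(Path(path).read_text(encoding="utf-8"))


# Hypothetical content: every editor would read the same file, so an edit
# made by the EDG team shows up everywhere the file is loaded.
sample = '{"title": {"guidance": "<p>Name of the dataset.</p>"}}'
guidance = parse_guidance(sample)
print(guidance["title"]["guidance"])
```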

torrin47 commented 5 years ago

epa-metadata-tech-spec.txt First pass at guidance/techspec schema.

aergul commented 5 years ago

I recommend changing the structure of the document slightly so it's easy to look up a particular element, as in the file I am attaching.

epa-metadata-tech-spec.txt
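
For illustration only, here is one way a list of guidance entries could be re-keyed so a single element can be looked up directly. The attached files are not reproduced in this thread, so the `element` key and the entry fields below are assumptions, not the actual schema.

```python
def index_by_element(entries, key="element"):
    """Turn a list of guidance entries into a dict keyed by element name."""
    indexed = {}
    for entry in entries:
        name = entry[key]
        if name in indexed:
            # Two entries for one element would make lookups ambiguous.
            raise ValueError(f"duplicate guidance entry for {name!r}")
        # Keep every field except the one that became the dict key.
        indexed[name] = {k: v for k, v in entry.items() if k != key}
    return indexed


# Hypothetical entries in a list-style layout.
entries = [
    {"element": "title", "guidance": "<p>Title guidance</p>"},
    {"element": "description", "guidance": "<p>Description guidance</p>"},
]
by_element = index_by_element(entries)
print(by_element["title"]["guidance"])
```

With the keyed form, an editor can fetch guidance for one element without scanning the whole list.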

aergul commented 5 years ago

I also changed the <pre> tag to <p> in the guidance for the title element, since <pre> doesn't allow line wrapping and leads to an unpleasant experience on narrow-screen devices.

torrin47 commented 5 years ago

Sounds great, the revised structure is fine. I'm not sure where the <pre> came from, other than Drupal adding it when someone made an edit with the GUI.

torrin47 commented 5 years ago

So I had a few minutes of downtime to ponder our symbol proposition for expanding the guidance/validation info. How about this?

Mandatory element, empty: https://fontawesome.com/icons/exclamation-triangle?style=solid Color: #fdae61

Optional element, empty: https://fontawesome.com/icons/question-circle?style=solid Color: #ffffbf

Any element, populated and valid: https://fontawesome.com/icons/check-circle?style=solid Color: #1a9641

Any element, populated and invalid: https://fontawesome.com/icons/check-circle?style=solid Color: #d7191c

@jzichichi ?
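
The four states proposed above can be sketched as a small lookup. The icon names and colors are copied from the list as written; the function name and its arguments are hypothetical.

```python
# Font Awesome icon name and hex color for each state, as listed above.
SYMBOLS = {
    "mandatory_empty": ("exclamation-triangle", "#fdae61"),
    "optional_empty": ("question-circle", "#ffffbf"),
    "valid": ("check-circle", "#1a9641"),
    "invalid": ("check-circle", "#d7191c"),
}


def element_symbol(mandatory, populated, valid=True):
    """Return the (icon, color) pair for one element's current state."""
    if not populated:
        # Empty elements differ only by whether they are mandatory.
        key = "mandatory_empty" if mandatory else "optional_empty"
    else:
        # Once populated, mandatory and optional elements are treated alike.
        key = "valid" if valid else "invalid"
    return SYMBOLS[key]


print(element_symbol(mandatory=True, populated=False))
print(element_symbol(mandatory=False, populated=True, valid=False))
```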

jzichichi commented 5 years ago

@torrin47 - I like the selections - it clearly separates out mandatory/empty from optional/empty, and then seems to make it clear that anything populated must be valid (mandatory or optional).

@aergul - passing to you to push back or implement.

jzichichi commented 5 years ago

@torrin47 were you planning to augment epa-metadata-tech-spec.txt to contain the full set of fields from the guidance or should I take a stab?

torrin47 commented 5 years ago

@jzichichi I was resigned to cranking through it myself, but I'd be thrilled if you could tackle it.

jzichichi commented 5 years ago

@torrin47 - mission accepted. I'll take a stab at it this afternoon/tomorrow.

jzichichi commented 5 years ago

@torrin47 @aergul - adding my first entry to epa-metadata-tech-spec.txt file (EPA Keywords)

Looking at EPA guidance to populate.

According to EPA guidance there is no POD element for EPA keywords: https://www.epa.gov/geospatial/epa-metadata-technical-specification#tags-epa-theme

Would I map it to the basic POD keyword element within this file? Maybe I don't need to? I was uncertain. https://project-open-data.cio.gov/v1.1/schema/#keyword

Apologies if I am off on this; I'm just dipping my feet into it. Thanks

torrin47 commented 5 years ago

Huh, the tech spec is definitely confusing on this point. Place and ISO theme tags map to specific elements in POD, but any other tags just get dumped into the keywords array, and I think we've been recommending that even place and ISO theme also get dumped into the keywords array for good measure. So really, all of the different categories of tags get dumped into the pod keywords array, it's just that EPA wants to ensure at least one keyword from each list, something above and beyond the POD guidance. Not quite sure how to clarify that many-to-one mapping in the tech spec, but it would make more sense to map to keywords than leave it empty or say "no equivalent". I'll get that updated.
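
A rough sketch of the many-to-one mapping described here: several tag lists flattened into the single POD `keyword` array, plus the EPA-specific check for at least one keyword per list. The group names (`epa_theme`, `place`, `iso_theme`) and the sample tags are made up for illustration.

```python
def merge_keywords(tag_groups):
    """Flatten all tag groups into one keyword list, dropping duplicates."""
    merged = []
    seen = set()
    for tags in tag_groups.values():
        for tag in tags:
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged


def missing_groups(tag_groups, required):
    """Return the required groups that have no keyword at all."""
    return [name for name in required if not tag_groups.get(name)]


# Hypothetical record: three tag lists, one of them still empty.
groups = {"epa_theme": ["water"], "place": ["Ohio"], "iso_theme": []}
print(merge_keywords(groups))
print(missing_groups(groups, ["epa_theme", "place", "iso_theme"]))
```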

jzichichi commented 5 years ago

OK, thanks @torrin47, I'll push forth with the file

jzichichi commented 5 years ago

@torrin47 - I have this file just about complete and I will send your way tonight. 2 questions:

  1. How do we handle the fact that 3 metadata elements map to a POD keyword element?
  2. Distribution URL - guidance indicates required for ALL but there is no element that it maps to in guidance document (lots of linkage options in distribution in POD schema tho). How would you like us to handle that particular item?

torrin47 commented 5 years ago

Oooh, yup, this is the tricky stuff.

  1. For the 3 keyword elements, I think we'll need to come up with our own element name for each of them in the guidance file - the tool will need to dump them all into the same array, and pull them back out of that array when importing existing files.
  2. In the EPA tech spec, Distribution URL pretty much maps to an entire Dataset Distribution section in a POD record, with either a downloadURL or accessURL populated. I'm again puzzled over why the tech spec says no equivalent - I'm going to fix that now. I have mixed feelings about this Dataset Distribution section. I love that a distribution URL can have a title and description, and that a dataset can have an infinite number of distributions. I think it's odd and unhelpful that there is a choice of URL fields, and the definitions of conformsTo, describedBy, format, and mediaType are ambiguous and overlapping; the schema would be more powerful with a single URL and better standards around those elements that characterize the URL. But for now they're what we have to work with.
    For the purposes of this tool, I think I'd be fine with grouping all of the distribution subelements under a single guidance element, rather than giving each distribution subelement a separate help. Not quite sure how that'll fly, but it's my initial inclination.
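
As a sketch of what "either a downloadURL or accessURL populated" could mean as a check, the helper below flags distributions with neither field filled in. The function name and the sample record are hypothetical, and this is not a full POD validation.

```python
URL_FIELDS = ("downloadURL", "accessURL")


def distributions_missing_url(dataset):
    """Return indexes of distributions that have neither URL field populated."""
    missing = []
    for index, dist in enumerate(dataset.get("distribution", [])):
        if not any(dist.get(field) for field in URL_FIELDS):
            missing.append(index)
    return missing


# Hypothetical record: the second distribution has a title but no URL.
record = {
    "distribution": [
        {"title": "Landing page", "accessURL": "https://edg.epa.gov"},
        {"title": "Data file"},
    ]
}
print(distributions_missing_url(record))
```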

jzichichi commented 5 years ago

@torrin47 - OK thank you. I will take a stab at making up useful element names for the keywords section. @aergul - do you have any issues with the approaches listed above?

jzichichi commented 5 years ago

@torrin47 - sorry for so many inquiries on this. I suppose it is a good exercise to be doing. The following 2 fields also map to the same POD field. I can make them into 2 unique guidance fields as well, but wanted to run it by you before making an executive decision.

https://www.epa.gov/geospatial/epa-metadata-technical-specification#tags-place

https://www.epa.gov/geospatial/epa-metadata-technical-specification#spatial-extent

torrin47 commented 5 years ago

The POD "spatial" field is maddeningly overloaded. With so many options, I'm not sure how a machine is supposed to make much of it - unless they assume the machine reading it is Google Maps, which magically translates anything geospatial into a pushpin.

This field should contain one of the following types of content: (1) a bounding coordinate box for the dataset represented in latitude / longitude pairs where the coordinates are specified in decimal degrees and in the order of: minimum longitude, minimum latitude, maximum longitude, maximum latitude; (2) a latitude / longitude pair (in decimal degrees) representing a point where the dataset is relevant; (3) a geographic feature expressed in Geography Markup Language using the Simple Features Profile; or (4) a geographic feature from the GeoNames database.

For EPA non-geo records, our intention was to use option 4. Not quite sure how kosher that is if someone chooses many geographic features instead of a single one, but I think it's ok for MVP. Maybe we could capture as an enhancement request a function that would calculate a minimum bounding box that includes all of the selected geographic feature names? Very low on the priority list for non-geo records.
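
The enhancement floated here could look roughly like the sketch below, assuming each selected feature already comes with its own bounding box (the lookup against GeoNames is not shown). Boxes use the POD order of minimum longitude, minimum latitude, maximum longitude, maximum latitude. The coordinates are illustrative values, not authoritative extents, and the sketch ignores boxes that cross the antimeridian.

```python
def union_bbox(boxes):
    """Return the smallest box (min_lon, min_lat, max_lon, max_lat) covering all boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def to_pod_spatial(box):
    """Format a box as a comma-separated string in POD coordinate order."""
    return ",".join(str(value) for value in box)


# Two hypothetical feature boxes.
selected = [(-84.8, 38.4, -80.5, 42.0), (-90.4, 41.7, -82.4, 48.3)]
print(to_pod_spatial(union_bbox(selected)))
```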

jzichichi commented 5 years ago

@torrin47 - Here is the next pass at the file. I hope I didn't completely screw up the content. I filled in as much as I could that would be important for the editor application, but I suspect there will be edits necessary for full completion. I didn't add the extra field you had suggested in a related ticket yet (EPA-specific guidance tag). I did customize the keyword tag names so that there is a keywordEPA and a keywordGeneral. Happy to see anything changed wherever you see fit. Please let me know if you have problems with this file - I included every element that was relevant across all standards as shown here: https://www.epa.gov/geospatial/epa-metadata-technical-specification

https://app.zenhub.com/files/98460260/0cde349f-2ad5-4b6e-8fb8-5b8a2844e508/download

aergul commented 5 years ago

There were a few small issues with the file:

Fixed those and the new file is here: epa-metadata-tech-spec.txt

Also of note: It appears that both Tags (Place) and Spatial Extent are mapped to "spatial" element. If intentional, we need to merge those entries in the file and decide how to handle that in the UI. If not intentional, is the spec wrong? @torrin47

jzichichi commented 5 years ago

@aergul thank you for fixing those.

I did ask Torrin about the 2 entries for spatial above and he had the answer below (perhaps you already saw this):

"The POD "spatial" field is maddeningly overloaded. With so many options, I'm not sure how a machine is supposed to make much of it - unless they assume the machine reading it is Google Maps, which magically translates anything geospatial into a pushpin.

This field should contain one of the following types of content: (1) a bounding coordinate box for the dataset represented in latitude / longitude pairs where the coordinates are specified in decimal degrees and in the order of: minimum longitude, minimum latitude, maximum longitude, maximum latitude; (2) a latitude / longitude pair (in decimal degrees) representing a point where the dataset is relevant; (3) a geographic feature expressed in Geography Markup Language using the Simple Features Profile; or (4) a geographic feature from the GeoNames database.

For EPA non-geo records, our intention was to use option 4. Not quite sure how kosher that is if someone chooses many geographic features instead of a single one, but I think it's ok for MVP. Maybe we could capture as an enhancement request a function that would calculate a minimum bounding box that includes all of the selected geographic feature names? Very low on the priority list for non-geo records."

In the short term, I am not too sure what to do about the 2 fields and mapping to the same element - the selection of option 4 maps more to tags (place) in my opinion, but I am not sure what the powers that be would say. Perhaps the file has a spatialTags and spatialExtent entry for those; I'm just not sure how the metadata itself would work - I think only one of these is allowed (unlike keywords)?

Also, should I be seeing the new content for all the help boxes in the UI on dev? I see the newest for many of them but some are missing (most of the keywords, as an example).

aergul commented 5 years ago

Thanks, @jzichichi. I saw @torrin47's comments. My concern is not so much that the POD schema allows a choice of four different ways to populate the "spatial" element, but that in the EPA schema we have two separate elements being mapped into the same POD element. The POD schema would accept either place tags or the spatial extent, but not both.

This is one of the reasons I did not try to get all guidance mapped to the UI as I feel we have some work to do to get the EPA->POD mappings sorted out.
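
One way to surface these collisions before wiring guidance into the UI is to invert the EPA->POD mapping and list every POD element with more than one source. This is only a sketch; the EPA element names in the sample mapping are placeholders, not the names used in the spec file.

```python
from collections import defaultdict


def shared_targets(mapping):
    """Return POD elements that more than one EPA element maps to."""
    by_target = defaultdict(list)
    for epa_element, pod_element in mapping.items():
        by_target[pod_element].append(epa_element)
    return {
        pod: sorted(names)
        for pod, names in by_target.items()
        if len(names) > 1
    }


# Placeholder names for the two EPA elements discussed above.
epa_to_pod = {
    "tagsPlace": "spatial",
    "spatialExtent": "spatial",
    "title": "title",
}
print(shared_targets(epa_to_pod))
```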

jzichichi commented 5 years ago

@aergul - makes sense. Thx. Will put on tomorrow's list.

torrin47 commented 5 years ago

Here's the latest version... epa-metadata-tech-spec.txt

jzichichi commented 5 years ago

@torrin47, we were wondering what the logic should be for showing the 3 different guidance fields for authenticated vs non-authenticated users?

"guidance":

"epaguidance":

"externalguidance":

torrin47 commented 5 years ago

guidance: all

epaguidance: only logged in EPA users

externalguidance: hidden from logged in EPA users, visible otherwise.
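
A small sketch of that rule, using the three field names from the question. The function name, the `is_epa_user` flag, and the sample entry are hypothetical.

```python
def visible_guidance(entry, is_epa_user):
    """Return the guidance texts one user should see for an element."""
    parts = []
    # "guidance" is shown to everyone.
    if entry.get("guidance"):
        parts.append(entry["guidance"])
    # Logged-in EPA users get "epaguidance"; everyone else gets "externalguidance".
    audience_key = "epaguidance" if is_epa_user else "externalguidance"
    if entry.get(audience_key):
        parts.append(entry[audience_key])
    return parts


# Hypothetical entry with all three guidance levels populated.
license_entry = {
    "guidance": "General text",
    "epaguidance": "EPA-only text",
    "externalguidance": "External-only text",
}
print(visible_guidance(license_entry, is_epa_user=True))
print(visible_guidance(license_entry, is_epa_user=False))
```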

aergul commented 5 years ago

Implemented now, including a temporary pretend login button to test EPA vs non-EPA user scenario.

Also, an attempt was made to standardize element naming in the app, EPA spec and internal config. The spec is missing some of the elements and could use help. Attaching latest.

epa-metadata-tech-spec.txt

jzichichi commented 5 years ago

@aergul - I tested this with license, and it worked for me; I think that is the only field with multiple guidance levels. Moving over for @torrin47 to review

torrin47 commented 5 years ago

👍

torrin47 commented 5 years ago

This issue was moved to USEPA/EPA_Non-geo_Metadata_Editor#24