USEPA / EPA_Environmental_Dataset_Gateway

U.S. EPA’s Metadata Catalog
https://edg.epa.gov

Allow multiple download links to be shown in search results #87

Open torrin47 opened 5 years ago

torrin47 commented 5 years ago

Requested by ScienceHub team, proposed limit of 6 download links (up from 1) and improved titles for links.

From: Hultgren, Torrin
Sent: Thursday, December 21, 2017 5:56 PM
To: Felsher, Maxwell (CGI Federal) <maxwell.felsher@cgifederal.com>; edg <edg@epa.gov>
Cc: Montilla, Alex <Montilla.Alex@epa.gov>; Katie.French@cgifederal.com; Lantier, Dane (CGI Federal) <dane.lantier@cgifederal.com>; Lewis, John E (CGI Federal) <john.e.lewis@cgifederal.com>
Subject: RE: "Open" link at bottom of EDG details page when there are multiple files

Hi Max,

This is a tough one. Esri’s GeoPortal Server is really architected on the assumption that one metadata record represents a single thing with a single URL. They have some special code that spins up multiple links if the single URL is an Esri Map Service, but all of those are standard ways of interacting with that same map service.

We really prefer the Data.gov philosophy that a single data asset might be available via multiple endpoints supporting different audiences or use cases. To go all-in on that philosophy, though, we’d need to switch to a different underlying product. Within the last few years we’ve done more than one comparative analysis of GeoPortal Server and competing products and concluded that, despite its numerous shortcomings, it is still on balance the best fit for our needs. We anticipate reevaluating in another couple of years, but for now it’s a known limitation.

The section at the bottom corresponds to what is displayed in the search results or via the REST API. I believe it was tacked onto the bottom of the details page to provide those expanded links to Esri services or raw metadata. It’s probably not out of the question to adjust the wording of those links or add additional links, but there’s no way we’d be able to support 50 links in the search results or REST API. What I see in data.gov is a max of 6, with a count of additional links.

I’d need a little more time to investigate the level of effort necessary to implement this approach, but if we were to do so, do you think it’d work for your community?

Torrin

From: Felsher, Maxwell (CGI Federal) [mailto:maxwell.felsher@cgifederal.com]
Sent: Tuesday, December 19, 2017 3:46 PM
To: edg <edg@epa.gov>
Cc: Montilla, Alex <Montilla.Alex@epa.gov>; Katie.French@cgifederal.com; Lantier, Dane (CGI Federal) <dane.lantier@cgifederal.com>; Lewis, John E (CGI Federal) <john.e.lewis@cgifederal.com>
Subject: "Open" link at bottom of EDG details page when there are multiple files

Hi EDG team,

One of our users sent us an email because she thought that only her first data file was being shown in EDG. Our team was very confused by this email, since we could see all three data files. Then we realized she was referring to the "Open" link (next to "Details" and "Metadata") in that widget of sorts near the bottom that contains the title and a bit of the abstract/description. That link points to the first file from our list. For reference, the user was looking at https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=https://doi.org/10.23719/1371707 and the image below shows the section in question.

It wouldn't have occurred to me to look in that section for the data files instead of looking in the "Distribution Information" section, but apparently that's what this user did, and it confused her to find only the first file.

Any thoughts on how we might make this clearer? What information are we trying to convey with this section? Maybe we need text that makes it clear that the "Distribution Information" section is where they should get data from.

I know we still owe you a response on the DOI question you mentioned last week. We hope to get to that soon!

Thanks,
Max Felsher
ScienceHub team

aergul commented 5 years ago

@torrin47 in terms of improvements to titles, were you thinking of something like this?

image

And should the behavior apply globally to all gpt functionality that uses the same REST endpoint?

image

I have it at 3 links at the moment but will raise it to 6, assuming that is still the target?

torrin47 commented 5 years ago

Wow, yes, that's exactly perfect, and I do see that already with only 3 it is starting to look a little cluttered. But the majority of the records won't have too many. I wonder whether it wouldn't make sense to have the 6th link simply say "More Downloads..." and link to the Details page? We could probably also get rid of the "Open" link, since it just points to the first download.

aergul commented 5 years ago

Hmmm, while gpt appears to prefer a "reference" (typically a link to some download), it doesn't always end up with a downloadable link. E.g.: image

I think it's fine to remove the "Open" link even in such cases, but I thought I'd point it out. One can check the implementations of determineResourceUrl() in the source code to understand how the link is picked.

As for "More Downloads...", we will run into the same problem of repeating an existing link, so maybe we just display some non-linked text instead?

More downloads at the "Details" link
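The capping behavior under discussion can be sketched roughly as follows. This is only an illustrative Python sketch, not the geoportal's Java code: `cap_links`, the `(title, url)` tuple shape, and the default limit of 6 are assumptions made for the example, and the note text is the non-linked wording suggested above.

```python
from typing import List, Optional, Tuple

Link = Tuple[str, str]  # (title, url); hypothetical shape, not the gpt data model

MORE_NOTE = 'More downloads at the "Details" link'


def cap_links(links: List[Link], limit: int = 6) -> Tuple[List[Link], Optional[str]]:
    """Return at most `limit` links, plus a non-linked note if any were left out."""
    if len(links) <= limit:
        return list(links), None
    return list(links[:limit]), MORE_NOTE


# Eight hypothetical files: only the first six are shown, followed by the note.
all_links = [(f"File {i}", f"https://example.org/{i}.csv") for i in range(1, 9)]
shown, note = cap_links(all_links)
```

Returning the note as plain text rather than as another link avoids repeating the existing "Details" link, which is the concern raised here.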

aergul commented 5 years ago

Also, is there a known example of a record with 6+ links?

torrin47 commented 5 years ago

On the links that aren't actually to downloads, that's tough to audit/prevent/validate, particularly with FGDC CSDGM records, which use vanilla tags, so we assume an ordering convention to distinguish meaning. Can we just force the link to open in a new tab rather than assuming a direct download?
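As a rough illustration of the "open in a new tab" idea, the sketch below renders every resource link the same way regardless of whether it points to a file. It is written in Python for brevity; `render_link` is a hypothetical helper, not part of gpt, and falling back to the URL when the title is blank is an assumption.

```python
from html import escape


def render_link(title: str, url: str) -> str:
    """Render an anchor that opens in a new tab instead of assuming a download."""
    label = title.strip() or url  # assumption: fall back to the URL if no title
    return (
        f'<a href="{escape(url, quote=True)}" target="_blank" '
        f'rel="noopener noreferrer">{escape(label)}</a>'
    )


html_out = render_link("Data <v2>", "https://example.org/a?x=1&y=2")
```

Adding `rel="noopener noreferrer"` alongside `target="_blank"` is a common precaution so the opened page cannot reach back into the originating window.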

You're right about repeating links with my "More Downloads" - I guess my thought process was that it was more concise to offer a "More Downloads" link, even if it's a repeat of the Details link, than to write out the text that says go use the "Details" link.

Here are a couple of examples of records with 6 or more links:
https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=https://doi.org/10.23719/1375008
https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=https://doi.org/10.23719/1407575

And a nasty one with 5 ugly URLs as titles: https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=A-cvdz-267

aergul commented 5 years ago

Here is an attempt that I think captures your plans: image

I avoided use of "download links" as these links do not always point to a file. I am not hung up on it though.

This does use up more vertical space and runs up against the available space on the details page:

image

So we should either do without line breaks or expand the space on details page. Without line breaks, we might need some other way to space out links so that they don't look like one very long link. Open to ideas...

BTW: Interesting choice of metadata there, the lead author is a beloved colleague!

torrin47 commented 5 years ago

I do like the look of the separate section of resource links, and totally approve of that title over "download links". It shouldn't be difficult to increase the height of that iFrame space on the details page if we choose that path. A couple of thoughts, though.

My apologies, once I start thinking about ways to improve the EDG search results, it's easy to get carried away.

aergul commented 5 years ago

image

image

image

image

image

torrin47 commented 5 years ago

Dang, you're right, the pipes really don't look that bad and might have the edge by virtue of being more compact. Loving all the rest of the tweaks, thank you!

I've spent a lot of quality time with those indexable files and would feel very comfortable routing the various links to your new scheme.

This feels close to being done, thoughts on next steps for staging deployment and user-acceptance testing?

aergul commented 5 years ago

Originally we were filtering URLs at the point of output so that the behavior applied only to the particular endpoint. The search results page does not use the same endpoint. Instead, it uses a very convoluted combination of Java, JSP, JSF, Dojo and custom code, all working together to generate the page. Because of this, filtering has been moved to an earlier phase in the process, which should presumably apply to all geoportal outputs that rely on the so-called "ResourceLinkBuilder".
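The design choice here (filter once in a shared build step rather than in each output) can be sketched as below. This is a Python illustration only: the dictionary shape, the `"resource"` tag value, the filtering rule, and all function names are assumptions, and none of it mirrors the actual "ResourceLinkBuilder" code.

```python
from typing import Dict, List

Link = Dict[str, str]  # hypothetical shape: {"tag": ..., "label": ..., "url": ...}


def build_links(raw_links: List[Link], limit: int = 6) -> List[Link]:
    """Shared build step: drop links with no URL and cap the resource links."""
    kept: List[Link] = []
    resource_count = 0
    for link in raw_links:
        if not link.get("url"):
            continue
        if link.get("tag") == "resource":
            if resource_count >= limit:
                continue
            resource_count += 1
        kept.append(link)
    return kept


def render_rest(links: List[Link]) -> str:
    """REST-style output: labels separated by pipes."""
    return " | ".join(link["label"] for link in links)


def render_search(links: List[Link]) -> List[str]:
    """Search-page-style output: one label per entry, no separators."""
    return [link["label"] for link in links]


raw = [{"tag": "details", "label": "Details", "url": "https://example.org/d"}]
raw += [
    {"tag": "resource", "label": f"File {i}", "url": f"https://example.org/{i}"}
    for i in range(1, 9)
]
built = build_links(raw)
```

Because both renderers consume the same `built` list, they cannot disagree about which links survive; only the formatting differs.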

As far as the formatting of the links goes, the search results page mimics the REST endpoint, except that there are no pipes as separators. Introducing the pipe (or any other character, for that matter) led to oddly placed separators, which makes me think there is some post-page-load DOM manipulation occurring. Since the width of the space where links are placed is fixed, it is much less likely for many links to end up on the same line. In other words, the need for separators may not be there.

We can re-attempt to introduce separators, but it looks a bit like a hornet's nest in there.

torrin47 commented 5 years ago

Yeah, I've seen how crazy the search page is, and I appreciate how challenging it's been to work with. The latest pull request fixed a bunch. The only outstanding issue is the appearance of some odd X characters in the resource links list (top view). I agree that in this context there's less of a need for separators, but I find myself missing the Resource Links header shown in the REST endpoint. How rough might it be to add in headers (bottom view)?

image

You may well be right about some DOM manipulation we inherited when adopting the newer EPA look-and-feel template, but it might also just be an artifact of the float (this site is pre-flex; well, it's pre-lots of things). When I was tinkering with adding the extra text in Chrome DevTools, I found that by adding the text in spans with class "resultsLink" everything aligned itself gracefully (with no class, the text bounced around in unpredictable ways). I still think you're right about not needing separators in this instance, though.

If we do make explicit the distinction between "EDG Links" and "Resource Links", we might need to review a few of the other links. For example, the standard link labeled "Website" probably belongs under "Resource Links" rather than "EDG Links". And "Zoom To" is an "EDG Link". "Add to Web Map" is a grey area, but is probably fine to keep in "EDG Links". Do we have control over which of these appear above and below the fold?

Thanks again and have a great weekend!

aergul commented 5 years ago

Had sent another PR for the 'X', but you may not have noticed. Once you close that PR, I will open another that addresses the new requests. In other words, yes, we do have control over which link goes where, with some repetitive code. While we could reclassify, say, the "website" link as a resource link by assigning the corresponding tag, that has global impact in gpt, and code that looks for the "website" link elsewhere won't find it, such as in here.

Instead, we introduce the behavior when we are writing out the links which is done in different places, hence the repetition.
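A minimal sketch of that approach, in Python rather than the project's Java: the grouping into "EDG Links" and "Resource Links" happens only when writing output, and each link's own tag is left untouched so other code can still find it. The tag strings, the mapping, and the default section for unknown tags are all hypothetical.

```python
from typing import Dict, List

# Hypothetical mapping from link tags to display sections (not the real gpt tags).
SECTION_BY_TAG = {
    "details": "EDG Links",
    "metadata": "EDG Links",
    "zoomto": "EDG Links",
    "addtomap": "EDG Links",
    "website": "Resource Links",
    "resource": "Resource Links",
}


def group_for_display(links: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group links under display headers without modifying the links themselves."""
    sections: Dict[str, List[Dict[str, str]]] = {"EDG Links": [], "Resource Links": []}
    for link in links:
        # Assumption: unknown tags default to the "EDG Links" section.
        section = SECTION_BY_TAG.get(link["tag"], "EDG Links")
        sections[section].append(link)
    return sections


links = [
    {"tag": "website", "label": "Website"},
    {"tag": "zoomto", "label": "Zoom To"},
    {"tag": "resource", "label": "Data file"},
]
grouped = group_for_display(links)
```

The cost noted above still applies: every place that writes links out would need to call something like this, which is where the repetition comes from.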