biocaddie / prototype_issues

Used to report and track bioCADDIE prototype issues

Harvesting of harvested content #117

Open ryscher opened 8 years ago

ryscher commented 8 years ago

Some of the content in Datamed that is labeled as coming from Dryad is actually content that Dryad has harvested from other locations.

For example: http://datamed.org/display-item.php?repository=0010&idName=dataset.title&id=56cf958de4b0cbfcaa680d55&query=dirt

Dryad harvested this item from the Knowledge Network for Biocomplexity (KNB). The metadata would be much richer if the item were harvested directly from KNB rather than through Dryad.

tjohnson250 commented 8 years ago

Usability severity rating: 3 Major

This is a major issue because if we do not find a consistent way to address it now, the problem will grow as we add more data repositories that may be cross-harvesting. To be consistent and avoid confusion, I suggest we track harvesting. When both the harvested and the original dataset are in the search results, display them together with an obvious visual grouping; when only one is in the results, display that one. In all cases, each result should have a prominent link to the other version on both the search results page and the details page. The rationale is that if a person specifically limits a search to a set of repositories, we should respect that wish but still inform the user that a better version exists. If cross-harvesting becomes too common, we may need to show only the original (when multiple versions are in the results) and use a pull-down control to indicate that more versions are available.

We will need to test how to make this work best with faceted browsing. It would be nice to use the more complete metadata even when the user restricts the search to a repository holding the harvested dataset, but that may not be possible with our current platform.
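
The grouping rule suggested in this comment could be sketched roughly as follows. This is only an illustration under assumptions, not the Datamed implementation: the `Record` class, the `harvested_from` field, and the `group_results` function are hypothetical names, and the sketch assumes the indexer already knows which record a harvested copy came from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    record_id: str
    repository: str
    title: str
    # Hypothetical field: id of the original record this copy was harvested from.
    harvested_from: Optional[str] = None


def group_results(results, index):
    """Group ranked search hits so harvested copies sit next to their original.

    results: ranked list of Record, already limited to the user's repositories.
    index:   dict of every known Record by id, used to link to versions that
             fall outside the current results.
    Returns a list of {"records": [...], "linked": [...]} dicts, where
    "records" are displayed together and "linked" are versions to link to.
    """
    by_id = dict(index)
    by_id.update({r.record_id: r for r in results})
    in_results = {r.record_id for r in results}

    # Reverse lookup: original id -> ids of its harvested copies.
    copies = {}
    for rec in by_id.values():
        if rec.harvested_from:
            copies.setdefault(rec.harvested_from, []).append(rec.record_id)

    groups, seen = [], set()
    for rec in results:
        if rec.record_id in seen:
            continue
        root = rec.harvested_from or rec.record_id
        family = [root] + copies.get(root, [])  # original first, then copies
        shown = [by_id[i] for i in family if i in in_results]
        linked = [by_id[i] for i in family if i not in in_results and i in by_id]
        seen.update(r.record_id for r in shown)
        groups.append({"records": shown, "linked": linked})
    return groups
```

With a KNB original and a Dryad copy both in the results, the two land in one group with the original first. If the search is restricted to Dryad, the Dryad copy is shown alone and the KNB record appears under `linked`, which respects the repository filter while still pointing to the richer version.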