NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

Misleading data file icon in catalog view #921

Open jagoldstein opened 5 years ago

jagoldstein commented 5 years ago

@mpsaloha pointed out that he was misled by search results when trying to find a social science dataset that contained data objects. This example https://arcticdata.io/catalog/view/urn:uuid:2d6da69f-606e-4c14-971b-58c05e560fb2 displays the "This dataset contains data files" icon in the search results on the catalog page view, as shown here:

[Image: screenshot2019-02-27]

But the only files the package contains are 4 child EMLs, and each child package is also devoid of data objects. The "data files [are included]" icon (shown here: [Image: data_icon]) should not be displayed in search results on the catalog page view unless there are data objects in the package; metadata files should not qualify.

laurenwalker commented 5 years ago

Unfortunately, there is no way to know whether the objects are data or metadata without sending an additional query to Solr, which would slow down the UI. That icon and message are displayed by simply counting the number of pids that the EML document "documents", as reported by Solr. The resource map must have RDF statements saying that the parent EML documents the other 4 EMLs.
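To make the trade-off concrete, here is a minimal Python sketch (not MetacatUI's actual code, which is JavaScript). It assumes the Solr record for the EML exposes the documented pids in a `documents` field and that objects carry a `formatType` field with a `DATA` value; the function names and any pids passed in are hypothetical.

```python
from urllib.parse import urlencode


def documented_count(solr_doc):
    """Count the pids listed in the metadata record's `documents` field.

    This mirrors the cheap check described above: it needs no extra
    request, but it cannot tell data objects from metadata objects.
    """
    return len(solr_doc.get("documents", []))


def data_object_query(solr_doc):
    """Build the parameters for the extra Solr query that would count only data.

    Assumption: filtering on `formatType:DATA` separates data objects
    from metadata. Returns None when nothing is documented.
    """
    pids = solr_doc.get("documents", [])
    if not pids:
        return None
    id_clause = " OR ".join('"{}"'.format(pid) for pid in pids)
    return urlencode({
        "q": "id:({}) AND formatType:DATA".format(id_clause),
        "rows": 0,  # only the match count is needed, not the records
        "wt": "json",
    })
```

The first function works from the search result already in hand; the second shows the additional round trip per result that would be needed to rule out metadata-only packages.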

I'll keep this issue open so we can find a way to fix it; hopefully it is a rare occurrence until then.

jagoldstein commented 5 years ago

@mpsaloha

Thanks Lauren. With nesting, isn't the typical structure that the parent's RDF "aggregates" each child's RDF PID, not the child EMLs?

laurenwalker commented 5 years ago

Yes, that is how nested packages work. You had mentioned in the original issue description that the package was aggregating 4 child EMLs, so that is what I was referring to above.

jagoldstein commented 5 years ago

After downloading and reading the parent RDF, I found that no child EMLs are aggregated. There are 5 "aggregates" statements: one for each of the 4 child RDFs, and one for the parent EML.

laurenwalker commented 5 years ago

Ok. It ultimately doesn't matter what kind of object it aggregates - the number shown with the icon is just a count of the "documents" statements in the package.
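The distinction between the two statement types can be sketched as follows. This is an illustration only: the triples are hypothetical placeholders, not the contents of the resource map discussed in this issue, and the predicate URIs are assumed to be the usual ORE `aggregates` and CiTO `documents` terms.

```python
from collections import Counter

# Assumed predicate URIs for the two kinds of statements.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"
CITO_DOCUMENTS = "http://purl.org/spar/cito/documents"


def count_statements(triples):
    """Tally (subject, predicate, object) triples by predicate."""
    return Counter(predicate for _subject, predicate, _object in triples)


# Hypothetical resource map: three aggregated members, one documents statement.
example_triples = [
    ("parent-map#aggregation", ORE_AGGREGATES, "parent-eml"),
    ("parent-map#aggregation", ORE_AGGREGATES, "child-map-1"),
    ("parent-map#aggregation", ORE_AGGREGATES, "child-map-2"),
    ("parent-eml", CITO_DOCUMENTS, "some-object"),
]

counts = count_statements(example_triples)
aggregated = counts[ORE_AGGREGATES]  # how many members the package aggregates
documented = counts[CITO_DOCUMENTS]  # the kind of count the icon relies on
```

In this made-up example the two counts differ, which is the point made above: the number of aggregated members does not determine the number shown with the icon.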