ESIPFed / science-on-schema.org

science-on-schema.org - providing guidance for publishing schema.org as JSON-LD for the sciences
Apache License 2.0

How do we specify that a Dataset has been cited (as an Action)? #123

Open ashepherd opened 3 years ago

ashepherd commented 3 years ago

Considering the Counter Code of Practice, can we map the Dataset activities (View, Download, and Citation) to schema:Action classes?

| Activity | schema.org Action class |
| --- | --- |
| Views | schema:ViewAction |
| Downloads | schema:DownloadAction |
| Citation | schema:UseAction??? |

This would help:

1) Describe potential actions of a Dataset to applications
2) Give the harvesting of these analytics a shared vocabulary

mbjones commented 3 years ago

@ashepherd the Counter Code of Practice for Research Data does define a reporting format for standardized counts in which specific processing has been applied to eliminate repeat visits, define session windows, filter out search robots, and divide accesses into human-mediated versus machine-mediated counts. Personally I think those deserve a specific class in schema.org (or another vocabulary) that specifically maps to their semantics, so that these usage stats are not conflated with other usage stats that don't specifically follow those rules (and therefore are far less inter-comparable). The Make Data Count project has been working hard to promote consistent reporting using the Counter standards, and has had broad support from allied groups like the RDA Data Usage Metrics Working Group.

Regarding your specific proposal, I interpret all action classes, including schema:ViewAction and schema:DownloadAction to represent a single action of those types, and not to encapsulate a count of them. Does that make sense to you? That would argue for a different class to represent an aggregated count of such records.
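To make the distinction concrete, here is a minimal sketch in Python that builds the two shapes as JSON-LD dictionaries. The dataset identifier, timestamp, and count are made up. `schema:InteractionCounter` is used only as an assumed stand-in for "a different class"; it does not carry any COUNTER processing semantics.

```python
import json

DATASET_ID = "https://example.org/dataset/1"  # hypothetical identifier

# One realized action: a single view of the dataset at a point in time.
single_view = {
    "@context": "https://schema.org/",
    "@type": "ViewAction",
    "object": {"@id": DATASET_ID},
    "startTime": "2021-03-01T12:00:00Z",  # made-up timestamp
}

# An aggregate: a count of such actions, carried by a separate class.
# InteractionCounter is an assumption here, not a COUNTER-aware class.
aggregated_views = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": DATASET_ID,
    "interactionStatistic": {
        "@type": "InteractionCounter",
        "interactionType": {"@type": "ViewAction"},
        "userInteractionCount": 42,  # illustrative number only
    },
}

print(json.dumps(single_view, indent=2))
print(json.dumps(aggregated_views, indent=2))
```

The first record has no place for a count, and the second says nothing about when any individual view happened, which is why the two probably need different classes.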

Regarding citations, I'm curious how you see the schema:citation property, which seems like a closer match for a concept meant to provide details of a single citation event. It still doesn't represent a count per se, but could be used to link to a citing CreativeWork.

I think it would be excellent to include references to Counter-compliant usage reporting in a schema.org record, but I think it would be disruptive to define new approaches to usage reporting. For an overview of the issues, see the short book by @dlowenberg et al. on Open Data Metrics.

ashepherd commented 3 years ago

Hey @mbjones, this (CCP) is what I'm implementing. Maybe I didn't describe it well enough. I want to do two things:

1) Harvest our event tracking data back from Google Analytics (as described in the CCP under their page tagging section) and store it in our knowledge graph, so I want an RDF data model to apply to it. schema.org Actions seemed like a decent place to start (as I don't see an RDF data model in the CCP).

2) Describe these view and download events in the dataset landing page's schema.org markup as schema:potentialAction, so that views and downloads can occur elsewhere in the web of data without having to come directly to our site. Example: an email with schema.org markup to download a dataset, a tweet that shows a download button for a tweeted dataset, etc.
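As a rough illustration of the second point, here is a sketch in Python that builds landing-page JSON-LD with schema:potentialAction entries. The function name, URLs, dataset name, and the `text/csv` content type are all hypothetical; only the schema.org terms (`potentialAction`, `ViewAction`, `DownloadAction`, `EntryPoint`, `urlTemplate`, `contentType`) come from the vocabulary.

```python
import json

def dataset_with_potential_actions(dataset_url: str, download_url: str) -> dict:
    """Build Dataset JSON-LD that advertises view and download actions."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": dataset_url,
        "name": "Example dataset",  # placeholder name
        "url": dataset_url,
        "potentialAction": [
            {
                "@type": "ViewAction",
                "target": {"@type": "EntryPoint", "urlTemplate": dataset_url},
            },
            {
                "@type": "DownloadAction",
                "target": {
                    "@type": "EntryPoint",
                    "urlTemplate": download_url,
                    "contentType": "text/csv",  # assumed file format
                },
            },
        ],
    }

# Hypothetical URLs, for illustration only.
doc = dataset_with_potential_actions(
    "https://example.org/dataset/1",
    "https://example.org/dataset/1/data.csv",
)
print(json.dumps(doc, indent=2))
```

A consumer such as an email client or a social media card could, in principle, read the `DownloadAction` target and offer the download without visiting the landing page.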

Does that make sense or am I missing something? I had read the Open Data Metrics book too, but I don't see a data model in there.

schema:citation is great for describing the citation, but to sum up citations for my case #1, you need an Action class to model the event of a dataset being cited by schema:object X. Then, harvesting that data over time, you can roll up your instances of those realized actions to build your COUNTER reports.
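A minimal sketch of that roll-up in Python, under stated assumptions: the input rows use a made-up shape (`kind`, `dataset`, `timestamp`), not the actual Google Analytics export format, and `event_to_action` and `roll_up` are hypothetical helper names. Citations are left out because the right Action class for them is still the open question here. The sketch also skips the COUNTER processing rules (robot filtering, session windows, repeat visits), so its output is not a compliant report.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed mapping from harvested event kinds to schema.org Action classes.
ACTION_TYPES = {"view": "ViewAction", "download": "DownloadAction"}

def event_to_action(event: dict) -> dict:
    """Turn one harvested event row into a realized schema.org Action."""
    return {
        "@type": ACTION_TYPES[event["kind"]],
        "object": {"@id": event["dataset"]},
        "startTime": event["timestamp"],
        "actionStatus": "https://schema.org/CompletedActionStatus",
    }

def roll_up(actions: Iterable[dict]) -> Dict[Tuple[str, str, str], int]:
    """Count actions per (dataset, YYYY-MM month, action type)."""
    counts: Counter = Counter()
    for action in actions:
        month = action["startTime"][:7]  # ISO 8601 timestamps start with "YYYY-MM"
        counts[(action["object"]["@id"], month, action["@type"])] += 1
    return dict(counts)

# Made-up events for illustration.
DS = "https://example.org/dataset/1"
events = [
    {"kind": "view", "dataset": DS, "timestamp": "2021-03-01T12:00:00Z"},
    {"kind": "view", "dataset": DS, "timestamp": "2021-03-15T08:30:00Z"},
    {"kind": "download", "dataset": DS, "timestamp": "2021-04-02T09:00:00Z"},
]
counts = roll_up(event_to_action(e) for e in events)
print(counts)
```

Keeping the individual Action instances in the knowledge graph and deriving counts on demand means the aggregation rules can change later without re-harvesting.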

NOTE: schema.org used to model these activities with schema:UserInteraction, schema:UserDownloads, etc., but they now recommend using the more flexible schema:Action model.