How should a Research Object be identified?

junszhao commented 10 years ago

We have explored the option of a PURL, a DOI, or some local de-referenceable URIs. We have also explored the option of providing machine-processable content for an RO URI.

But what is the best option for identifying a Research Object? And what are our options?

Before making a decision, what do we want to achieve through providing URIs for ROs?

What are our requirements for providing an identifier for a Research Object?

A machine-processable representation of the RO can be retrieved when the URI is de-referenced, through content negotiation
The URI is persistent and can be used to cite and refer to an RO
The URI follows a community consensus or recommendation
The URI scheme can allow us to retrieve different versions of an RO
The URI scheme can provide some trust in the RO

Any others?

Do we have a fit for all yet?
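
To make the content negotiation requirement above concrete, here is a minimal client-side sketch. It only builds the HTTP request and does not send it. The `build_ro_request` helper and the `MEDIA_TYPES` table are hypothetical names for illustration, and whether a given RO URI actually serves these representations is an assumption.

```python
from urllib.request import Request

# Hypothetical shorthand names for representations a client might ask for.
MEDIA_TYPES = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "html": "text/html",
}


def build_ro_request(ro_uri: str, fmt: str = "turtle") -> Request:
    """Build (but do not send) a request asking for one representation of an RO."""
    request = Request(ro_uri)
    # The Accept header is what drives content negotiation on the server side.
    request.add_header("Accept", MEDIA_TYPES[fmt])
    return request


request = build_ro_request("http://purl.org/net/svm-opt-research-object", "turtle")
```

Passing the request to `urllib.request.urlopen` would perform the actual de-referencing; a human-facing client would ask for `html` on the same URI instead.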

Bultako commented 10 years ago

I think it may be worth having a look at Trusty URIs, which make web-based resources identified by URIs "verifiable, immutable, and permanent".

http://arxiv.org/abs/1401.5775v2.pdf http://www.slideshare.net/TobiasKuhn/trustyuris

I guess this is where the W3C should be going with respect to linked data persistency.

kmhettne commented 10 years ago

I think all the points you listed are important, and I would order them like this:

The URI is persistent and can be used to cite and refer to an RO (must)
The URI follows a community consensus or recommendation (should)
A machine-processable representation of the RO can be retrieved when the URI is de-referenced, through content negotiation (should)
The URI scheme can provide some trust in the RO (nice to have)
The URI scheme can allow us to retrieve different versions of an RO (nice to have)

julian-garrido commented 10 years ago

Another requirement:

The URI scheme can allow us to cite an element that belongs to the RO.

dgarijo commented 10 years ago

So far I have been using PURL plus content negotiation, which covers points 1 and 3 from Kristina's list (e.g., http://purl.org/net/svm-opt-research-object), although we do not have any agreement on how to do this. In my case I use PURL because it is free. An alternative would be to integrate it with systems like FigShare, which provide the means to clearly cite the RO or its contents. Regarding point 2, the community has had discussions for a long time without agreement on URI schemes (for example, the use of opaque URIs vs. meaningful URIs), and I think it will be difficult. Point 4 could be covered with an approach such as the one proposed by Pique.

junszhao commented 10 years ago

@Bultako Thanks! You are the second or third person I have heard say the same thing /this week/ :)

@kmhettne Thanks! It is really good to see your input. Would you say your preferences reflect the thinking from a user's point of view? Or better, a scientist's? :)

@dgarijo Thanks for your input! We don't expect the discussion to help us decide on a specific URI scheme to use. I think we are not there yet, for the entire scholarly communication community. But we want to solicit people's needs when they provide URIs for an RO. For example, why did you use PURL? To gain persistence? To gain content negotiation? Because it is free? Anything else?

rapw3k commented 10 years ago

I think it would also be good to take a look at our discussions during the RO API specification, where we discussed how to dereference an RO [1]. We also had some initial discussions on how to retrieve different versions of an RO, which I summarised at the time in [2], although this was never finally agreed. The final specification of the API (implemented in ROHub, and partially in myExperiment) thus enables different kinds of content negotiation, as in [3]. I would be happy to contribute to the continuation of these ideas; maybe we can discuss this a bit more in our next telco.

[1] http://www.wf4ever-project.org/wiki/display/docs/RO+dereferencing
[2] http://www.wf4ever-project.org/wiki/pages/viewpage.action?pageId=3506725
[3] http://www.wf4ever-project.org/wiki/display/docs/RO+API+6#ROAPI6-GetaResearchObject

AlasdairGray commented 10 years ago

I would say that having a URI that provides trust in the data returned is more than a nice to have. The trusty URI approach would be a significant bonus in this regard.

pdrlps commented 10 years ago

URIs with some kind of "hashing" security mechanism to ensure data in the RO is valid.
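
As a rough illustration of the hashing idea, the sketch below appends a digest of the RO content to its URI so that anyone holding the content can check it against the identifier. This is not the Trusty URI algorithm, which defines its own encoding rules; the function names, the `.` separator, and the choice of SHA-256 are all assumptions made for the example.

```python
import base64
import hashlib


def content_digest(content: bytes) -> str:
    """Return a URL-safe, unpadded base64 SHA-256 digest of the content."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def make_hashed_uri(base_uri: str, content: bytes) -> str:
    """Append the content digest to the base URI (hypothetical '.' separator)."""
    return f"{base_uri}.{content_digest(content)}"


def verify_hashed_uri(uri: str, content: bytes) -> bool:
    """Check that the digest suffix of the URI matches the given content."""
    # URL-safe base64 never contains '.', so the last '.' starts the digest.
    _, _, suffix = uri.rpartition(".")
    return suffix == content_digest(content)


ro_bytes = b"example RO manifest"  # stand-in for a serialised RO
uri = make_hashed_uri("http://example.org/ro/demo", ro_bytes)
```

With such a scheme, any change to the content yields a different identifier, so an identifier of this kind also pins one specific version of the RO.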

junszhao commented 10 years ago

Folks, some related literature here:

Persistent #Identifiers for Scholarly Assets and the Web: The Need for an Unambiguous Mapping http://t.co/sYtEc7Td0v
10 non-trivial things @github, @figshare, @MozillaScience, and @ZENODO_ORG can do for science: http://t.co/c0ZIJJzSiq

Versioning seems another critical issue here.

micheldumontier commented 10 years ago

For RO, I would say

persistent
verifiable
versioned

I think you'd need one URI to provide a description, and another to return the object.
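
One possible reading of this, sketched under assumptions: each version of an RO gets its own URI for the object, plus a separate URI for its description. The `/v{n}` and `/description` path patterns and the `ROIdentifier` class are hypothetical; nothing in this thread fixes a layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ROIdentifier:
    """Hypothetical versioned identifier for a Research Object."""

    base_uri: str
    version: int

    @property
    def object_uri(self) -> str:
        # URI that would return the object itself for this version.
        return f"{self.base_uri}/v{self.version}"

    @property
    def description_uri(self) -> str:
        # Separate URI that would return a description of this version.
        return f"{self.object_uri}/description"

    def next_version(self) -> "ROIdentifier":
        # A new version gets a new identifier; earlier ones stay unchanged.
        return ROIdentifier(self.base_uri, self.version + 1)


ro_v1 = ROIdentifier("http://example.org/ro/demo", 1)
ro_v2 = ro_v1.next_version()
```

Keeping the identifier immutable means a citation of `ro_v1.object_uri` keeps pointing at the same version after `ro_v2` is published.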

kmhettne commented 10 years ago

Identifiers.org (http://dev.identifiers.org/) came up in a discussion going on in our lab regarding how to reference data. Did you have a look at that already? Did not see anything on their site regarding versioning though.
