nfdi4plants / isa-ro-crate-profile

What does `sameAs` point to in `LabProtocol`? #13

Closed HLWeil closed 2 months ago

HLWeil commented 5 months ago

The Uri property of the ISA protocol is already covered by the url field inherited from schema.org Thing.

floWetzels commented 4 months ago

I'm not sure about this and can't remember why we added both. Could be that one is meant to point to a file within an ARC that describes the protocol whereas the other points to a true external resource like a website? Do you remember, @stuzart? Is it even a realistic scenario that both exist?

stuzart commented 3 months ago

@floWetzels yes, that's how I remember it too: it was meant to point to an actual file documenting the protocol. I don't think we were too sure about it at the time, but left it as it was. Either way, the description is wrong and was just copied from schema.org. I also checked the original Google Doc and the proposed changes to the Bioschemas LabProtocol, and the description is the same there, so that is probably where it was copied from.

An alternative to sameAs might be hasPart, inherited from CreativeWork via HowTo, which could point to a dataset (or File in RO-Crate) representing the file, just as we do for Assays?

floWetzels commented 3 months ago

So are you saying that sameAs is necessary and is something different from url? Your description, "an actual file documenting the protocol", basically matches what the description of url in the profile says. I'm a bit confused. @stuzart

stuzart commented 3 months ago

The problem with url is that if you include an actual file in an RO-Crate, it needs to be referenced as a File (an alias for MediaObject), which wouldn't be compatible with url but would be with hasPart. url could be used to point to something elsewhere, e.g. protocols.io, but not if you just want to include a doc or PDF file.

It may even be clearer if we just remove url and use hasPart for both cases. The RO-Crate docs describe using File for both files and web-based entities: https://www.researchobject.org/ro-crate/1.1/data-entities.html

floWetzels commented 3 months ago

I don't think that we need to treat every file as a data entity. As far as I can tell from the docs, we are free to choose whether a file or a directory is a data entity. It basically becomes one by being connected to the root data entity through hasPart. So it should be perfectly fine to link to other files or external sources via other properties. Am I wrong here?

stuzart commented 3 months ago

Actually, I've looked into it a bit more and yes, you're right, we just need url. If it is a file, we just need to give the filename as the @id and use a @type of both File and LabProtocol, e.g.

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      ....
}

and then reference my-lovely-protocol.pdf from the root entity with hasPart.

So we can drop sameAs from the profile and don't need anything to replace it.

Sorry, I keep forgetting this RO-Crate convention of using the @id as the file reference rather than a more explicit property.

HLWeil commented 3 months ago

So we would have 2 distinct ways to reference a digital protocol resource:

Or did I get this wrong?

stuzart commented 3 months ago

It's a bit confusing and I'm trying to get clarification. The spec itself is known to be confusing, and is being improved for RO-Crate 1.2. But the gist of it is that:

In all cases there must be a hasPart referencing the @id.

In our case, where it is an external web resource, I don't see any harm in also including url to point to it, which I think would make it fit with the Bioschemas profile and, I'd imagine, make it easier for parsers.
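
The hasPart rule above can be checked mechanically. The following is a minimal sketch in Python, not part of the profile or of any RO-Crate library: the helper names `types_of` and `unreferenced_files` are hypothetical, the root data entity is assumed to have the @id "./", and only direct hasPart references from the root are considered (nested datasets are ignored).

```python
def types_of(entity):
    """Return the entity's @type as a list (JSON-LD also allows a bare string)."""
    value = entity.get("@type", [])
    return [value] if isinstance(value, str) else list(value)


def unreferenced_files(graph, root_id="./"):
    """List the @id of every File entity the root dataset does not reference.

    Assumption: the root data entity has @id "./" and lists its parts
    directly in hasPart; indirect references are not followed.
    """
    root = next(e for e in graph if e.get("@id") == root_id)
    parts = root.get("hasPart", [])
    if isinstance(parts, dict):  # a single reference may not be wrapped in a list
        parts = [parts]
    referenced = {part["@id"] for part in parts}
    return [
        e["@id"]
        for e in graph
        if "File" in types_of(e) and e["@id"] not in referenced
    ]


# Illustrative graph: the local file is referenced, the web resource is not.
graph = [
    {"@id": "./", "@type": "Dataset",
     "hasPart": [{"@id": "my-lovely-protocol.pdf"}]},
    {"@id": "my-lovely-protocol.pdf", "@type": ["File", "LabProtocol"]},
    {"@id": "https://somewhere.com/my-lovely-protocol.pdf",
     "@type": ["File", "LabProtocol"],
     "url": "https://somewhere.com/my-lovely-protocol.pdf"},
]

print(unreferenced_files(graph))
```

An empty result would mean that every File entity in the graph is reachable from the root through hasPart.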

stuzart commented 3 months ago

My suggestion is that we include url and make it non-mandatory, but recommended for an external web source. At the same time, the Bioschemas LabProtocol profile (which is where sameAs originally came from) should replace sameAs with url.

So in the ro-crate, for a file it would appear as:

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      ....
}

For an external web resource:

{
      "@id": "https://somewhere.com/my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      "url": "https://somewhere.com/my-lovely-protocol.pdf",
      ....
}

and if mixed (a local file that also has an external web location), then:

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      "url": "https://somewhere.com/my-lovely-protocol.pdf",
      ....
}

The root RO-Crate Dataset would include the following. This isn't necessarily part of the profile; it is just part of the RO-Crate spec:

"hasPart": [
        {
          "@id": "my-lovely-protocol.pdf"
        }
      ]
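
The cases above can be generated programmatically. This is a minimal sketch in Python, assuming only the structure shown in this thread: the helper names `lab_protocol_entity` and `root_dataset` are hypothetical, the root @id "./" is assumed, and other properties a real crate would carry (name, description, the metadata descriptor) are left out.

```python
import json


def lab_protocol_entity(entity_id, url=None):
    """Build a File + LabProtocol entity; url is optional (hypothetical helper)."""
    entity = {"@id": entity_id, "@type": ["File", "LabProtocol"]}
    if url is not None:
        # Recommended when the protocol also lives at an external web location.
        entity["url"] = url
    return entity


def root_dataset(entities, root_id="./"):
    """Build a root Dataset whose hasPart references every given entity."""
    return {
        "@id": root_id,  # assumption: the conventional root identifier
        "@type": "Dataset",
        "hasPart": [{"@id": e["@id"]} for e in entities],
    }


web_location = "https://somewhere.com/my-lovely-protocol.pdf"

# 1. File shipped inside the crate: the @id is the relative path, no url.
local = lab_protocol_entity("my-lovely-protocol.pdf")

# 2. External web resource: the @id is the URL, repeated in url.
external = lab_protocol_entity(web_location, url=web_location)

# 3. Mixed: local file that also has an external location.
mixed = lab_protocol_entity("my-lovely-protocol.pdf", url=web_location)

root = root_dataset([local, external])
print(json.dumps([root, local, external], indent=2))
```

Keeping url optional in the helper mirrors the suggestion that it be recommended rather than mandatory, while hasPart is always generated from the @id.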

floWetzels commented 2 months ago

Fixed by PR https://github.com/nfdi4plants/isa-ro-crate-profile/pull/18