Closed FlorianRhiem closed 2 years ago
I have an issue with this one:
If it's remote, the @id should be a URL starting with http. See: https://www.researchobject.org/ro-crate/1.1/data-entities.html#web-based-data-entities
The motivation there was not to describe the remote resource, but the link that was created to it by a user. Multiple different users might at different times enter links for different objects with the same URL, but a different (optional) description and title. One single File entry for the URL is not enough to reflect that, so this "virtual" file entry was used as a representation of that link.
The RO Crate section on web-based data entities assumes that these are available on the web via HTTP. While this might be true for some of the links stored for an object in SampleDB, this is not the case for all of them, e.g. a file, sftp or smb link can be useful to document the location/URL of the linked content without that content being available. For http or https scheme links, availability is not checked at any time either.
Should these links still be exported and made available in the .eln file, or should they be left out as something that the RO Crate isn't able to reproduce?
As RO Crate File objects are MediaObject objects, I suppose one way to represent the link nature of it would be to replace the url property with the contentUrl property?
The issue here is that during import, this entry is referenced in a hasPart section as being a File, but it is not a File. So I think the @id should really be the http link instead of a relative path that doesn't exist. Note that there is no issue referencing a samba share or something other than http (https://www.iana.org/assignments/uri-schemes/prov/smb), but we need a way to know whether it's a local file in the archive or not.
The import code could also simply check for file existence and skip if it's not there...
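A minimal sketch of how an importer might combine both ideas (tell local from remote by the URI scheme of the @id, then skip local entries that are missing from the archive). This is not the actual import code of any ELN; the function names, the `root` parameter and the example paths are assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse


def is_remote_id(entity_id: str) -> bool:
    """Treat any @id with a URI scheme (http, https, smb, sftp, file, ...) as remote.

    Assumption: local @id values are relative POSIX paths without a colon
    before the first slash, so they never parse as having a scheme.
    """
    return bool(urlparse(entity_id).scheme)


def importable_files(graph, archive_names, root=""):
    """Return the File entities whose @id points at a file present in the archive.

    graph: list of JSON-LD entities from ro-crate-metadata.json
    archive_names: set of member names, e.g. set(zipfile.ZipFile(...).namelist())
    root: hypothetical folder inside the archive that contains the crate
    """
    result = []
    for entity in graph:
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "File" not in types:
            continue
        entity_id = entity["@id"]
        if is_remote_id(entity_id):
            continue  # a link, not a file shipped in the archive
        member = posixpath.normpath(posixpath.join(root, entity_id))
        if member in archive_names:
            result.append(entity)  # silently skip entries with no matching file
    return result


# Hypothetical example data
example_graph = [
    {"@id": "./objects/1/data.csv", "@type": "File"},
    {"@id": "./objects/1/missing.txt", "@type": "File"},
    {"@id": "smb://server/share/raw.dat", "@type": "File"},
]
example_names = {"crate/ro-crate-metadata.json", "crate/objects/1/data.csv"}
found = importable_files(example_graph, example_names, root="crate")
```

With this example data only `./objects/1/data.csv` would be imported: the smb entry is recognised as remote, and the missing relative path is skipped.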
I've removed these non-file File entries and moved them, together with information on the files, to an extra per-object files.json. I've also added a comments.json to contain comments left for the object. Now the @graph only contains the ro-crate-metadata.json CreativeWork, the root Dataset for ./, Dataset entries for individual objects (experiments, samples, etc.), Person entries for users, and File entries for actual files in the zip, with everything being referred to via hasPart or author in a graph starting with the root Dataset.
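To illustrate the layout described above, here is a small sketch of such a @graph and a check that every entity can be reached from the root Dataset via hasPart or author. The @id values, the user name and the folder layout are invented for the example and are not taken from the SampleDB export; the `about` link on the metadata descriptor follows the RO-Crate 1.1 convention.

```python
# Hypothetical, heavily shortened @graph in the shape described in the comment
graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
     "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset",
     "hasPart": [{"@id": "./objects/1/"}]},
    {"@id": "./objects/1/", "@type": "Dataset",
     "author": {"@id": "./users/1"},
     "hasPart": [{"@id": "./objects/1/files/0/data.csv"}]},
    {"@id": "./users/1", "@type": "Person", "name": "Example User"},
    {"@id": "./objects/1/files/0/data.csv", "@type": "File"},
]


def reachable_ids(graph, start="./"):
    """Collect all @id values reachable from `start` via hasPart or author."""
    by_id = {entity["@id"]: entity for entity in graph}
    seen = set()
    stack = [start]
    while stack:
        current = stack.pop()
        if current in seen or current not in by_id:
            continue
        seen.add(current)
        for prop in ("hasPart", "author"):
            refs = by_id[current].get(prop, [])
            if isinstance(refs, dict):  # a single reference instead of a list
                refs = [refs]
            stack.extend(ref["@id"] for ref in refs)
    return seen


unreachable = {entity["@id"] for entity in graph} - reachable_ids(graph)
print(unreachable)
```

In this example the only entity not reached from the root is the metadata descriptor itself, which points at the root through `about` rather than being part of it.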
On a related note regarding remote content: I assume we only save the location of the content (smb://...) but never any user credential information. And then we hope that the receiver of the RO-Crate also has access to the content. Is that correct?
Yes, that's probably best.
Can we resolve this PR? We accept the PR as "work in progress", just as all the other ELN software solutions are "work in progress". In the future, @FlorianRhiem can just push his example into the example folder (just as the other partners do).
I believe it's best to make a PR for visibility of the change, and we can also discuss it better there. But I agree that we can merge WIP stuff, no trouble here (and small changes can be pushed directly).
@FlorianRhiem I think for the comments we should use https://schema.org/Comment. And your author_id would instead become a Person in the author property. eLabFTW also has comments, and I'll work on adding them too (current examples don't have comments; EDIT: done). The comments are a very good example of a common property that is standardized :) The idea is to minimize the information dumped in random JSON files and add as much as possible to the metadata JSON file.
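As a rough sketch of that suggestion, a comment could be turned into a schema.org Comment entity whose author is a reference to a Person entity in the @graph. Only `author_id` comes from the discussion; the other input fields (`content`, `utc_datetime`), the @id patterns and the function name are assumptions for illustration. `text`, `dateCreated`, `author` and `comment` are existing schema.org properties.

```python
def comment_to_entity(comment, object_id, index):
    """Convert one comment dict into a schema.org Comment entity.

    comment: dict with 'author_id' (from the discussion) plus the
             hypothetical fields 'content' and 'utc_datetime'
    object_id: @id of the Dataset the comment belongs to, e.g. './objects/1/'
    index: position of the comment, used to build a unique @id
    """
    return {
        "@id": f"{object_id}comments/{index}",
        "@type": "Comment",
        "text": comment["content"],
        "dateCreated": comment["utc_datetime"],
        # reference the Person entity instead of storing a bare author_id
        "author": {"@id": f"./users/{comment['author_id']}"},
    }


# Hypothetical example data
dataset = {"@id": "./objects/1/", "@type": "Dataset"}
raw_comments = [
    {"author_id": 1, "content": "Sample looks fine.",
     "utc_datetime": "2023-01-01T12:00:00"},
]

comment_entities = [
    comment_to_entity(c, dataset["@id"], i) for i, c in enumerate(raw_comments)
]
# Link the Dataset to its comments so they are reachable in the graph
dataset["comment"] = [{"@id": e["@id"]} for e in comment_entities]
```

The Comment entities would then be appended to the @graph next to the Dataset, so an importing ELN can read them from ro-crate-metadata.json instead of a separate comments.json.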
A pic of how it currently looks after an import in elab:
Until now we have not discussed any field naming conventions. Since you have now started talking about 'comment' and 'person', I am moving this topic to the Discussions, since it relates not only to this PR but to the .eln format in general.
Here's a work-in-progress example of an .eln file generated by SampleDB. Feedback on the structure and possible improvements would be welcome.