TheELNConsortium / TheELNFileFormat

Specification for the ELN File Format
MIT License

RSpace eln archive #53

Closed: nhanlon2 closed this 7 months ago

nhanlon2 commented 7 months ago

Hello,

Do you think it would be doable on your side to add the sha256 property to files? (https://schema.org/sha256)

I'm asking this because eLabFTW currently refuses to ingest a .eln with files that have no shasum (or an invalid one). This restriction was added after a pentester managed to manipulate the metadata file to extract the content of an arbitrary file. And I won't be able to remove it without making sure it doesn't open a vulnerability.

So if you can add the sha256 it's great. It also allows checking for data corruption so it's a plus!

Hi, I have added the sha256 property.
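
For anyone wanting to check an archive before importing it, here is a minimal sketch of the checksum comparison. It assumes the .eln is a zip with a single root folder holding ro-crate-metadata.json, and that each File node's @id is a path relative to that folder. The function name `find_checksum_problems` is made up for illustration; this is not how eLabFTW or RSpace implement it.

```python
import hashlib
import json
import zipfile


def find_checksum_problems(archive):
    """Return (file_id, reason) pairs for File nodes that fail the sha256 check.

    `archive` is a path or file-like object accepted by zipfile.ZipFile.
    """
    problems = []
    with zipfile.ZipFile(archive) as zf:
        # Assumption: the metadata file sits directly inside the root folder.
        meta_name = next(
            n for n in zf.namelist() if n.endswith("/ro-crate-metadata.json")
        )
        root = meta_name.rsplit("/", 1)[0]
        graph = json.loads(zf.read(meta_name))["@graph"]
        for node in graph:
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "File" not in types:
                continue
            expected = node.get("sha256")
            if not expected:
                problems.append((node["@id"], "missing sha256"))
                continue
            # Assumption: "./data/a.txt" maps to "<root>/data/a.txt" in the zip.
            member = f"{root}/{node['@id'].removeprefix('./')}"
            try:
                data = zf.read(member)
            except KeyError:
                problems.append((node["@id"], "file not found in archive"))
                continue
            if hashlib.sha256(data).hexdigest() != expected.lower():
                problems.append((node["@id"], "sha256 mismatch"))
    return problems
```

An empty list means every File node carried a sha256 that matched the bytes in the archive. The sketch reads each file fully into memory, so it would need streaming for large attachments.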

jmanideep commented 7 months ago

Hello @nhanlon2 ,

We tried to import your example .eln file in Kadi4Mat and found the following issues.

As per the specification:

  1. The name of the folder inside the archive MUST be the same as the archive name. Here, the archive name and the folder name are different.
  2. Inside the root folder, there MUST be only a metadata file (i.e., ro-crate-metadata.json) and 0 or more folders. Here, there are some other files inside the root.
  3. The keywords in the other examples are a list of strings. Here, it is a single string. This should be discussed in the next meeting, as schema.org is not 100% clear about the value type for keywords.
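
Points 1 and 2 can be checked mechanically from the list of names inside the zip. The sketch below is one possible reading of those two rules as stated above, not the check Kadi4Mat actually runs; `check_layout` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def check_layout(archive_filename, member_names):
    """Return a list of human-readable layout issues (empty if none found)."""
    issues = []
    # Rule 1: the single root folder must be named after the archive.
    expected_root = PurePosixPath(archive_filename).stem
    roots = {
        PurePosixPath(n).parts[0] for n in member_names if PurePosixPath(n).parts
    }
    if roots != {expected_root}:
        issues.append(
            f"root folder(s) {sorted(roots)} do not match archive name {expected_root!r}"
        )
    # Rule 2: the root folder holds only ro-crate-metadata.json and folders.
    has_metadata = False
    for name in member_names:
        if name.endswith("/"):
            continue  # explicit directory entry in the zip listing
        parts = PurePosixPath(name).parts
        if len(parts) == 2:
            if parts[1] == "ro-crate-metadata.json":
                has_metadata = True
            else:
                issues.append(f"unexpected file in root folder: {name}")
    if not has_metadata:
        issues.append("ro-crate-metadata.json missing from root folder")
    return issues
```

It could be fed with `zipfile.ZipFile(path).namelist()` and the archive's file name.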

NicolasCARPi commented 7 months ago

Yeah, for keywords a list of strings is better. But schema.org says:

Multiple textual entries in a keywords list are typically delimited by commas, or by repeating the property.

So I guess they assume a comma-separated list, and using it as an array is wrong. Personally, I think readers can support both, and writers can stick to whatever makes more sense. I prefer the "list" approach, also because it removes the need to agree on a separator (in eLab the tag separator can be |, and , is accepted inside a tag). We can discuss that at our next meeting (12th of December).
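
As a rough idea of what "readers can support both" could look like, here is a small sketch. `normalize_keywords` is a hypothetical name, and the default comma separator is an assumption taken from the schema.org wording; it is not eLabFTW's code.

```python
def normalize_keywords(value, separator=","):
    """Accept either a list of strings or one delimited string; return a list."""
    if value is None:
        return []
    if isinstance(value, str):
        # Single-string form: split on the agreed separator.
        items = value.split(separator)
    else:
        # List form: keep each entry whole, even if it contains the separator.
        items = list(value)
    return [item.strip() for item in items if item and item.strip()]
```

With the list form, a tag such as "x, y" survives intact, which is the separator problem mentioned above; with the string form it would be split in two.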

nhanlon2 commented 7 months ago

Hi @jmanideep,

RSpace generates archives whose names exactly match the folder inside; I deliberately renamed this archive to something more human-readable, and I can change it back if this is a problem. @NicolasCARPi shall I open a new PR? As regards the additional files inside the archive, those files are specific to RSpace's existing import/export mechanism. The ro-crate-metadata.json contains all of the same information and is the only file that is required by consumers of the .eln archive (except RSpace). If I remove these files, I remove RSpace's ability to import these .eln archives.

NicolasCARPi commented 7 months ago

@nhanlon2 yes, new changes in new PR please.

nhanlon2 commented 7 months ago

@jmanideep @NicolasCARPi Hi - I have opened a new pull request. https://github.com/TheELNConsortium/TheELNFileFormat/pull/54

I realised that what I said earlier was wrong: I am able to move all files inside folders to conform to the spec.