ResearchObject / ro-crate-py

Python library for RO-Crate
https://pypi.org/project/rocrate/
Apache License 2.0

BUG: Duplicate references in hasPart on root dataset cause an error on loading crate #159

Closed ptsefton closed 11 months ago

ptsefton commented 1 year ago

Leaving aside that duplicate entries in a hasPart property probably indicate a bug elsewhere, and that their handling is undefined in the spec, it would be better if ro-crate-py didn't crash when loading such a crate.

Input crate

{
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            },
            "conformsTo": {
                "@id": "https://w3id.org/ro/crate/1.1"
            }
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "datePublished": "2020-06-25 17:03:04.098286",
            "hasPart": [
                {
                    "@id": "test_galaxy_wf.ga"
                },
                {
                    "@id": "test_galaxy_wf.ga"
                },
                {
                    "@id": "abstract_wf.cwl"
                },
                {
                    "@id": "test_file_galaxy.txt"
                },
                {
                    "@id": "https://raw.githubusercontent.com/ResearchObject/ro-crate-py/master/test/test-data/sample_file.txt"
                },
                {
                    "@id": "examples/"
                },
                {
                    "@id": "test/"
                }
            ],
            "mainEntity": {
                "@id": "test_galaxy_wf.ga"
            }
        },
        {
            "@id": "ro-crate-preview.html",
            "@type": "CreativeWork",
            "about": {
                "@id": "./"
            }
        },
        {
            "@id": "test_galaxy_wf.ga",
            "@type": [
                "File",
                "ComputationalWorkflow",
                "SoftwareSourceCode"
            ],
            "programmingLanguage": {
                "@id": "https://galaxyproject.org"
            },
            "subjectOf": {
                "@id": "abstract_wf.cwl"
            }
        },
        {
            "@id": "abstract_wf.cwl",
            "@type": [
                "File",
                "SoftwareSourceCode",
                "ComputationalWorkflow"
            ]
        },
        {
            "@id": "test_file_galaxy.txt",
            "@type": "File"
        },
        {
            "@id": "#joe",
            "@type": "Person",
            "name": "Joe Bloggs"
        },
        {
            "@id": "https://raw.githubusercontent.com/ResearchObject/ro-crate-py/master/test/test-data/sample_file.txt",
            "@type": "File"
        },
        {
            "@id": "examples/",
            "@type": "Dataset"
        },
        {
            "@id": "test/",
            "@type": "Dataset"
        }
    ]
}

Code

from rocrate.rocrate import ROCrate
crate = ROCrate("temp")

ERROR:

python test.py
test_galaxy_wf.ga
test_galaxy_wf.ga
Traceback (most recent call last):
  File "/Users/pt/working/language-research-technology/ocfl-ro-crate-summarizer/test.py", line 5, in <module>
    crate = ROCrate("temp")
            ^^^^^^^^^^^^^^^
  File "/Users/pt/working/language-research-technology/ocfl-ro-crate-summarizer/env/lib/python3.11/site-packages/rocrate/rocrate.py", line 85, in __init__
    source = self.__read(source, gen_preview=gen_preview)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/pt/working/language-research-technology/ocfl-ro-crate-summarizer/env/lib/python3.11/site-packages/rocrate/rocrate.py", line 124, in __read
    self.__read_data_entities(entities, source, gen_preview)
  File "/Users/pt/working/language-research-technology/ocfl-ro-crate-summarizer/env/lib/python3.11/site-packages/rocrate/rocrate.py", line 148, in __read_data_entities
    entity = entities.pop(id_)
             ^^^^^^^^^^^^^^^^^
KeyError: 'test_galaxy_wf.ga'
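
The traceback is consistent with each hasPart reference being popped from a mapping keyed by @id, so the second occurrence of the same @id finds nothing left to pop. A minimal sketch of that failure mode in plain Python follows; `read_parts` and `read_parts_tolerant` are hypothetical stand-ins written for illustration, not ro-crate-py's actual internals.

```python
def read_parts(entities, has_part):
    """Pop each referenced entity once; mimics the pattern seen in the traceback."""
    loaded = []
    for ref in has_part:
        # Raises KeyError when the same @id is referenced a second time,
        # because the first pop already removed it from the mapping.
        loaded.append(entities.pop(ref["@id"]))
    return loaded


def read_parts_tolerant(entities, has_part):
    """Same loop, but skip references whose @id was already consumed."""
    loaded = []
    seen = set()
    for ref in has_part:
        id_ = ref["@id"]
        if id_ in seen:
            continue  # duplicate reference: ignore instead of failing
        seen.add(id_)
        loaded.append(entities.pop(id_))
    return loaded
```

Called with the hasPart list from the input crate, the first function fails on the repeated "test_galaxy_wf.ga" reference, while the second returns each entity once.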

simleo commented 11 months ago

This is now fixed as a side effect of #161.
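
For anyone who has to load such a crate with a version that lacks the fix, one possible workaround is to deduplicate hasPart in ro-crate-metadata.json before loading. The sketch below uses only the standard library; `dedupe_has_part` and `dedupe_crate_metadata` are hypothetical helper names, not part of ro-crate-py, and the sketch assumes the metadata file sits at the top level of the crate directory.

```python
import json
from pathlib import Path


def dedupe_has_part(metadata):
    """Drop repeated @id references from every hasPart list, keeping first occurrences."""
    for entity in metadata.get("@graph", []):
        parts = entity.get("hasPart")
        if not isinstance(parts, list):
            continue  # absent or a single reference: nothing to deduplicate
        seen = set()
        unique = []
        for ref in parts:
            id_ = ref.get("@id") if isinstance(ref, dict) else None
            if id_ is not None and id_ in seen:
                continue  # repeated reference to the same entity
            if id_ is not None:
                seen.add(id_)
            unique.append(ref)
        entity["hasPart"] = unique
    return metadata


def dedupe_crate_metadata(crate_dir):
    """Rewrite ro-crate-metadata.json in place with deduplicated hasPart lists."""
    path = Path(crate_dir) / "ro-crate-metadata.json"
    metadata = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(dedupe_has_part(metadata), indent=4), encoding="utf-8")
```

With the crate from this issue, calling `dedupe_crate_metadata("temp")` before `ROCrate("temp")` would leave a single "test_galaxy_wf.ga" entry in hasPart. Note that it overwrites the metadata file, so working on a copy may be safer.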