OCR-D / quiver-back-end

The back end of the OCR-D quality dashboard webapp.
MIT License

Dependency management #14

Closed mweidling closed 2 years ago

mweidling commented 2 years ago

Description

This PR introduces two new attributes to the Repo class: dependencies and dependency_conflicts. In repos.json they have the following structure (the example shows the output for cor-asv-ann):

"dependencies": {
    "Keras": "2.3.1",
    "Keras-Applications": "1.0.8",
    "Keras-Preprocessing": "1.1.2",
    "Markdown": "3.3.7",
    "absl-py": "1.1.0",
    "astor": "0.8.1",
    "cycler": "0.11.0",
    "editdistance": "0.6.0",
    "fonttools": "4.33.3",
    "gast": "0.2.2",
    "google-pasta": "0.2.0",
    "grpcio": "1.47.0",
    "h5py": "2.10.0",
    "kiwisolver": "1.4.3",
    "matplotlib": "3.5.2",
    "numpy": "1.18.5",
    "ocrd-cor-asv-ann": "ocrd-cor-asv-ann",
    "opt-einsum": "3.3.0",
    "packaging": "21.3",
    "protobuf": "4.21.2",
    "pyparsing": "3.0.9",
    "python-dateutil": "2.8.2",
    "scipy": "1.7.3",
    "six": "1.16.0",
    "tensorboard": "1.15.0",
    "tensorflow-estimator": "1.15.1",
    "tensorflow-gpu": "1.15.5",
    "termcolor": "1.1.0"
},
"dependency_conflicts": {
    "absl-py": {
        "cor-asv-ann": "1.1.0",
        "eynollah": "1.1.0",
        "ocrd_anybaseocr": "1.1.0",
        "ocrd_calamari": "1.1.0",
        "ocrd_keraslm": "1.1.0",
        "ocrd_kraken": "1.1.0",
        "ocrd_pc_segmentation": "0.15.0",
        "sbb_binarization": "1.1.0"
    },
    "h5py": {
        "cor-asv-ann": "2.10.0",
        "eynollah": "3.7.0",
        "ocrd_anybaseocr": "3.7.0",
        "ocrd_calamari": "3.7.0",
        "ocrd_keraslm": "2.10.0",
        "ocrd_pc_segmentation": "3.1.0",
        "sbb_binarization": "3.7.0"
    },
    "protobuf": {
        "cor-asv-ann": "4.21.2",
        "eynollah": "3.19.4",
        "ocrd_anybaseocr": "3.19.4",
        "ocrd_calamari": "3.19.4",
        "ocrd_keraslm": "4.21.2",
        "ocrd_kraken": "3.19.4",
        "ocrd_pc_segmentation": "3.19.4",
        "sbb_binarization": "3.19.4"
    },
    "tensorboard": {
        "cor-asv-ann": "1.15.0",
        "eynollah": "2.9.1",
        "ocrd_anybaseocr": "2.9.1",
        "ocrd_calamari": "2.9.1",
        "ocrd_keraslm": "1.15.0",
        "ocrd_kraken": "2.9.1",
        "ocrd_pc_segmentation": "2.9.1",
        "sbb_binarization": "2.9.1"
    },
    "tensorflow-estimator": {
        "cor-asv-ann": "1.15.1",
        "eynollah": "2.9.0",
        "ocrd_anybaseocr": "2.9.0",
        "ocrd_calamari": "2.9.0",
        "ocrd_keraslm": "1.15.1",
        "ocrd_pc_segmentation": "2.5.0",
        "sbb_binarization": "2.9.0"
    }
}

The output stated is generated by

This PR produces two output files: deps.json and dep_conflicts.json. The former lists all dependencies per OCR-D project, while the latter shows which packages have been installed by several OCR-D projects in different versions. In all cases the dependencies of OCR-D/core are omitted because we assume that most Python-based OCR-D projects use it. Both files are auxiliary files used by the Repo class and will be updated on demand (TODO).
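A minimal sketch of the "omit OCR-D/core dependencies" step, assuming the per-project dependencies are already available as a mapping and that filtering happens by package name only. The function name `omit_core_dependencies`, the `core_packages` set, and the `click` entry are hypothetical and only serve as illustration; the actual implementation in this PR may differ.

```python
import json


def omit_core_dependencies(deps_per_project, core_packages):
    """Return a copy of the mapping without packages that come with OCR-D/core."""
    return {
        project: {
            package: version
            for package, version in deps.items()
            if package not in core_packages
        }
        for project, deps in deps_per_project.items()
    }


# Hypothetical input: "click" stands in for a package provided via OCR-D/core.
raw = {
    "cor-asv-ann": {"h5py": "2.10.0", "click": "8.1.3"},
    "eynollah": {"h5py": "3.7.0", "click": "8.1.3"},
}
deps = omit_core_dependencies(raw, core_packages={"click"})

# Serialized form, as it might be written to deps.json.
deps_json = json.dumps(deps, indent=4, sort_keys=True)
```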

Repo.dependencies holds the full list of a project's dependencies. There is no use case for this information yet, so we might decide to drop it. Repo.dependency_conflicts records which projects have the same dependency installed in different major versions. We rely on packages implementing semantic versioning correctly and assume that a difference in major version means there are breaking changes between the two versions. Cases where two or more projects have installed the same package in different minor or patch versions are ignored.
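The major-version rule described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the helper names `major` and `find_conflicts` are hypothetical, and the major version is taken to be everything before the first dot. The sample versions are copied from the example output above.

```python
from collections import defaultdict


def major(version):
    """Return the major component, i.e. everything before the first dot."""
    return version.split(".", 1)[0]


def find_conflicts(deps_per_project):
    """Map each package to {project: version} if the major versions differ."""
    by_package = defaultdict(dict)
    for project, deps in deps_per_project.items():
        for package, version in deps.items():
            by_package[package][project] = version
    return {
        package: versions
        for package, versions in by_package.items()
        # More than one distinct major version means a conflict;
        # differences in minor or patch versions alone are ignored.
        if len({major(v) for v in versions.values()}) > 1
    }


sample = {
    "eynollah": {
        "absl-py": "1.1.0",
        "h5py": "3.7.0",
        "protobuf": "3.19.4",
        "tensorflow-estimator": "2.9.0",
    },
    "ocrd_pc_segmentation": {
        "absl-py": "0.15.0",
        "h5py": "3.1.0",
        "protobuf": "3.19.4",
        "tensorflow-estimator": "2.5.0",
    },
}
conflicts = find_conflicts(sample)
```

With only these two projects, absl-py is reported (major versions 1 and 0), while h5py and tensorflow-estimator differ only in their minor versions and are therefore skipped.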

How to test it

Closes #11.

mweidling commented 2 years ago

@paulpestov Could you please give me feedback regarding the data structure?

paulpestov commented 2 years ago

I think the structure is alright. Maybe one thing: could we switch the key/value of version/project, so the frontend can recognize the affected projects more easily?

{
 "tensorflow": "1.0.0"
}

mweidling commented 2 years ago

> I think the structure is alright. Maybe one thing: could we switch the key/value of version/project, so the frontend can recognize the affected projects more easily?
>
> {
>  "tensorflow": "1.0.0"
> }

You mean within dependency_conflicts? Sure!
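
One possible reading of the requested switch, sketched under assumptions: each entry in dependency_conflicts is regrouped so that the version becomes the key and the affected projects become the value. The thread does not show the structure that was finally adopted, and the function name `group_by_version` is hypothetical. The sample values are the h5py entry from the example output above.

```python
def group_by_version(conflict):
    """Turn {project: version} into {version: [projects]}."""
    grouped = {}
    for project, version in conflict.items():
        grouped.setdefault(version, []).append(project)
    return grouped


h5py_conflict = {
    "cor-asv-ann": "2.10.0",
    "eynollah": "3.7.0",
    "ocrd_anybaseocr": "3.7.0",
    "ocrd_calamari": "3.7.0",
    "ocrd_keraslm": "2.10.0",
    "ocrd_pc_segmentation": "3.1.0",
    "sbb_binarization": "3.7.0",
}
by_version = group_by_version(h5py_conflict)
```

Grouped this way, a frontend could read the affected projects for a given version directly from one key instead of scanning all project entries.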