Project-MONAI / model-zoo

MONAI Model Zoo that hosts models in the MONAI Bundle format.
Apache License 2.0
190 stars 67 forks source link

Separate version and name from model_info.json #79

Closed SachidanandAlle closed 2 years ago

SachidanandAlle commented 2 years ago
"spleen_ct_segmentation_v0.1.0": {
        "checksum": "18bae0a62d90fb06b33c1c7702867a4d4d7b98f0",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/spleen_ct_segmentation_v0.1.0.zip"
    }

Instead of having version as part of name.. better to have something like following:

"spleen_ct_segmentation": {
        "version": "v0.1.0",
        "checksum": "18bae0a62d90fb06b33c1c7702867a4d4d7b98f0",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/spleen_ct_segmentation_v0.1.0.zip"
    }

This will enable users to consume latest version of bundles from name...

tangy5 commented 2 years ago

This is better! Will update this, and check compatibility to other MONAILABEL and bundle scripts.

tangy5 commented 2 years ago

Regarding the bundle version control. If we have the model_info.json as Sachi suggested

{
    "spleen_ct_segmentation": {
        "version": "v0.1.0",
        "checksum": "18bae0a62d90fb06b33c1c7702867a4d4d7b98f0",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/spleen_ct_segmentation_v0.1.0.zip"
    },
    "pancreas_ct_dints_segmentation": {
        "version": "v0.1.0",
        "checksum": "003926cdb603a97945012eec6aeeb65020942a65",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/pancreas_ct_dints_segmentation_v0.1.0.zip"
    },
    "brats_mri_segmentation": {
        "version": "v0.1.0",
        "checksum": "5da956c7425bf02f31e3a2cdf28693431fa76212",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/brats_mri_segmentation_v0.1.0.zip"
    },
    "spleen_deepedit_annotation": {
        "version": "v0.1.0",
        "checksum": "74c61a6dec808891acb9ec1df4e8305008409c97",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/spleen_deepedit_annotation_v0.1.0.zip"
    },
    "swin_unetr_btcv_segmentation": {
        "version": "v0.1.0",
        "checksum": "65eaf535f04e80c8d82c597449a60ef8c1622692",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/swin_unetr_btcv_segmentation_v0.1.0.zip"
    },
    "ventricular_short_axis_3label": {
        "version": "v0.1.0",
        "checksum": "8cd1adb99e00fde9adb99f8abadcc8448c1cdf9b",
        "source": "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/ventricular_short_axis_3label_v0.1.0.zip"
    }
}

The bundle app main.py needs a simple modification for downloading the correct version bundle ZIP files.

bundle_version = zoo_info[k]["version"]
k = k+'_'+bundle_version
download(name=k, bundle_dir=self.model_dir, source=zoo_source, repo=zoo_repo)

Then the bundle integration tests (test_infer.py and test_trainer.py) can be for example:

    def test_segmentation_spleen(self):
        if not torch.cuda.is_available():
            return

        model = "spleen_ct_segmentation"
        image = "spleen_8"

        response = requests.post(f"{SERVER_URI}/infer/{model}?image={image}")
        assert response.status_code == 200
    def test_segmentation_spleen_trainer(self):
        if not torch.cuda.is_available():
            return

        params = {
            "model": "spleen_ct_segmentation",
            "max_epochs": 1,
            "name": "net_test_spleen_trainer_01",
            "val_split": 0.5,
            "multi_gpu": False,
        }

        response = requests.post(f"{SERVER_URI}/train/spleen_ct_segmentation?run_sync=True", json=params)
        assert response.status_code == 200

If this is good, I will create the modification to above scripts and commit. @SachidanandAlle @Nic-Ma Thanks!

yiheng-wang-nv commented 2 years ago

Do you mean

    "spleen_ct_segmentation": {
        {
            "version": "v0.1.0",
            ...
        },
        {
            "version": "v0.2.0",
            ...
        },
        ...
    },
tangy5 commented 2 years ago

Do you mean

    "spleen_ct_segmentation": {
        {
            "version": "v0.1.0",
            ...
        },
        {
            "version": "v0.2.0",
            ...
        },
        ...
    },

This can also be added, if there are more than one version of bundle files corresponding to same bundle name, users have the option to choose the version.

Nic-Ma commented 2 years ago

Hi @yiheng-wang-nv ,

Your example is not correct, it should a list of dicts, not dict of dicts. And actually, I think the format of model_info.json doesn't affect the user experience of downloading a bundle, right? If really want to let users download model_infor.json, we should maintain this file in the releases instead of source code. The benefit is that we don't need to submit a separate PR to update model_info.json, the bad thing is that we can't track the file change any more. @wyli What do you think?

Thanks.

yiheng-wang-nv commented 2 years ago

Hi @Nic-Ma , I do not understand here why we need a list of dicts, since we need a dict to ensure that all keys (bundle name) are identical, and to quickly find a target bundle, using list may not be suitable.

Nic-Ma commented 2 years ago

Hi @yiheng-wang-nv ,

I think if you use dict of dicts, should use the version number as the key? For example:

"spleen_ct_segmentation": {
    "v0.1.0": {
        ...
    },
    "v0.2.0": {
        ...
    },
    ...
    },

And actually, my main question is: why users will care about this model_info.json file? I think it's only used by the CICD?

Thanks in advance.

SachidanandAlle commented 2 years ago

I suggest not to think of version management through this file.. lets keep it simple about this json file.. let this file represent only the current status of the model zoo.. lets not add too many things..

yiheng-wang-nv commented 2 years ago

I suggest not to think of version management through this file.. lets keep it simple about this json file.. let this file represent only the current status of the model zoo.. lets not add too many things..

Hi @SachidanandAlle , the initial purpose of this file is to track the changes of models. According to your suggestion, what do you think of adding a new file in release, which only contains the latest version of each bundle. You can use that one for further usage.

SachidanandAlle commented 2 years ago

I am fine with another file.. but what's more helpful is to get the current (latest) list of models available in the zoo including their path info etc.. (similar to dir command) without actually downloading/traversing them

Nic-Ma commented 2 years ago

Hi @wyli ,

We are discussing to put the model_info.json in the releases instead of the source code, for other platforms or users to download as the "manifest" of bundles in the zoo, so it should contain information of all bundles (not only the latest version). And we don't need to submit a PR anymore every time updated it. The risk may be: lack of tracking of the change in the file? What do you think?

Thanks.

wyli commented 2 years ago

yes I think it's good to maintain a model_info.json in this repo, we should keep all the relevant info in this git repository as much as possible to track any changes.

maybe this feature request is more about providing public APIs for the users to query available models and versions, it should be discussed separately...

Nic-Ma commented 2 years ago

Hi @wyli ,

I agree, we should provide the API to query all the available bundles in the model-zoo and maybe also all the versions of a specific bundle. Refer to the download API: https://github.com/Project-MONAI/MONAI/blob/dev/monai/bundle/scripts.py#L177

Thanks.

yiheng-wang-nv commented 2 years ago

This ticket is addressed in: https://github.com/Project-MONAI/MONAI/pull/4994