allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.61k stars 651 forks source link

Error with path_substitutions and model uploads/ verify_upload inconsistent #1028

Open john-zielke-snkeos opened 1 year ago

john-zielke-snkeos commented 1 year ago

Describe the bug

When uploading models to clearml, the saved url to clearml is the substituted url and not the originally configured url. This is not the expected behaviour as it prevents downloading the same artifacts from the UI or other agents that rely on a different path substitution. It seems that this happens because of the way StorageHelper.verify_upload() works. I see two issues here:

  1. The current setup uses verify_upload in multiple places to check the upload and then replaces the uploaded path with the path returned by verify_upload. This path includes the replacements. IMO, this should not be the case.
  2. According to the docstring of verify_upload, it returns a boolean, but instead it returns a path.
    def verify_upload(self, folder_uri='', raise_on_error=True, log_on_error=True):
        """
        Verify that this helper can upload files to a folder.

        An upload is possible iff:
            1. the destination folder is under the base uri of the url used to create the helper
            3. the helper has credentials to write to the destination folder

        :param folder_uri: The destination folder to test. Must be an absolute
            url that begins with the base uri of the url used to create the helper.
        :param raise_on_error: Raise an exception if an upload is not possible
        :param log_on_error: Log an error if an upload is not possible
        :return: True, if, and only if, an upload to folder_uri is possible. #! type is bool
        """

        folder_uri = self._canonize_url(folder_uri)

        folder_uri = self.conform_url(folder_uri, self._base_url)

        test_path = self._normalize_object_name(folder_uri)

        ...

        return folder_uri #! type is string

To reproduce

Upload a model with path substitutions in place.

Expected behaviour

It should log the original name without the substitution in ClearML

Environment

ainoam commented 1 year ago

Thanks for bringing this to our attention @john-zielke-snkeos.

We'll look into it.