DataBiosphere / terra-notebook-utils

Utilities for the Terra notebook environment.
MIT License
7 stars 6 forks source link

Recent change broke TNU DRS copy_batch API backward compatibility and makes copy_batch harder to use #370

Closed mbaumann-broad closed 2 years ago

mbaumann-broad commented 2 years ago

Until recently, the TNU DRS API copy_batch signature took a simple list of DRS URIs to copy to a given destination, as follows:

def copy_batch(drs_urls: Iterable[str],
               dst_pfx: str,
               workspace_name: Optional[str]=WORKSPACE_NAME,
               workspace_namespace: Optional[str]=WORKSPACE_GOOGLE_PROJECT):

and now it takes a list of dicts (e.g., JSON objects) with each dict/object having a DRS URI and a specific destination for that DRS URI. This is great in that it has more expressive power/control, yet it also breaks backward compatibility and is harder to use for users for common use cases

manifest_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "drs_uri": {"type": "string"},
            "dst": {"type": "string"},
        },
        "required": ["drs_uri", "dst"],
    },
}

def copy_batch(manifest: List[Dict[str, str]],
               workspace_name: Optional[str]=WORKSPACE_NAME,
               workspace_namespace: Optional[str]=WORKSPACE_GOOGLE_PROJECT):

Many Terra users don't have strong software skills, at least in Python (many are R language developers, and may be using TNU within R, etc.). Creating a list of dicts with specific fields simply to download a list of DRS URIs to a single, given destination seems too much to ask.

This was changed in TNU commit:
Use manifest for batch copy, introduce CLI arg #335.

I expect it is both possible and easy to change the TNU DRS API for copy_batch to support both the previous simple format and the new manifest format. That is the desired resolution.

mbaumann-broad commented 2 years ago

This was fixed in: https://github.com/DataBiosphere/terra-notebook-utils/pull/374

mbaumann-broad commented 2 years ago

This was fixed in: https://github.com/DataBiosphere/terra-notebook-utils/pull/374