tableau / server-client-python

A Python library for the Tableau Server REST API
https://tableau.github.io/server-client-python/
MIT License
656 stars 420 forks source link

Type 1: Ability to get items via the path #1317

Open VDFaller opened 10 months ago

VDFaller commented 10 months ago

Summary

Right now, to my knowledge, we have to get items via the filter. I'd like to just get_project("/path/to/project")

Description

I have my own created already but figured I'd make sure this is something that might be of value for others. It might need some clean up in order to fit y'alls standards but like I do something like this.

class myTSC(TSC.Server):
    """A subclass of the tableau server client that adds some functionality"""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def get_project(self, path:Union[PurePath, str], parent_id=None, has_cxn:bool=False)->TSC.ProjectItem:
        """ gets a project via a path
        ARGS:
            path - can either be a pathlib.PurePath or a string of the fully scoped path to the project
                '/A/B/C' (with or without the root is fine, it will assume root if parent_id is not given)
            parent_id - if you already know the parent id, you can pass it in
            has_cxn - if you already have a connection (used by the wrapper)
        RETURNS:
            TSC.ProjectItem - so to get the project id you'd do project.id

        EXAMPLE:
            > get_project('/path/to/my/project')

            > get_project(Path('/path/to/my/project'))
        """
        assert isinstance(path, (PurePath, str)), 'path must be Path or string'

        parts = Path(path).parts
        # we should warn
        if parent_id is None:
            if not (str(path).startswith('/') or str(path).startswith('\\')):
                warn(f"When 'parent_id' is None, 'path' should start with '/' for '{path}'")
        elif str(path).startswith('/') or str(path).startswith('\\'):
            raise ValueError('parent_id was given but path looks to be a root')
        else:
            assert isinstance(parent_id, str), "parent_id must be None or string"

        if parts[0] in  ('\\', '/'): # let people put in a root if they want
            parts = parts[1:]

        for pp in parts: ## just for through the path parts and go deeper and deeper and check that it exists
            project_found = False
            for p in TSC.Pager(self.projects):
                if p.name == pp and p.parent_id == parent_id:
                    project = p
                    parent_id = project.id # setting the parent ID for the next project level
                    project_found = True
                    logger.debug(f"Found Project {pp}")
                    break
            if not project_found: # fail if it doesn't find one
                raise EndpointUnavailableError('Project "'+'/'.join(parts[:parts.index(pp)+1])+'" Not Found')

        return project

new_server = myTSC(server.server_address, use_server_version=True)
with new_server.auth.sign_in(tableau_auth):
    project = new_server.get_project(Path('/path/to/my/project'))

Use case

I often have to pull very specific data sources and manipulate them in python. Then push them back up to tableau somewhere else.
This makes the configs for those projects much easier. If there's an easier way to do this I'd also be interested in that.

jorwoods commented 10 months ago

@VDFaller interesting idea. Is the path you’re referring to a local path, or a tableau server url path?

VDFaller commented 10 months ago

@jorwoods Sorry I missed this. I mean the path to the workbook or whatever via explore.
So project/subproject/subproject/workbook/view because that's how people that don't know the api know how to get to their items. I also use this to know where to publish workbooks via the actual local path. So a little of both?

jorwoods commented 10 months ago

Ok, I got confused reading the code because my understanding is that pathlib.Path objects are for local file paths. So it sounds like you want to take the project structure from Tableau Server, and determine a local file path to download the files to. (Or conversely, you have the items locally in that folder structure, and want to determine where on the site to publish them to.) The IDs that you see in the URL are not necessarily the IDs or names from the REST API. The crosswalk that I know of means using the metadata API. But as you noticed, that is not necessarily a trivial problem.

Project names are also not guaranteed to be unique to a site. Only unique within their level of hierarchy. So filtering by name on projects can get precarious.

In my own applications, I have implemented logic to check if it is a URL, pass it through urlparse, use a bit of regex with named capture groups to construct a graphQL query based on what I found.

Something like

import re
from urllib.parse import urlparse

def url_to_parts(url: str) -> dict[str, str] | None:
    pattern = r"\/site\/(?P<site_name>.*?)\/(?P<object_type>.*?)s\/(?P<viz_portal_id>.*)"
    if (match := re.match(pattern, urlparse(url).fragment)):
        return match.groupdict()

Projects get a little weird as well because they don't have a top level representation in the metadata repo. You have to query for an item that exists within them. Which means you would also

In my contributions to TSC, I have always left them as the more pure implementation of the REST API logic and translation and not necessarily any logic of what to do with that API.

VDFaller commented 10 months ago

Not exactly. I'm using pathlib because this is pathlike, which pathlib does great with. I showed get project because that's the one that is most difficult. Once I get the project I just filter by name then check if the parent project id is correct when getting workbook or view or whatever. It's not great but using the URL as you said isn't full proof.

I'm really querying the server for an object that exists at /path/to/project/workbook/view or something similar.

@jorwoods, does this help clarify? I can add more info to my desc if that's helpful. I'm also happy to put in a merge request.