microsoft / semantic-link-labs

Early access to new features for Microsoft Fabric's Semantic Link.
MIT License

Add OData filter to list_workspaces function #204

Closed by AndreaTodaro 3 weeks ago

AndreaTodaro commented 1 month ago

Updated list_workspaces to accept an OData filter parameter:

import pandas as pd
import sempy.fabric as fabric
from sempy.fabric.exceptions import FabricHTTPException
from typing import Optional

def list_workspaces(
    top: Optional[int] = 5000, 
    skip: Optional[int] = None,
    filter: Optional[str] = None
) -> pd.DataFrame:
    """
    Lists workspaces for the organization. This function is the admin version of list_workspaces.

    Parameters
    ----------
    top : int, default=5000
        Returns only the first n results. This parameter is mandatory and must be in the range of 1-5000.
    skip : int, default=None
        Skips the first n results. Use with top to fetch results beyond the first 5000.
    filter : str, default=None
        A filter string to apply to the query. Should be in OData filter syntax.

    Returns
    -------
    pandas.DataFrame
        A pandas DataFrame showing a list of workspaces for the organization.
    """

    df = pd.DataFrame(
        columns=[
            "Id",
            "Is Read Only",
            "Is On Dedicated Capacity",
            "Type",
            "Name",
            "Capacity Id",
            "Default Dataset Storage Format",
            "Pipeline Id",
            "Has Workspace Level Settings",
            "State",  # populated in the loop below, so declare it here as well
        ]
    )

    url = f"/v1.0/myorg/admin/groups?$top={top}"
    if skip is not None:
        url = f"{url}&$skip={skip}"
    if filter is not None:
        url = f"{url}&$filter={filter}"

    client = fabric.PowerBIRestClient()
    response = client.get(url)

    if response.status_code != 200:
        raise FabricHTTPException(response)

    for v in response.json().get("value", []):
        capacity_id = v.get("capacityId")
        if capacity_id:
            capacity_id = capacity_id.lower()
        new_data = {
            "Id": v.get("id"),
            "Is Read Only": v.get("isReadOnly"),
            "Is On Dedicated Capacity": v.get("isOnDedicatedCapacity"),
            "Capacity Id": capacity_id,
            "Default Dataset Storage Format": v.get("defaultDatasetStorageFormat"),
            "Type": v.get("type"),
            "Name": v.get("name"),
            "State": v.get("state"),
            "Pipeline Id": v.get("pipelineId"),
            "Has Workspace Level Settings": v.get("hasWorkspaceLevelSettings"),
        }
        df = pd.concat([df, pd.DataFrame(new_data, index=[0])], ignore_index=True)

    bool_cols = [
        "Is Read Only",
        "Is On Dedicated Capacity",
        "Has Workspace Level Settings",
    ]
    df[bool_cols] = df[bool_cols].astype(bool)

    return df
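
The docstring says skip can be combined with top to read past the first 5000 workspaces. A minimal sketch of that paging loop follows. It is not part of the proposed function: fetch_all and fetch_page are hypothetical names, and fetch_page stands in for any callable that returns one page of rows for a given top and skip (for example, a thin wrapper around list_workspaces). The sketch also assumes that a page shorter than the requested size means there is nothing left to fetch.

```python
from typing import Callable, Dict, List


def fetch_all(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 5000,
) -> List[Dict]:
    """Collect every row by calling fetch_page(top, skip) until a short page.

    fetch_page is a hypothetical callable, not an API of any library.
    """
    rows: List[Dict] = []
    skip = 0
    while True:
        page = fetch_page(page_size, skip)
        rows.extend(page)
        # Assumption: fewer rows than requested means this was the last page.
        if len(page) < page_size:
            break
        skip += page_size
    return rows


# Stand-in data source used only to show the call shape.
fake_rows = [{"id": i} for i in range(12)]
all_rows = fetch_all(lambda top, skip: fake_rows[skip:skip + top], page_size=5)
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page and then stops.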

Usage: call the function with a filter in OData syntax, for example:

df = list_workspaces(top=1000, filter="(startswith(name,'DEV_') or startswith(name,'TST_') or startswith(name,'PRD_'))")
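
The function above interpolates the filter string into the URL as-is. The HTTP client may percent-encode spaces and quotes on its own, but a filter value containing a character such as & or # would otherwise change the meaning of the query string. Here is a minimal sketch of building the same URL with the filter encoded explicitly. The helper name build_admin_groups_url and the parameter name odata_filter are hypothetical and not part of semantic-link-labs.

```python
from typing import Optional
from urllib.parse import quote


def build_admin_groups_url(
    top: int = 5000,
    skip: Optional[int] = None,
    odata_filter: Optional[str] = None,
) -> str:
    """Build the admin groups URL, percent-encoding the OData filter value."""
    url = f"/v1.0/myorg/admin/groups?$top={top}"
    if skip is not None:
        url += f"&$skip={skip}"
    if odata_filter is not None:
        # safe="" encodes every reserved character, including & and #.
        url += f"&$filter={quote(odata_filter, safe='')}"
    return url


url = build_admin_groups_url(top=1000, odata_filter="startswith(name,'DEV_')")
```

Only the filter value is encoded; the $top, $skip and $filter keys are left unchanged so that the endpoint sees the same parameter names as before.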

m-kovalsky commented 1 month ago

This parameter has been added to the function in PR #207. It will be available in the next release.

m-kovalsky commented 3 weeks ago

See 0.8.4.