microsoft / semantic-link-labs

Early access to new features for Microsoft Fabric's Semantic Link.
MIT License

run_model_bpa_bulk error #110

Closed Koushikrishnan closed 4 days ago

Koushikrishnan commented 2 weeks ago

Describe the bug: Hello, I have an issue with using 'run_model_bpa_bulk'. We have a number of different workspaces and datasets and would like to run the BPA rules across all of them. Some of the rules run fine without issues, but many of them fail when trying to access the TOM for the individual dataset.

This is my code:

image

This works fine (note that there is no "tom" reference here)

image

But when I want to use a rule using "tom" reference, I use the "connect_semantic_model" library reference. Here is an example:

image

This is the error:

image

I've tried a few other approaches, such as using 'run_model_bpa' instead of the '_bulk' version, which does work for the rules that reference "tom". All rules that do not reference "tom" also work fine with bulk. The problem only appears when "run_model_bpa_bulk" is combined with rules that reference "tom", and I can't find a way around it.

Could you please help me with this and explain what I am doing wrong here? Thank you for your assistance and for this excellent resource that helps us make our code better!

m-kovalsky commented 2 weeks ago

In line 12, when you call the connect_semantic_model function, you are calling: dataset=dataset, workspace=workspace. Have you defined those parameters? I don't see them in the script. You can specify any model in any workspace for that - it doesn't matter which one. Also, you need to uncomment the import on line 8 (only need to run it once per session).

Koushikrishnan commented 1 week ago

Hello @m-kovalsky, thanks for the response. Yes, I have defined the dataset and workspace parameters. My first screenshot on the ticket shows them; they are in different cells. The problem is that the "bulk" method doesn't seem to run for any rule that uses "tom" (the lowercase one). Understood about uncommenting line 8; I will remember to run it once per session.

This is my full code cell now (with dataset and workspace included in the same cell). It still gives the same error, though. Could you help me identify what I am doing wrong here?

image

The error (same as before)

image

m-kovalsky commented 1 week ago

In line 15 of your code, dataset and workspace are string parameters. You need to enter a single dataset and workspace name. For example:

dataset='AdvWorks', workspace='My new workspace'


Koushikrishnan commented 1 week ago

Got it. But how do we use the bulk run then, if we need to scan all datasets in a single workspace (or across all workspaces)?

Koushikrishnan commented 1 week ago

Hello @m-kovalsky, any update on the question above? We were able to run almost every rule in the bulk run, apart from the ones that use "tom". Thank you!

m-kovalsky commented 1 week ago

The dataset/workspace used for initiating the rules does not matter. It can be any dataset/workspace. It is not relevant when actually running the BPA. It is simply used to initiate the rules. When run_model_bpa/bulk is executed, it runs against the relevant semantic model(s), not the one specified for the rules.
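The idea that the rule table is built once and then evaluated against each model in turn can be sketched without a Fabric session. This is only an illustration: `FakeTable`, `FakeModel`, and `run_rules_bulk` are hypothetical stand-ins written for this example and are not part of sempy_labs, and the rule rows are plain tuples rather than a pandas DataFrame.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FakeTable:
    # Hypothetical stand-in for a TOM table object; only Name is modeled.
    Name: str


@dataclass
class FakeModel:
    # Hypothetical stand-in for a semantic model.
    name: str
    tables: List[FakeTable] = field(default_factory=list)


# (Category, Scope, Severity, Rule Name, Expression)
Rule = Tuple[str, str, str, str, Callable[[FakeTable], bool]]

# The rules are defined once, independently of any particular model.
RULES: List[Rule] = [
    (
        "Naming",
        "Table",
        "Warning",
        "No spaces in table names",
        lambda obj: " " in obj.Name,
    ),
]


def run_rules_bulk(models: List[FakeModel], rules: List[Rule]):
    """Evaluate every rule against every table of every model."""
    findings = []
    for model in models:
        for _category, scope, _severity, rule_name, expression in rules:
            if scope != "Table":
                continue  # this sketch only handles table-scoped rules
            for table in model.tables:
                if expression(table):
                    findings.append((model.name, table.Name, rule_name))
    return findings


models = [
    FakeModel("Sales", [FakeTable("Fact Sales"), FakeTable("DimDate")]),
    FakeModel("Finance", [FakeTable("Ledger")]),
]
findings = run_rules_bulk(models, RULES)
print(findings)
```

The same `RULES` list is reused for both models; which model the rules were written against plays no part in the loop.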

Koushikrishnan commented 1 week ago

@m-kovalsky thank you for clearing that! That works now. One other question though for this rule below. There seems to be something wrong here.

image

I've tried to debug this and identified that something is wrong with line 21 and the statement "tom.is_calculated_table(table_name=obj.Name)". If I remove it and run the remaining it runs fine.

image

The same thing happens with the "tom.is_field_parameter(table_name=obj.Name)" statement as well. Could you help point out the error here?

m-kovalsky commented 1 week ago

I found the issue and made a fix. This will be available in the next release. You will no longer need to enter a dataset/workspace. Use this as a template in the next version of semantic link labs.

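import sempy
# Initialize the Analysis Services libraries so the TOM namespace can be imported
sempy.fabric._client._utils._init_analysis_services()
import Microsoft.AnalysisServices.Tabular as TOM
import sempy_labs as labs  # needed for the labs.run_model_bpa_bulk call below
from sempy_labs.tom import connect_semantic_model
import pandas as pd

# Fill in the workspace to scan
workspace_name = ''

# One tuple per rule; the order matches the columns listed below
rules = pd.DataFrame(
    [
        (
            "Performance",
            "Table",
            "Warning",
            "Rule name...",
            # The expression receives both the object and the tom wrapper
            lambda obj, tom: tom.is_calculated_table(table_name=obj.Name),
            'Rule description...',
            '',
        )
    ],
    columns=[
        "Category",
        "Scope",
        "Severity",
        "Rule Name",
        "Expression",
        "Description",
        "URL",
    ],
)

labs.run_model_bpa_bulk(workspace=workspace_name, rules=rules)

Koushikrishnan commented 1 week ago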

Sounds great, thank you! Any tentative date when the next release will be available?

Koushikrishnan commented 1 week ago

@m-kovalsky Also, on a side note, I just started noticing this new error today. I don't remember seeing it earlier. Hopefully it will disappear after the next release.

image

m-kovalsky commented 4 days ago

0.7.3 is now available. See the example in the Model Optimization notebook for running BPA using custom rules.
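For readers without a Fabric session, the two-argument rule shape from the template posted earlier in this thread (`lambda obj, tom: ...`) can be tried out with stand-in objects. This is a sketch under assumptions: `FakeTable`, `FakeTom`, and `evaluate` are hypothetical names invented for this example, and the example only assumes that the runner passes a per-model wrapper as the second argument. It does not reproduce how sempy_labs evaluates rules internally.

```python
class FakeTable:
    """Hypothetical stand-in for a TOM table object."""

    def __init__(self, name, is_calculated=False):
        self.Name = name
        self.is_calculated = is_calculated


class FakeTom:
    """Hypothetical per-model wrapper exposing one helper used in this thread."""

    def __init__(self, model_name, tables):
        self.model_name = model_name
        self._tables = {t.Name: t for t in tables}

    def all_tables(self):
        return list(self._tables.values())

    def is_calculated_table(self, table_name):
        # Resolved against this wrapper's own model only.
        return self._tables[table_name].is_calculated


# Same shape as the rule expression in the template: (obj, tom) -> bool
rule = lambda obj, tom: tom.is_calculated_table(table_name=obj.Name)


def evaluate(rule, toms):
    """Apply one rule to every table of every model, passing that model's wrapper."""
    hits = {}
    for tom in toms:
        hits[tom.model_name] = [t.Name for t in tom.all_tables() if rule(t, tom)]
    return hits


# Two models contain a table with the same name but different definitions.
tom_a = FakeTom("ModelA", [FakeTable("Calendar", is_calculated=True)])
tom_b = FakeTom("ModelB", [FakeTable("Calendar", is_calculated=False)])

hits = evaluate(rule, [tom_a, tom_b])
print(hits)
```

Because the wrapper is supplied at evaluation time, the helper is answered by the model being scanned, so the rule itself does not need to name any dataset or workspace.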