microsoft / semantic-link-labs

Early access to new features for Microsoft Fabric's Semantic Link.
MIT License

run_model_bpa Running Forever and Not Completing for Certain Datasets #274

Closed Jai-Prakash-HU closed 1 week ago

Jai-Prakash-HU commented 1 week ago

run_model_bpa has been running for many hours without completing. Please check the scenarios below.

  1. Tried multiple times; this particular dataset never completes.
  2. Tried with both run_model_bpa and run_model_bpa_bulk (see the sketch after this list).
  3. Published the dataset to a different workspace; same result.
  4. The dataset is very simple, about 2 MB in size. Large semantic model storage format is disabled, sensitivity is internal, and there are only 2 tables plus a few calculated tables.
  5. Tried 7-8 times on different days with the same result.
  6. It doesn't throw any error; it just keeps running.
  7. The same function runs fine against other datasets in the same connection instance.
  8. I can download the dataset and it looks fine. I published it to a different workspace under a different name, but the issue is the same.
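
For reference, this is roughly how the calls were made (a sketch with placeholder dataset/workspace names, assuming a Fabric notebook with semantic-link-labs installed):

import sempy_labs as labs

# Single-model run; hangs indefinitely for the affected dataset.
labs.run_model_bpa(dataset="MyDataset", workspace="MyWorkspace")

# The bulk variant shows the same behaviour.
labs.run_model_bpa_bulk(workspace="MyWorkspace")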

m-kovalsky commented 1 week ago

Is the model processed? Can you run labs.get_model_calc_dependencies() against that semantic model? And can you connect using the TOM wrapper:

from sempy_labs.tom import connect_semantic_model

with connect_semantic_model(dataset='', workspace='') as tom:
    ...
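
A slightly fuller sketch of both checks (placeholder names; assumes a Fabric notebook with semantic-link-labs installed):

import sempy_labs as labs
from sempy_labs.tom import connect_semantic_model

# 1) Confirm the calc-dependency scan completes for the model.
dependencies = labs.get_model_calc_dependencies(dataset="MyDataset", workspace="MyWorkspace")
display(dependencies)

# 2) Confirm the TOM wrapper can connect and enumerate objects.
with connect_semantic_model(dataset="MyDataset", workspace="MyWorkspace", readonly=True) as tom:
    for t in tom.model.Tables:
        print(t.Name, t.Columns.Count)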



Jai-Prakash-HU commented 1 week ago

I think there is a problem with the labs.get_model_calc_dependencies function. For the dataset where run_model_bpa never returns, get_model_calc_dependencies never returns anything either; it also just keeps running.

Is run_model_bpa dependent on get_model_calc_dependencies in some scenarios, and is that why run_model_bpa runs forever in those scenarios?

I made a change: I deleted one table, and then get_model_calc_dependencies returned a result within a few seconds.

The table I deleted was just a dummy table used to hold measures; its query creates a single text column from compressed JSON and then removes it again, leaving an empty table. Its query was as below.

let
    Quelle = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i44FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Spalte ""1""" = _t]),
    #"Geänderter Typ" = Table.TransformColumnTypes(Quelle, {{"Spalte ""1""", type text}}),
    #"Entfernte Spalten" = Table.RemoveColumns(#"Geänderter Typ", {"Spalte ""1"""})
in
    #"Entfernte Spalten"

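As an aside, here is a hedged sketch of how such measure-only "dummy" tables could be listed with the TOM wrapper (placeholder names, not part of the original report):

from sempy_labs.tom import connect_semantic_model

with connect_semantic_model(dataset="MyDataset", workspace="MyWorkspace", readonly=True) as tom:
    for t in tom.model.Tables:
        # Count only data columns; TOM adds an internal RowNumber column to every table.
        data_columns = [c for c in t.Columns if str(c.Type) != "RowNumber"]
        if len(data_columns) == 0 and t.Measures.Count > 0:
            print(f"Measure-only table: {t.Name}")
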
m-kovalsky commented 1 week ago

Yes, run_model_bpa depends on get_model_calc_dependencies. Every time you call run_model_bpa, get_model_calc_dependencies is run in the background in order to evaluate specific rules. Can you run this DMV in DAX Studio against the original model?

SELECT * FROM $SYSTEM.DISCOVER_CALC_DEPENDENCY
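
If DAX Studio is not at hand, the same DMV can typically also be run from a notebook through sempy's DAX endpoint (a sketch with placeholder names):

import sempy.fabric as fabric

result = fabric.evaluate_dax(
    dataset="MyDataset",
    dax_string="SELECT * FROM $SYSTEM.DISCOVER_CALC_DEPENDENCY",
    workspace="MyWorkspace",
)
display(result)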

Jai-Prakash-HU commented 1 week ago

Update: earlier it was running forever; now it is throwing an error.

  1. SELECT * FROM $SYSTEM.DISCOVER_CALC_DEPENDENCY: works fine; it returns results within seconds.
  2. labs.get_model_calc_dependencies(dataset = 'XXXXXXXXXXXXXXXXX', workspace = 'XXXXXXXXXXXXXXXX'): works fine; it returns results within seconds.
  3. labs.run_model_bpa(dataset = 'XXXXXXXXXXXXXXXXXXXXX', workspace = 'XXXXXXXXXXXXXXXXXXXX'): throws an error.

Please check the error below:

   816             "Naming Conventions",
   817             ["Table", "Column", "Measure", "Partition", "Hierarchy"],
   818             "Warning",
   819             "Object names must not contain special characters",
   820             lambda obj, tom: re.search(r"[\t\r\n]", obj.Name),
   821             "Object names should not include tabs, line breaks, etc.",
   822         ),
   823     ],
   824     columns=[
   825         "Category",
   826         "Scope",
   827         "Severity",
   828         "Rule Name",
   829         "Expression",
   830         "Description",
   831         "URL",
   832     ],
   833 )
   835 return rules

File ~/cluster-env/clonedenv/lib/python3.11/site-packages/sempy_labs/tom/_model.py:2281, in TOMWrapper.is_field_parameter(self, table_name)
   2276 import Microsoft.AnalysisServices.Tabular as TOM
   2278 t = self.model.Tables[table_name]
   2280 return (
-> 2281     self.is_field_parameter(table_name=table_name)
   2282     and t.Columns.Count == 4
   2283     and any(
   2284         "NAMEOF(" in p.Source.Expression.replace(" ", "") for p in t.Partitions
   2285     )
   2286     and all(
   2287         "[Value" in c.SourceColumn
   2288         for c in t.Columns
   2289         if c.Type == TOM.ColumnType.Data
   2290     )
   2291     and any(
   2292         ep.Name == "ParameterMetadata"
   2293         for c in t.Columns
   2294         for ep in c.ExtendedProperties
   2295     )
   2296 )

[... skipping similar frames: TOMWrapper.is_field_parameter at line 2281 (2967 times)]

RecursionError: maximum recursion depth exceeded
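
As an illustration (not the library's actual code), the pattern visible in the frame above is a predicate whose first operand calls the predicate itself, so the remaining checks are never reached and Python eventually gives up:

def is_field_parameter(table_name: str) -> bool:
    # Unconditional self-call as the first operand: every call recurses one level deeper.
    return is_field_parameter(table_name) and table_name.endswith("Parameter")

try:
    is_field_parameter("MyTable")
except RecursionError as err:
    print(err)  # maximum recursion depth exceeded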

m-kovalsky commented 1 week ago

Ah, my mistake. I will fix this in a quick release. Thanks for raising the issue.

m-kovalsky commented 1 week ago

Fixed in 0.8.6.
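
For reference, one way to pick up the fix in a Fabric notebook (the PyPI package is semantic-link-labs; pin to the fixed version or later):

%pip install -U "semantic-link-labs>=0.8.6"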