microsoft / semantic-link-labs

Early access to new features for Microsoft Fabric's Semantic Link.
MIT License

Error getting extended information for lakehouse tables #86

Open MASFelixPBI opened 1 month ago

MASFelixPBI commented 1 month ago

What are you trying to achieve? Trying to get the extended information for the lakehouse tables, including file and table size information.

What have you tried so far? Running the following code gives the error below:

labs.lakehouse.get_lakehouse_tables (lakehouse=None, workspace=None, extended=True)

Without extended=True I'm able to get the table information.

IndexError                                Traceback (most recent call last)
Cell In[16], line 1
----> 1 labs.lakehouse.get_lakehouse_tables (lakehouse=None, workspace=None, extended=True)

File /nfs4/pyenv-f3d326fd-abb6-461d-9a99-e294855d2925/lib/python3.10/site-packages/sempy/_utils/_log.py:273, in mds_log.<locals>.get_wrapper.<locals>.log_decorator_wrapper(*args, **kwargs)
    270         raise
    272 try:
--> 273     result = func(*args, **kwargs)
    275     # The invocation for get_message_dict moves after the function
    276     # so it can access the state after the method call
    277     message.update(extractor.get_completion_message_dict(result, arg_dict))

File /nfs4/pyenv-f3d326fd-abb6-461d-9a99-e294855d2925/lib/python3.10/site-packages/sempy_labs/lakehouse/_get_lakehouse_tables.py:97, in get_lakehouse_tables(lakehouse, workspace, extended, count_rows, export)
     95         df = pd.concat([df, pd.DataFrame(new_data, index=[0])], ignore_index=True)
     96 else:
---> 97     sku_value = get_sku_size(workspace)
     98     guardrail = get_directlake_guardrails_for_sku(sku_value)
    100     spark = SparkSession.builder.getOrCreate()

File /nfs4/pyenv-f3d326fd-abb6-461d-9a99-e294855d2925/lib/python3.10/site-packages/sempy_labs/directlake/_guardrails.py:58, in get_sku_size(workspace)
     51 dfC.rename(columns={"Id": "Capacity Id"}, inplace=True)
     52 dfCW = pd.merge(
     53     dfW,
     54     dfC[["Capacity Id", "Sku", "Region", "State"]],
     55     on="Capacity Id",
     56     how="inner",
     57 )
---> 58 sku_value = dfCW.loc[dfCW["Name"] == workspace, "Sku"].iloc[0]
     60 return sku_value

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/indexing.py:1103, in _LocationIndexer.__getitem__(self, key)
   1100 axis = self.axis or 0
   1102 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1103 return self._getitem_axis(maybe_callable, axis=axis)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/indexing.py:1656, in _iLocIndexer._getitem_axis(self, key, axis)
   1653     raise TypeError("Cannot index by location index with a non-integer key")
   1655 # validate the location
-> 1656 self._validate_integer(key, axis)
   1658 return self.obj._ixs(key, axis=axis)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/indexing.py:1589, in _iLocIndexer._validate_integer(self, key, axis)
   1587 len_axis = len(self.obj._get_axis(axis))
   1588 if key >= len_axis or key < -len_axis:
-> 1589     raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds
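
For reference, the last frames suggest that the filter on dfCW["Name"] matched no rows, so .iloc[0] was applied to an empty selection. The traceback does not show why nothing matched (for example, a workspace name that was still None, or a workspace dropped by the inner merge on "Capacity Id"). Below is a minimal sketch of that failure mode using plain pandas. The sample data and the lookup_sku helper are hypothetical and are not part of sempy_labs.

```python
import pandas as pd

# Hypothetical stand-in for the merged workspace/capacity frame (dfCW in the traceback).
dfCW = pd.DataFrame(
    {
        "Name": ["Sales", "Finance"],
        "Capacity Id": ["c1", "c2"],
        "Sku": ["F64", "F2"],
    }
)


def lookup_sku(frame, workspace):
    """Hypothetical guarded lookup: return the SKU, or None when no row matches."""
    matches = frame.loc[frame["Name"] == workspace, "Sku"]
    if matches.empty:
        return None
    return matches.iloc[0]


# Same access pattern as line 58 of the traceback, with a name that matches nothing.
workspace = None
try:
    dfCW.loc[dfCW["Name"] == workspace, "Sku"].iloc[0]
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(lookup_sku(dfCW, "Sales"))
print(lookup_sku(dfCW, workspace))
```

Checking whether the selection is empty before calling .iloc[0] turns the opaque IndexError into a case the caller can handle or report with a clearer message.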

m-kovalsky commented 2 weeks ago

Do you still get this issue in 0.7.3?
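
One way to confirm which version is installed in the notebook session before retesting is sketched below. It assumes the package is installed under the distribution name "semantic-link-labs"; the installed_version helper is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumed distribution name; adjust if the package is installed under another name.
print(installed_version("semantic-link-labs"))
```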