Closed nisharncslabs closed 3 years ago
Can you post your complete code example that fails?
Sure, so I have 2 files in my editor (ssas_api.py copy exactly same as in repo)
These are both shown down below.
My main question is: I am not too sure what additional things I need to do along with having these files in order to run/get the DAX query data successfully from a particular AAS server.
root
.import System
command and subsequent ones below are greyed out as if they arent found. Do I need to pip install
something inorder for these to work?File 1: QuickStart
import ssas_api
import clr
'''
# Python Snippet showing where the required dll files are located
base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0" # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"
'''
#_load_assemblies(amo_path=amo_path, adomd_path=adomd_path)
conn = ssas_api.set_conn_string(
server='<aas_servername>',
db_name='<db_name>',
username='<email>',
password='<password>'
)
dax_string = '''
MY DAX QUERY
'''
df = ssas_api.get_DAX(connection_string=conn, dax_string=dax_string)
File 2: ssas_api.py script for AAS connection
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 20 16:59:43 2017
@author: Yehoshua
"""
import pandas as pd
import numpy as np
from functools import wraps
from pathlib import Path
import logging
import warnings
logger = logging.getLogger(__name__)
try:
import clr # name for pythonnet
except ImportError:
msg = """
Could not import 'clr', install the 'pythonnet' library.
For conda, `conda install -c pythonnet pythonnet`
"""
raise ImportError(msg)
def _load_assemblies(amo_path=None, adomd_path=None):
"""
Loads required assemblies, called after function definition.
Might need to install SSAS client libraries:
https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-data-providers
Parameters
----------
amo_path : str, default None
The full path to the DLL file of the assembly for AMO.
Should end with '**Microsoft.AnalysisServices.Tabular.dll**'
Example: C:/my/path/to/Microsoft.AnalysisServices.Tabular.dll
If None, will use the default location on Windows.
adomd_path : str, default None
The full path to the DLL file of the assembly for ADOMD.
Should end with '**Microsoft.AnalysisServices.AdomdClient.dll**'
Example: C:/my/path/to/Microsoft.AnalysisServices.AdomdClient.dll
If None, will use the default location on Windows.
"""
# Full path of .dll files
root = Path(r"C:\Windows\Microsoft.NET\assembly\GAC_MSIL")
# get latest version of libraries if multiple libraries are installed (max func)
if amo_path is None:
amo_path = str(
max((root / "Microsoft.AnalysisServices.Tabular").iterdir())
/ "Microsoft.AnalysisServices.Tabular.dll"
)
if adomd_path is None:
adomd_path = str(
max((root / "Microsoft.AnalysisServices.AdomdClient").iterdir())
/ "Microsoft.AnalysisServices.AdomdClient.dll"
)
# load .Net assemblies
logger.info("Loading .Net assemblies...")
clr.AddReference("System")
clr.AddReference("System.Data")
clr.AddReference(amo_path)
clr.AddReference(adomd_path)
# Only after loaded .Net assemblies
global System, DataTable, AMO, ADOMD
**import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as AMO
import Microsoft.AnalysisServices.AdomdClient as ADOMD**
logger.info("Successfully loaded these .Net assemblies: ")
for a in clr.ListAssemblies(True):
logger.info(a.split(",")[ 0 ])
def _assert_dotnet_loaded(func):
"""
Wrapper to make sure that required .NET assemblies have been loaded and imported.
Can pass the keyword arguments 'amo_path' and 'adomd_path' to any annotated function,
it will use them in the `_load_assemblies` function.
Example:
.. code-block:: python
import ssas_api
conn = ssas_api.set_conn_string(
's', 'd', 'u', 'p',
amo_path='C:/path/number/one',
adomd_path='C:/path/number/two'
)
"""
@wraps(func)
def wrapper(*args, **kwargs):
amo_path = kwargs.pop("amo_path", None)
adomd_path = kwargs.pop("adomd_path", None)
try:
type(DataTable)
except NameError:
# .NET assemblies not loaded/imported
logger.warning(".Net assemblies not loaded and imported, doing so now...")
_load_assemblies(amo_path=amo_path, adomd_path=adomd_path)
return func(*args, **kwargs)
return wrapper
@_assert_dotnet_loaded
def set_conn_string(server, db_name, username, password):
"""
Sets connection string to SSAS database,
in this case designed for Azure Analysis Services
"""
conn_string = (
"Provider=MSOLAP;Data Source={};Initial Catalog={};User ID={};"
"Password={};Persist Security Info=True;Impersonation Level=Impersonate".format(
server, db_name, username, password
)
)
return conn_string
@_assert_dotnet_loaded
def get_DAX(connection_string, dax_string):
"""
Executes DAX query and returns the results as a pandas DataFrame
Parameters
---------------
connection_string : string
Valid SSAS connection string, use the set_conn_string() method to set
dax_string : string
Valid DAX query, beginning with EVALUATE or VAR or DEFINE
Returns
----------------
pandas DataFrame with the results
"""
table = _get_DAX(connection_string, dax_string)
df = _parse_DAX_result(table)
return df
def _get_DAX(connection_string, dax_string) -> "DataTable":
dataadapter = ADOMD.AdomdDataAdapter(dax_string, connection_string)
table = DataTable()
logger.info("Getting DAX query...")
dataadapter.Fill(table)
logger.info("DAX query successfully retrieved")
return table
def _parse_DAX_result(table: "DataTable") -> pd.DataFrame:
cols = [ c for c in table.Columns.List ]
rows = [ ]
# much better performance to just access data by position instead of name
# and then add column names afterwards
for r in range(table.Rows.Count):
row = [ table.Rows[ r ][ c ] for c in cols ]
rows.append(row)
df = pd.DataFrame.from_records(rows, columns=[ c.ColumnName for c in cols ])
# replace System.DBNull with None
# df.replace({System.DBNull: np.NaN}) doesn't work for some reason
df = df.applymap(lambda x: np.NaN if isinstance(x, System.DBNull) else x)
# convert datetimes
dt_types = [ c.ColumnName for c in cols if c.DataType.FullName == "System.DateTime" ]
if dt_types:
for dtt in dt_types:
# if all nulls, then pd.to_datetime will fail
if not df.loc[ :, dtt ].isna().all():
# https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-date-and-time-format-strings#Sortable
ser = df.loc[ :, dtt ].map(lambda x: x.ToString('s'))
df.loc[ :, dtt ] = pd.to_datetime(ser)
# convert other types
types_map = {"System.Int64": int, "System.Double": float, "System.String": str}
col_types = {c.ColumnName: types_map.get(c.DataType.FullName, "object") for c in cols}
# handle NaNs (which are floats, as of pandas v.0.25.3) in int columns
col_types_ints = {k for k, v in col_types.items() if v == int}
ser = df.isna().any(axis=0)
col_types.update({k: float for k in set(ser[ ser ].index).intersection(col_types_ints)})
# convert
df = df.astype(col_types)
return df
@_assert_dotnet_loaded
def process_database(connection_string, refresh_type, db_name):
process_model(
connection_string=connection_string,
item_type="model",
refresh_type=refresh_type,
db_name=db_name,
)
@_assert_dotnet_loaded
def process_table(connection_string, table_name, refresh_type, db_name):
process_model(
connection_string=connection_string,
item_type="table",
item=table_name,
refresh_type=refresh_type,
db_name=db_name,
)
@_assert_dotnet_loaded
def process_model(connection_string, db_name, refresh_type="full", item_type="model", item=None):
"""
Processes SSAS data model to get new data from underlying source.
Parameters
-------------
connection_string : string
Valid SSAS connection string, use the set_conn_string() method to set
db_name : string
The data model on the SSAS server to process
refresh_type : string, default `full`
Type of refresh to process. Currently only supports `full`.
item_type : string, choice of {'model','table'}, default 'model'
item : string, optional.
Then name of the item. Only needed when item_type is 'table', to specify the table name
"""
assert item_type.lower() in ("table", "model"), f"Invalid item type: {item_type}"
if item_type.lower() == "table" and not item:
raise ValueError("If item_type is table, must supply an item (a table name) to process")
# connect to the AS instance from Python
AMOServer = AMO.Server()
logger.info("Connecting to database...")
AMOServer.Connect(connection_string)
# Dict of refresh types
refresh_dict = {"full": AMO.RefreshType.Full}
# process
db = AMOServer.Databases[ db_name ]
if item_type.lower() == "table":
table = db.Model.Tables.Find(item)
table.RequestRefresh(refresh_dict[ refresh_type ])
else:
db.Model.RequestRefresh(refresh_dict[ refresh_type ])
op_result = db.Model.SaveChanges()
if op_result.Impact.IsEmpty:
logger.info("No objects affected by the refresh")
logger.info("Disconnecting from Database...")
# Disconnect
AMOServer.Disconnect()
No you do not need to install System
or any of the .Net imports, those are .Net libraries that we are loading through Python using Pythonnet. This should work as written. Is it working?
No it doesn't seem to be working. The errors I find when I try running my Quickstart file (as stated above) are:
NameError: name 'DataTable' is not defined
: Which I think is caused by the for line of code in the ssas_api.py script not being detected. During handling of the above exception, another exception occurred: FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Windows\\Microsoft.NET\\assembly\\GAC_MSIL\\Microsoft.AnalysisServices.Tabular'
: Which I think is the result of the .dll files not being in the location required/ assumed by your ssas_api.py script? As a result might need to change the .dll file location in the script using the python snippet provided? I might be wrong about this?I just want to ensure that if we take the Quickstart & ssas_api.py (as is) into the same directory and run the Quickstart it should work as is? I am not entirely sure if I need to change anything or am doing something wrong. I just want to be able send a DAX query against AAS and receive a response and store the result in a dataframe. Not sure if you are able to provide further guidance on how to do this?
I did also follow the steps in the README.md "Getting The Required .Net Libraries" regarding the .net libraries and confirmed that the following python snippet applies in my case:
base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0" # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"
Do I need to copy this info into the ssas_api.py file in anyway for it to run?
yes that is the issue. your .Net DLLs are not being found. All of the functions optionally take params to specify where they are (passing to the underlying _load_assemblies
decorator function), if no params are passed then it looks in the default locations, see the code in the snippet below.
https://github.com/yehoshuadimarsky/python-ssas/blob/fc189377d6121b91fe262e44b68b5351cb6c7bf6/ssas_api.py#L47-L59
So if it's not finding it in the default location, you should pass the location as params, such as this
df = ssas_api.get_DAX(
connection_string=conn,
dax_string=dax_string,
amo_path='C:/path/number/one',
adomd_path='C:/path/number/two'
)
Example in the code/docstring: https://github.com/yehoshuadimarsky/python-ssas/blob/fc189377d6121b91fe262e44b68b5351cb6c7bf6/ssas_api.py#L83-L96
Ok, that makes sense however even when I used the file paths as specififed ie AMO_PATH
& ADOMD_PATH
in the fashion that you are suggesting (inside the get_DAX
function) it stills throws the error. I have attached a screenshot of the error and the code chunk I think its referring too.
That doesn't make sense.
The error that I am still getting is the same as what I described above.
NameError: DataTable not defined
and if you look above you can see the greyed out code that I was mentioning. For some reason this code isn't being detected by Pycharm IDE not sure what I am missing or why this is but as a result it seems the ssas_api.py script doesn't work/execute as intended and hence cannot define the DataTable
as the error says.Hey @nisharncslabs, I know it's been a while since you created this issue and I'm not the owner of this project. But since i encountered the same issues I want to share how I solved them and my understanding on whats the problem in your case.
First of all: The lines 70-73 should be highlighted by your IDE. It reads these imports and looks for matching python packages, which are obviously not available. The imports are loaded through pythonnet from the .DLL files. But this is just a warning and can be safely ignored, because we know why they occur.
Now to your error message. I think python is a bit misleading in this case. It reports two exceptions:
Because the second exception is thrown in the except block python shows you both exception.
Now to solve the problem: (I can only tell you how I solved it 🙈)
I copied both .dll files (Tabular and AdomdClient) to the root directory of my project.
In my first call to python_ssas I passed absolute paths to both files.
conn = powerbi.set_conn_string('workspace', 'datset', 'user', 'password', amo_path='C:\\Users\\...\\Microsoft.AnalysisServices.Tabular.dll', adomd_path='C:\\Users\\...\\Microsoft.AnalysisServices.AdomdClient.dll')
For me this solves the error. If this isn't working for you play around with importing the dlls for a bit: Create a new python file to test just the import without an extra stuff.
import clr
clr.AddReference("System")
clr.AddReference("System.Data")
clr.AddReference(<path to your tabular dll>)
clr.AddReference(<path to your adomdclient dll>)
import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as AMO
import Microsoft.AnalysisServices.AdomdClient as ADOMD
Adjust the paths until you get this script running and then supply the same paths to your first call to one of the python_ssas functions.
Hope this helps 😃
Hi, I followed the steps in the README.md file and imported the required dll extension (and edited the required version info in the python file path snippet provided as well).
However, once I go to run the a version of the Quickstart I run into an error:
NameError: name 'DataTable' is not defined
.Here is the python file path snippet with the correct version and path that I have on my machine:
My question is:
ssas_api.py
script I have on my local machine (line 70-73) the following commands aren't working (greyed out in PyCharm Editor):Not sure if I need to install a package for this to work. I tried performing
pip install System
but that didn't seem to work, it gave the following errorERROR: Could not find a version that satisfies the requirement System (from versions: none)