RuiRomano / pbimonitor

MIT License

Not all datasets return schema detail #24

Closed by jhoolachan 1 year ago

jhoolachan commented 1 year ago

Hi Rui,

Fantastic tool and repo...thank you! We are running into an issue that we have been unable to resolve.

Some, but not all, of our datasets return no schema info. In particular, our main enterprise data model is present in each of our three environments (Dev, Test, Prod) and is promoted through them via a deployment pipeline. Pbimonitor returns schema information for the dataset in Dev and Test but not for the dataset in Prod. All the correct tenant settings have been enabled.

For troubleshooting, we deleted all of the subfolders in the "Data" folder along with the state file to force a full scan. We also set the catalog URI parameters to:

GetModified parameters 'excludePersonalWorkspaces=false&excludeInActiveWorkspaces=false'
GetInfo parameters 'lineage=true&datasourceDetails=true&getArtifactUsers=true&datasetSchema=true&datasetExpressions=true'
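
For readers who want to see what these parameter strings turn into, here is a minimal sketch of assembling the two request URLs. It assumes the parameters are appended to the Power BI admin endpoints `workspaces/modified` and `workspaces/getInfo`; the helper name `build_url` is hypothetical and is not taken from pbimonitor.

```python
from urllib.parse import urlencode

# Assumed base path of the Power BI admin (scanner) endpoints.
ADMIN_BASE = "https://api.powerbi.com/v1.0/myorg/admin/workspaces"


def build_url(operation, flags):
    """Hypothetical helper: append boolean flags as a lowercase query string."""
    query = urlencode({name: str(value).lower() for name, value in flags.items()})
    return f"{ADMIN_BASE}/{operation}?{query}"


# Same flags as the GetModified parameters above.
modified_url = build_url(
    "modified",
    {"excludePersonalWorkspaces": False, "excludeInActiveWorkspaces": False},
)

# Same flags as the GetInfo parameters above.
get_info_url = build_url(
    "getInfo",
    {
        "lineage": True,
        "datasourceDetails": True,
        "getArtifactUsers": True,
        "datasetSchema": True,
        "datasetExpressions": True,
    },
)

print(modified_url)
print(get_info_url)
```

Only URL construction is shown; authentication and the actual requests are left out.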

Do you have any suggestions? Are there any settings/characteristics of a dataset that would prevent it from returning schema data?

It is probably unrelated, but the "Activity" script has also thrown the following warning, which we resolved by raising the depth parameter to 6:

Getting audit data for: '20221005'
Writing '1844' audits
WARNING: Resulting JSON is truncated as serialization has exceeded the set depth of 5.

jhoolachan commented 1 year ago

In case anyone with the same issue finds this: per the Microsoft support engineer assigned to our case, the datasets have to undergo a full refresh to meet the caching criteria mentioned in the documentation. We are using the XMLA endpoint to perform table-level refreshes, which apparently does not trigger the caching, and that in turn prevents the scanner API from returning dataset schema info. The solution in our case was to schedule a refresh of the full dataset each weekend during off hours to ensure caching occurs.
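
To spot which datasets are affected, a small check over a saved scan result may help. The sketch below assumes the scan result JSON has a top-level `workspaces` list, each with a `datasets` list, and that schema shows up as a non-empty `tables` list on the dataset. The function name and the sample data are hypothetical.

```python
def datasets_missing_schema(scan_result):
    """Return (workspace name, dataset name) pairs that have no 'tables' entry."""
    missing = []
    for workspace in scan_result.get("workspaces", []):
        for dataset in workspace.get("datasets", []):
            # A missing or empty 'tables' list is treated as "no schema returned".
            if not dataset.get("tables"):
                missing.append((workspace.get("name"), dataset.get("name")))
    return missing


# Hypothetical sample shaped like a scan result, not real output.
sample = {
    "workspaces": [
        {
            "name": "Dev",
            "datasets": [{"name": "Enterprise Model", "tables": [{"name": "Sales"}]}],
        },
        {
            "name": "Prod",
            "datasets": [{"name": "Enterprise Model"}],
        },
    ]
}

missing = datasets_missing_schema(sample)
print(missing)
```

Running this before and after a full refresh would show whether the dataset has started returning schema info.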