migueesc123 / PowerBIRESTAPI

A Microsoft Power BI Data Connector or Power Query Connector for the Power BI REST API
MIT License
238 stars 74 forks

Data source error: column does not exist in the rowset. #56

Closed deepu299 closed 3 years ago

deepu299 commented 3 years ago

Hi @migueesc123, since ActivityLog looks up each ActivityLogType for the given range, I am seeing these kinds of errors from time to time. The refresh works without any issue from Power BI Desktop for the same date range (I am using the last 30 days). From the Power BI service it works sometimes, but an error occurs, I assume, when some activity type is not part of the range(?). Can you please help with how to resolve these?

  1. Data source error: The 'ExportedArtifactInfo' column does not exist in the rowset. Table: Event Activity Log.

  2. Data source error: The 'WorkspaceAccessList' column does not exist in the rowset. Table: Event Activity Log.

Thank you.

migueesc123 commented 3 years ago

Yeah, the main challenge with that endpoint is that MSFT doesn't provide any sort of list of the possible fields that could come back from it, so what the connector does is implement some heuristics (using Table.Combine) based on the fields available when you create your query.
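To illustrate that Table.Combine behavior with toy tables (hypothetical data, not the connector's actual internals): combining tables with different column sets unions the columns and fills the gaps with null, so the result's shape depends on which activity types show up in the requested range.

```m
let
    // Hypothetical batches: only some activity types carry the extra column
    BatchA = #table({"Id", "Activity"}, {{"1", "ViewReport"}}),
    BatchB = #table({"Id", "Activity", "ExportedArtifactInfo"}, {{"2", "ExportReport", "report.pdf"}}),
    // Table.Combine unions the column sets; BatchA rows get null for ExportedArtifactInfo
    Combined = Table.Combine({BatchA, BatchB})
in
    Combined
```

A downstream step that hard-codes ExportedArtifactInfo will work on days when a batch like BatchB exists and fail on days when it doesn't, which matches the errors reported above.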

There are three ways to avoid these:

  1. You stop adding a Changed Type step to your query and, more generally, stop adding any fixed references to column names in your query.
  2. You provide a schema to the table and make it so that the output of that query is always that same table schema.
  3. We completely change how this table gets created in the custom connector so that it returns something that wouldn't trigger an error. That said, it would be 10x harder for the average user to gain insights, because the data would be in a shape that is much harder for a regular user to work with.
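A minimal sketch of option 2, assuming the connector's EventActivityLog function and the RangeStart/RangeEnd parameters from this thread; the reduced column list is just an example. Table.SelectColumns with MissingField.UseNull adds any absent column as null instead of raising the "column does not exist in the rowset" error, so the query output always has the same schema:

```m
let
    // Hypothetical fixed column list the query should always return
    ExpectedColumns = {"Id", "CreationTime", "Operation", "UserId", "Activity"},
    Raw = EventActivityLog(Date.From(RangeStart), Date.From(RangeEnd)),
    // MissingField.UseNull fills absent columns with null instead of erroring
    Fixed = Table.SelectColumns(Raw, ExpectedColumns, MissingField.UseNull)
in
    Fixed
```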
deepu299 commented 3 years ago

Thanks for the quick response, Miguel. I removed all the steps and monitored it for 10 days; it is still failing with a similar error. I will follow your suggestions to see if something works; otherwise I will exclude this table and use the other queries.

Thanks a lot, this is very useful for my case.

b77uw771b commented 2 years ago

@migueesc123 Can you please elaborate on the suggestion to: "provide a schema to the table and make it so that the output of that query is always that same table schema"? Is there somewhere I can learn to do this? Thanks for your help!

migueesc123 commented 2 years ago

I don't have any resources, but basically what I'd suggest is to learn more about what partitions are in SSAS Tabular and how important the schema is. Depending on your usage of the Power BI service, a refresh of this specific function will sometimes give you more columns and sometimes fewer or none. This means the schema of the table is not constant, and you'll need to address that fact to make things work in a scenario where you create partitions, since all partitions must share the same schema.

b77uw771b commented 2 years ago

I was able to get incremental refresh working on the Event Activity Log API connector using the advice from your blog post about SharePoint. I figured out how to define a default fixed schema in M: I specified the columns I wanted from the function, then created an empty table with the same schema as a fallback.

let
    Source = PowerBIRESTAPI.Navigation(),
    Functions = Source{[Key = "Functions"]}[Data],
    EventActivityLog = Functions{[Key = "EventActivityLog"]}[Data],
    #"Invoke Function EventActivityLog" = EventActivityLog(Date.From(RangeStart), Date.From(RangeEnd)),
    // Keep only the columns I want from the function
    #"Columns To Keep" = Table.SelectColumns(
        #"Invoke Function EventActivityLog",
        {"Id", "CreationTime", "Operation", "UserId", "Activity"}
    ),
    // Empty table with the same schema, used as a fallback
    #"Columns To Keep SCHEMA ONLY" = #table(
        type table [Id = number, CreationTime = text, Operation = text, UserId = text, Activity = text],
        {}
    )
in
    try #"Columns To Keep" otherwise #"Columns To Keep SCHEMA ONLY"

Now I have a PBIX in the service with incremental refresh running on any window of time that I choose! Yay! Thanks for the guidance! Couldn't have done this without you!!!

nheuk commented 1 year ago

Your solution inspired me; however, I wanted the highest common denominator rather than the lowest. I would like to share my solution. First, I manually executed the EventActivityLog API with the maximum time interval of 30 days to get the maximal table structure.

let
    // Define the table schema
    tableSchema = 
        #table(type table 
        [
            Activity = text
            , ActivityId = text
            , AggregatedWorkspaceInformation = text
            , AppId = text
            , AppName = text
            , AppReportId = text
            , ArtifactId = text
            , ArtifactKind = text
            , ArtifactName = text
            , ArtifactObjectId = text
            , AuditedArtifactInformation = text
            , CapacityId = text
            , CapacityName = text
            , ClientIP = text
            , ConsumptionMethod = text
            , CreationTime = datetime
            , CustomVisualAccessTokenResourceId = text
            , CustomVisualAccessTokenSiteUri = text
            , DashboardId = text
            , DashboardName = text
            , DataConnectivityMode = text
            , DataflowAccessTokenRequestParameters = text
            , DataflowAllowNativeQueries = text
            , DataflowId = text
            , DataflowName = text
            , DataflowRefreshScheduleType = text
            , DataflowType = text
            , DatasetId = text
            , DatasetName = text
            , Datasets = text
            , DatasourceId = text
            , DatasourceObjectIds = text
            , DeploymentPipelineId = text
            , DeploymentPipelineObjectId = text
            , DeploymentPipelineStageOrder = text
            , DistributionMethod = text
            , EmbedTokenId = text
            , EndPoint = text
            , Experience = text
            , ExportedArtifactInfo = text
            , ExportEventEndDateTimeParameter = text
            , ExportEventStartDateTimeParameter = text
            , ExternalResource = text
            , ExternalSubscribeeInformation = text
            , FolderDisplayName = text
            , FolderObjectId = text
            , GatewayClusterDatasources = text
            , GatewayClusterId = text
            , GatewayClusters = text
            , GatewayId = text
            , GatewayStatus = text
            , HasFullReportAttachment = text
            , Id = text
            , ImportDisplayName = text
            , ImportId = text
            , ImportSource = text
            , ImportType = text
            , InstallTeamsAnalyticsInformation = text
            , IsSuccess = text
            , IsTenantAdminApi = text
            , ItemName = text
            , ItemsCount = text
            , LastRefreshTime = text
            , ModelId = text
            , ModelsSnapshots = text
            , Monikers = text
            , ObjectDisplayName = text
            , ObjectId = text
            , ObjectType = text
            , Operation = text
            , OrganizationId = text
            , OrgAppPermission = text
            , OriginalOwner = text
            , PinReportToTabInformation = text
            , RecordType = text
            , RefreshType = text
            , ReportId = text
            , ReportName = text
            , ReportType = text
            , RequestId = text
            , ResultStatus = text
            , Schedules = text
            , ShareLinkId = text
            , SharingAction = text
            , SharingInformation = text
            , SharingScope = text
            , SubscribeeInformation = text
            , SubscriptionSchedule = text
            , TableName = text
            , TakingOverOwner = text
            , UserAgent = text
            , UserId = text
            , UserInformation = text
            , UserKey = text
            , UserType = text
            , Workload = text
            , WorkspaceId = text
            , WorkSpaceName = text
        ],{}),

    // Define the list of columns to select
    columnsToSelect = Table.ColumnNames(tableSchema),

    // Retrieve the Event Activity Log incrementally
    PowerBIRESTAPI = PowerBIRESTAPI.Navigation(),
    Functions = PowerBIRESTAPI{[Key = "Functions"]}[Data],
    EventActivityLog = Functions{[Key = "EventActivityLog"]}[Data],
    Invoke_Function_EventActivityLog = EventActivityLog(Date.From(RangeStart), Date.From(RangeEnd)),

    // Check which columns are present in the retrieved Event Activity Log table
    availableColumns = Table.ColumnNames(Invoke_Function_EventActivityLog), 
    existingColumns = List.Select(columnsToSelect, each List.Contains(availableColumns, _)),

    // If at least one column is present, select only the existing columns
    outputTable = if List.Count(existingColumns) > 0 then Table.SelectColumns(Invoke_Function_EventActivityLog, existingColumns) else Invoke_Function_EventActivityLog,

    // Append the Event Activity Log to the predefined table schema
    Event_Activity_Log_Incremental = Table.Combine({tableSchema, outputTable}),
    Changed_Type = Table.TransformColumnTypes(Event_Activity_Log_Incremental,{{"CreationTime", type datetime}})
in
    Changed_Type

This code first defines a table schema containing the expected column names. It then checks which of those columns are present in the retrieved Event Activity Log table and, if at least one is present, selects only those; if none are, it passes the retrieved table through unchanged. Either way, the result is appended to the empty schema table with Table.Combine, so the output always contains every expected column (missing ones are filled with null), and CreationTime is cast back to datetime at the end.

By using this technique, you can ensure that your Power BI queries refresh without errors and return reliable results, even when the set of columns returned by the API changes over time.
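The pattern from this thread can be distilled into a small reusable helper (a sketch; NormalizeToSchema is a hypothetical name, not part of the connector):

```m
// Hypothetical helper: coerce any table into the shape of a given schema table
NormalizeToSchema = (data as table, schema as table) as table =>
    let
        // Keep only the schema's columns, adding missing ones as null
        Kept = Table.SelectColumns(data, Table.ColumnNames(schema), MissingField.UseNull),
        // Appending to the empty schema table fixes the column order
        Ordered = Table.Combine({schema, Kept})
    in
        Ordered
```

Called as NormalizeToSchema(Invoke_Function_EventActivityLog, tableSchema), this would replace the existingColumns/outputTable steps above while producing the same guaranteed shape.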