microsoft / coe-starter-kit

Other
751 stars 221 forks source link

[CoE Starter Kit - BUG] BYODL Model Driven App Dataflow #7696

Closed nasilvae closed 8 months ago

nasilvae commented 8 months ago

Does this bug already exist in our backlog?

Describe the issue

There was a problem refreshing the dataflow, the dataflow definition contained errors. Please fix the problem and try again. Additional information: Unexpected exception in query "Apps", Message: Expression.Error: There weren't enough elements in the enumeration to complete the operation..

Expected Behavior

Dataflow should run

What solution are you experiencing the issue with?

Core

What solution version are you using?

4.22

What app or flow are you having the issue with?

Model Driven App Dataflow

What method are you using to get inventory and telemetry?

Data Export

Steps To Reproduce

No response

Anything else?

No response

AB#2342

Jenefer-Monroe commented 8 months ago

Can you confirm, did this still repro after the successful envt dataflow refresh?

Jenefer-Monroe commented 8 months ago

OK strange ask but can you try this? I found that the Data Export file for Apps had started to include MDAs so I wonder if its because bad data was getting written in the Apps table for MDAs.

  1. Open the PowerApps Apps table where you can delete rows and delete MDA rows without an App Unique Name. (Alternatively you could run the Check Deleted flows)

image

  1. Open the CoE BYODL Apps Dataflow and put this in the Advanced Editor for Apps query image

    let
    Source = AzureStorage.DataLake(DatalakeURL & "/powerapps/Applications"),
    #"Filtered rows" = Table.SelectRows(Source, each Text.EndsWith([Name], "canvas.json")),
    #"Changed last modified" = Table.TransformColumnTypes(Source, {{"Date modified", type date}}),
    // Filter to files that have been updated since the last Dataflow refresh OR all files if initial inventory
    #"Filter to recent" = Table.SelectRows( #"Changed last modified", each [Date modified] >= (if List.IsEmpty(DataflowRefresh) then Date.From("1900-01-01") else Date.From(DataflowRefresh{0}))),
    excludehiddenfiles = Table.SelectRows(#"Filter to recent", each [Attributes]?[Hidden]? <> true),
    excludezerolengthfiles = Table.SelectRows(excludehiddenfiles, each [Attributes]?[Size]? > 0),
    // Extract Environment GUID from File Name
    #"Extract Environment ID from File Name" = Table.TransformColumnTypes(Table.AddColumn(excludezerolengthfiles, "EnvironmentNameFromTable", each if Text.Length([Name]) = 47 then Text.Range([Name], 0,36) else Text.Range(Text.AfterDelimiter([Name], "-", 0), 0, 36)), {{"EnvironmentNameFromTable", type text}}),
    // Remove unnecessary columns from the folder system, like extension and folder path.
    #"Removed unneeded columns" = Table.RemoveColumns(#"Extract Environment ID from File Name", {"Name", "Folder Path", "Attributes", "Date created", "Date modified", "Date accessed", "Extension"}),
    // Inner join of app table with environments table, only keep apps where the environment still exists. This filters out apps from deleted environments.
    #"Merge with environments" = Table.NestedJoin(#"Removed unneeded columns", {"EnvironmentNameFromTable"}, admin_environments, {"admin_environmentid"}, "admin_environments", JoinKind.Inner),
    #"Expanded Environments" = Table.ExpandTableColumn(#"Merge with environments", "admin_environments", {"admin_environmentid", "admin_displayname"}, {"admin_environmentid", "admin_environmentdisplayname"}),
    // Transform the content from the json files in the storage account into tables
    #"Get content from json files" = Table.AddColumn(#"Expanded Environments", "Transform file", each #"Transform file"([Content])),
    // If there any errors in the transformation, we assume the json file in the datalake has become corrupted and we remove those errors to enable further processing of the data
    #"Removed errors" = Table.RemoveRowsWithErrors(#"Get content from json files", {"Transform file"}),
    // Expand all the columns the json file in the storage account provides into individual columns, e.g. app name, app id, created by, created on...
    #"Expand columns" = Table.ExpandTableColumn(#"Removed errors", "Transform file", {"resourceId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "backgroundImageUri", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}, {"appId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "iconUrl", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}),
    #"Removed appId errors" = Table.RemoveRowsWithErrors(#"Expand columns", {"appId"}),
    // App ID has to be unique in this table, remove duplicates which can occur when files are processed incorrectly
    #"Removed duplicate appIds" = Table.Distinct(#"Removed appId errors", {"appId", "environmentId"}),
    // Transform the Power Apps subType column into the text format required by the Choice column in the PowerApps App table. Change the text to something understandable vs internal names.
    #"Add Power Apps Type" = Table.TransformColumnTypes(Table.AddColumn(#"Removed duplicate appIds", "PowerApps Type", each if [subType] = "AppModule" then "Model Driven" else if [subType] = "CustomCanvasPage" then "Custom Page" else if [subType] = "ClassicCanvasApp" then "Canvas" else if [subType] = "CanvasApp" then "Component Library" else null), {{"PowerApps Type", type text}}),
    // If the app has been created by SYSTEM user, replace in the SYSTEM user id
    #"Replaced SYSTEM createdBy" = Table.ReplaceValue(#"Add Power Apps Type", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"createdPrincipalId"}),
    // If the app has been modified by the SYSTEM user, replace in the SYSTEM user id from the maker table.
    #"Replaced SYSTEM modifiedBY" = Table.ReplaceValue(#"Replaced SYSTEM createdBy", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"lastModifiedPrincipalId"}),
    #"Merged with Apps" = Table.NestedJoin(#"Replaced SYSTEM modifiedBY", {"appId"}, admin_apps, {"admin_appid"}, "admin_apps", JoinKind.LeftOuter),
    #"Expanded admin_apps" = Table.ExpandTableColumn(#"Merged with Apps", "admin_apps", {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}, {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}),
    // Merge apps with maker to get more maker details, such as maker display name, department, is orphaned etc for furhter logic (is the app orphaned)
    #"Merged with makers" = Table.NestedJoin(#"Expanded admin_apps", {"admin_appownermakerid"}, admin_makers, {"admin_makerid"}, "admin_makers", JoinKind.LeftOuter),
    #"Expanded maker table" = Table.ExpandTableColumn(#"Merged with makers", "admin_makers", {"admin_makerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}, {"admin_ownermakerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}),
    // The table may hold stale data, add utcNow as queried on to be able to identify when the dataflow last queried a record.
    #"Add QueriedOn" = Table.TransformColumnTypes(Table.AddColumn(#"Expanded maker table", "QueriedOn", each DateTimeZone.UtcNow()), {{"QueriedOn", type datetimezone}}),
    #"Inserted conditional column" = Table.AddColumn(#"Add QueriedOn", "markDeleted", each if [appId] <> null then false else true),
    #"Inserted conditional column 1" = Table.AddColumn(#"Inserted conditional column", "markDeletedDate", each if [markDeleted] = "Yes" then [QueriedOn] else null),
    // Add yes/no text field to identify if the app is owned by a maker that no longer exists in the org
    #"Add IsOrphan" = Table.AddColumn(#"Inserted conditional column 1", "IsOrphanText", each if [admin_makerisorphaned] = false then "No" else if [admin_makerisorphaned] = true then "Yes" else null, type text),
    // Merge environment and app id into one text field as the most unique identifier for an app in a tenant
    #"Merge environmentid and appid" = Table.AddColumn(#"Add IsOrphan", "AppUniqueName", each Text.Combine({[EnvironmentNameFromTable], "_", [appId]}), type text),
    #"Removed Content binary column" = Table.RemoveColumns(#"Merge environmentid and appid", {"Content"}),
    #"Changed column type" = Table.TransformColumnTypes(#"Removed Content binary column", {{"EnvironmentNameFromTable", type text}, {"admin_environmentid", type text}, {"admin_environmentdisplayname", type text}, {"appId", type text}, {"environmentId", type text}, {"subType", type text}, {"DocumentVersion", type text}, {"name", type text}, {"description", type text}, {"uri", type text}, {"tenantId", type text}, {"lifecycleState", type text}, {"owner", type text}, {"createdTime", type datetime}, {"createdPrincipalId", type text}, {"lastModifiedTime", type datetime}, {"lastModifiedPrincipalId", type text}, {"lastEnabledTime", type datetime}, {"lastEnabledPrincipalId", type text}, {"sharedUsers", Int64.Type}, {"sharedGroups", Int64.Type}, {"PowerApps Type", type text}, {"admin_appownermakerid", type text}, {"admin_ownermakerid", type text}, {"admin_userprincipalname", type text}, {"admin_makerisorphaned", type logical}, {"admin_department", type text}, {"admin_displayname", type text}, {"admin_company", type text}, {"admin_city", type any}, {"admin_country", type any}, {"QueriedOn", type datetimezone}, {"IsOrphanText", type text}, {"AppUniqueName", type text}, {"markDeleted", type logical}, {"markDeletedDate", type datetimezone}, {"usesCustomApi", type text}, {"usesPremiumApi", type text}, {"usesOnPremiseGateway", type text}}),
    #"Remove MDAs" = Table.SelectRows(#"Changed column type", each [subType] <> "AppModule")
    in
    #"Remove MDAs"
  2. Publish that change

  3. Rerun MDA dataflow

nasilvae commented 8 months ago

When we checked the PowerApps Apps table it did not had any MDAs. We did noticed however multiple entries of records that the App type column was empty.

image The app dataflow is not able run yet, after applying the fix we get the same error.

Jenefer-Monroe commented 8 months ago

ok looks like the same error then. thats a bummer. ill see if Manuela has an idea for your issue.

nasilvae commented 8 months ago

ok looks like the same error then. thats a bummer. ill see if Manuela has an idea for your issue.

We noticed that there is over 5k records that do no have an app type nor an unique name (unique name is empty). We are performing a bulk delete on these records just in case.

Jenefer-Monroe commented 8 months ago

ok im making sure the way the export tracks app type hasnt changed in need on an update.

Jenefer-Monroe commented 8 months ago

ok the blank types were likely component libraries. also a change to the data export file. updated: https://github.com/microsoft/coe-starter-kit/issues/7697

nasilvae commented 8 months ago

Hey so, I have this modified query for the apps dataflow. This helps with the validation of the null values in the admin_maker table that seems to be causing the issue. So far the query seems to be working. let Source = AzureStorage.DataLake(DatalakeURL & "/powerapps/Applications"),

"Filtered rows" = Table.SelectRows(Source, each Text.EndsWith([Name], "canvas.json")),

#"Changed last modified" = Table.TransformColumnTypes(Source, {{"Date modified", type date}}),
// Filter to files that have been updated since the last Dataflow refresh OR all files if initial inventory
#"Filter to recent" = Table.SelectRows(#"Changed last modified", each [Date modified] >= (if List.IsEmpty(DataflowRefresh) then Date.From("1900-01-01") else Date.From(DataflowRefresh{0}))),
excludehiddenfiles = Table.SelectRows(#"Filter to recent", each [Attributes]?[Hidden]? <> true),
excludezerolengthfiles = Table.SelectRows(excludehiddenfiles, each [Attributes]?[Size]? > 0),
// Extract Environment GUID from File Name
#"Extract Environment ID from File Name" = Table.TransformColumnTypes(Table.AddColumn(excludezerolengthfiles, "EnvironmentNameFromTable", each if Text.Length([Name]) = 47 then Text.Range([Name], 0,36) else Text.Range(Text.AfterDelimiter([Name], "-", 0), 0, 36)), {{"EnvironmentNameFromTable", type text}}),
// Remove unnecessary columns from the folder system, like extension and folder path.
#"Removed unneeded columns" = Table.RemoveColumns(#"Extract Environment ID from File Name", {"Name", "Folder Path", "Attributes", "Date created", "Date modified", "Date accessed", "Extension"}),
// Inner join of app table with environments table, only keep apps where the environment still exists. This filters out apps from deleted environments.
#"Merge with environments" = Table.NestedJoin(#"Removed unneeded columns", {"EnvironmentNameFromTable"}, admin_environments, {"admin_environmentid"}, "admin_environments", JoinKind.Inner),
#"Expanded Environments" = Table.ExpandTableColumn(#"Merge with environments", "admin_environments", {"admin_environmentid", "admin_displayname"}, {"admin_environmentid", "admin_environmentdisplayname"}),
// Transform the content from the json files in the storage account into tables
#"Get content from json files" = Table.AddColumn(#"Expanded Environments", "Transform file", each #"Transform file"([Content])),
// If there any errors in the transformation, we assume the json file in the datalake has become corrupted and we remove those errors to enable further processing of the data
#"Removed errors" = Table.RemoveRowsWithErrors(#"Get content from json files", {"Transform file"}),
// Expand all the columns the json file in the storage account provides into individual columns, e.g. app name, app id, created by, created on...
#"Expand columns" = Table.ExpandTableColumn(#"Removed errors", "Transform file", {"resourceId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "backgroundImageUri", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}, {"appId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "iconUrl", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}),
#"Removed appId errors" = Table.RemoveRowsWithErrors(#"Expand columns", {"appId"}),
// App ID has to be unique in this table, remove duplicates which can occur when files are processed incorrectly
#"Removed duplicate appIds" = Table.Distinct(#"Removed appId errors", {"appId", "environmentId"}),
// Transform the Power Apps subType column into the text format required by the Choice column in the PowerApps App table. Change the text to something understandable vs internal names.
#"Add Power Apps Type" = Table.TransformColumnTypes(Table.AddColumn(#"Removed duplicate appIds", "PowerApps Type", each if [subType] = "AppModule" then "Model Driven" else if [subType] = "CustomCanvasPage" then "Custom Page" else if [subType] = "ClassicCanvasApp" then "Canvas" else if [subType] = "CanvasApp" then "Component Library" else null), {{"PowerApps Type", type text}}),
// If the app has been created by SYSTEM user, replace in the SYSTEM user id
#"Replaced SYSTEM createdBy" = Table.AddColumn(#"Add Power Apps Type", "ActualCreatedPrincipalId", each if [createdPrincipalId] = null then Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid] else [createdPrincipalId], type text),
// If the app has been modified by the SYSTEM user, replace in the SYSTEM user id from the maker table.
#"Replaced SYSTEM modifiedBY" = Table.AddColumn(#"Replaced SYSTEM createdBy", "ActualLastModifiedPrincipalId", each if [lastModifiedPrincipalId] = null then Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid] else [lastModifiedPrincipalId]),
#"Merged with Apps" = Table.NestedJoin(#"Replaced SYSTEM modifiedBY", {"appId"}, admin_apps, {"admin_appid"}, "admin_apps", JoinKind.LeftOuter),
#"Expanded admin_apps" = Table.ExpandTableColumn(#"Merged with Apps", "admin_apps", {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}, {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}),
// Merge apps with maker to get more maker details, such as maker display name, department, is orphaned etc for further logic (is the app orphaned)
#"Merged with makers" = Table.NestedJoin(#"Expanded admin_apps", {"admin_appownermakerid"}, admin_makers, {"admin_makerid"}, "admin_makers", JoinKind.LeftOuter),
#"Expanded maker table" = Table.ExpandTableColumn(#"Merged with makers", "admin_makers", {"admin_makerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}, {"admin_ownermakerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}),
// The table may hold stale data, add utcNow as queried on to be able to identify when the dataflow last queried a record.
#"Add QueriedOn" = Table.TransformColumnTypes(Table.AddColumn(#"Expanded maker table", "QueriedOn", each DateTimeZone.UtcNow()), {{"QueriedOn", type datetimezone}}),
#"Inserted conditional column" = Table.AddColumn(#"Add QueriedOn", "markDeleted", each if [appId] <> null then false else true),
#"Inserted conditional column 1" = Table.AddColumn(#"Inserted conditional column", "markDeletedDate", each if [markDeleted] = true then [QueriedOn] else null),
// Add yes/no text field to identify if the app is owned by a maker that no longer exists in the org
#"Add IsOrphan" = Table.AddColumn(#"Inserted conditional column 1", "IsOrphanText", each if [admin_makerisorphaned] = false then "No" else if [admin_makerisorphaned] = true then "Yes" else null, type text),
// Merge environment and app id into one text field as the most unique identifier for an app in a tenant
#"Merge environmentid and appid" = Table.AddColumn(#"Add IsOrphan", "AppUniqueName", each Text.Combine({[EnvironmentNameFromTable], "_", [appId]}), type text),
#"Removed Content binary column" = Table.RemoveColumns(#"Merge environmentid and appid", {"Content"}),
#"Changed column type" = Table.TransformColumnTypes(#"Removed Content binary column", {{"EnvironmentNameFromTable", type text}, {"admin_environmentid", type text}, {"admin_environmentdisplayname", type text}, {"appId", type text}, {"environmentId", type text}, {"subType", type text}, {"DocumentVersion", type text}, {"name", type text}, {"description", type text}, {"uri", type text}, {"tenantId", type text}, {"lifecycleState", type text}, {"owner", type text}, {"createdTime", type datetime}, {"ActualCreatedPrincipalId", type text}, {"lastModifiedTime", type datetime}, {"ActualLastModifiedPrincipalId", type text}, {"lastEnabledTime", type datetime}, {"lastEnabledPrincipalId", type text}, {"sharedUsers", Int64.Type}, {"sharedGroups", Int64.Type}, {"PowerApps Type", type text}, {"admin_appownermakerid", type text}, {"admin_ownermakerid", type text}, {"admin_userprincipalname", type text}, {"admin_makerisorphaned", type logical}, {"admin_department", type text}, {"admin_displayname", type text}, {"admin_company", type text}, {"admin_city", type any}, {"admin_country", type any}, {"QueriedOn", type datetimezone}, {"IsOrphanText", type text}, {"AppUniqueName", type text}, {"markDeleted", type logical}, {"markDeletedDate", type datetimezone}, {"usesCustomApi", type text}, {"usesPremiumApi", type text}, {"usesOnPremiseGateway", type text}}),
#"Remove MDAs" = Table.SelectRows(#"Changed column type", each [subType] <> "AppModule")

in

"Remove MDAs"

nasilvae commented 8 months ago

Now we only have one dataflow that is not working which is the model driven one. I tried using the same query above and change the filters to "adapt it" to mda but i didnt work. Let me know your thoughts on the query We are using for apps and if this is something that we can use for model driven apps dataflow. I suspect the model driven one will be fixed by a similar approach since it was the same error the enumeration error.

Jenefer-Monroe commented 8 months ago

Sorry can you please tell me what change you made in the above? That way I can validate it works. I cant just take the whole blob unfortunately.

My MDA one is working.

nasilvae commented 8 months ago

Sorry can you please tell me what change you made in the above? That way I can validate it works. I cant just take the whole blob unfortunately.

My MDA one is working.

my mda is having the enumeration issue still. The apps one used to have the enumeration issue but added some logic when it tries to get the values from the admin_maker table. Im not at the pc now but once I get a moment i can get more specific.

nasilvae commented 8 months ago

These Queries are based on the Apps Dataflow under the apps table. It fixed the enumerator issue but not sure how to apply this type of fix on the model driven app dataflow yet.

Query 1 is the Modified version I shared Query 2 is your version on this thread.

Here are the key differences:

Handling Null Values in "Replaced SYSTEM createdBy" and "Replaced SYSTEM modifiedBY": In Query 1, the columns "ActualCreatedPrincipalId" and "ActualLastModifiedPrincipalId" are explicitly set to type text, and then null values are replaced using Table.ReplaceValue function. In Query 2, the null values in the same columns are replaced directly using Table.ReplaceValue without explicitly setting the column types to text.

// Query 1

"Replaced SYSTEM createdBy" = Table.AddColumn(#"Add Power Apps Type", "ActualCreatedPrincipalId", each if [createdPrincipalId] = null then Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid] else [createdPrincipalId], type text),

"Replaced SYSTEM modifiedBY" = Table.AddColumn(#"Replaced SYSTEM createdBy", "ActualLastModifiedPrincipalId", each if [lastModifiedPrincipalId] = null then Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid] else [lastModifiedPrincipalId]),

// Query 2

"Replaced SYSTEM createdBy" = Table.ReplaceValue(#"Add Power Apps Type", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"createdPrincipalId"}),

"Replaced SYSTEM modifiedBY" = Table.ReplaceValue(#"Replaced SYSTEM createdBy", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"lastModifiedPrincipalId"}),

Handling Null Values in "Inserted conditional column 1": In Query 1, the "markDeletedDate" column is set to null for rows where [markDeleted] is true. The type of the column is datetimezone. In Query 2, the "markDeletedDate" column is set to null for rows where [markDeleted] is "Yes". The type of the column is datetimezone.

// Query 1

"Inserted conditional column 1" = Table.AddColumn(#"Inserted conditional column", "markDeletedDate", each if [markDeleted] = true then [QueriedOn] else null),

// Query 2

"Inserted conditional column 1" = Table.AddColumn(#"Inserted conditional column", "markDeletedDate", each if [markDeleted] = "Yes" then [QueriedOn] else null),

Jenefer-Monroe commented 8 months ago

the null type was due to a different issue, they changed how they marked up component libraries. So that part wont be needed.

Jenefer-Monroe commented 8 months ago

and I think the mark deleted is resolved by removing MDAs from the query. Can you please try this for the CoE BYODL Apps ddataflow apps query?

let
    Source = AzureStorage.DataLake(DatalakeURL & "/powerapps/Applications"),
  #"Filtered rows" = Table.SelectRows(Source, each Text.EndsWith([Name], "canvas.json")),
    #"Changed last modified" = Table.TransformColumnTypes(Source, {{"Date modified", type date}}),
  // Filter to files that have been updated since the last Dataflow refresh OR all files if initial inventory
  #"Filter to recent" = Table.SelectRows( #"Changed last modified", each [Date modified] >= (if List.IsEmpty(DataflowRefresh) then Date.From("1900-01-01") else Date.From(DataflowRefresh{0}))),
    excludehiddenfiles = Table.SelectRows(#"Filter to recent", each [Attributes]?[Hidden]? <> true),
    excludezerolengthfiles = Table.SelectRows(excludehiddenfiles, each [Attributes]?[Size]? > 0),
  // Extract Environment GUID from File Name
  #"Extract Environment ID from File Name" = Table.TransformColumnTypes(Table.AddColumn(excludezerolengthfiles, "EnvironmentNameFromTable", each if Text.Length([Name]) = 47 then Text.Range([Name], 0,36) else Text.Range(Text.AfterDelimiter([Name], "-", 0), 0, 36)), {{"EnvironmentNameFromTable", type text}}),
  // Remove unnecessary columns from the folder system, like extension and folder path.
  #"Removed unneeded columns" = Table.RemoveColumns(#"Extract Environment ID from File Name", {"Name", "Folder Path", "Attributes", "Date created", "Date modified", "Date accessed", "Extension"}),
  // Inner join of app table with environments table, only keep apps where the environment still exists. This filters out apps from deleted environments.
  #"Merge with environments" = Table.NestedJoin(#"Removed unneeded columns", {"EnvironmentNameFromTable"}, admin_environments, {"admin_environmentid"}, "admin_environments", JoinKind.Inner),
  #"Expanded Environments" = Table.ExpandTableColumn(#"Merge with environments", "admin_environments", {"admin_environmentid", "admin_displayname"}, {"admin_environmentid", "admin_environmentdisplayname"}),
  // Transform the content from the json files in the storage account into tables
  #"Get content from json files" = Table.AddColumn(#"Expanded Environments", "Transform file", each #"Transform file"([Content])),
  // If there any errors in the transformation, we assume the json file in the datalake has become corrupted and we remove those errors to enable further processing of the data
  #"Removed errors" = Table.RemoveRowsWithErrors(#"Get content from json files", {"Transform file"}),
  // Expand all the columns the json file in the storage account provides into individual columns, e.g. app name, app id, created by, created on...
  #"Expand columns" = Table.ExpandTableColumn(#"Removed errors", "Transform file", {"resourceId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "backgroundImageUri", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}, {"appId", "environmentId", "subType", "DocumentVersion", "name", "description", "uri", "tenantId", "lifecycleState", "owner", "createdTime", "createdPrincipalId", "lastModifiedTime", "lastModifiedPrincipalId", "lastEnabledTime", "lastEnabledPrincipalId", "sharedUsers", "sharedGroups", "appPlanClassification", "bypassConsent", "canConsumeAppPass", "sharepointFormURL", "status", "almMode", "usesDataverseTables", "dlpEvaluationStatus", "dlpLastEvaluationDate", "dlpViolationErrorMessage", "usesPcfExternalServiceUsage", "iconUrl", "backgroundColor", "usesCustomApi", "usesPremiumApi", "usesOnPremiseGateway"}),
  #"Merged with Apps" = Table.NestedJoin(#"Expand columns", {"appId"}, admin_apps, {"admin_appid"}, "admin_apps", JoinKind.LeftOuter),
  #"Removed appId errors" = Table.RemoveRowsWithErrors(#"Merged with Apps", {"appId"}),
  // App ID has to be unique in this table, remove duplicates which can occur when files are processed incorrectly
  #"Removed duplicate appIds" = Table.Distinct(#"Removed appId errors", {"appId", "environmentId"}),
  // Transform the Power Apps subType column into the text format required by the Choice column in the PowerApps App table. Change the text to something understandable vs internal names.
  #"Add Power Apps Type" = Table.TransformColumnTypes(Table.AddColumn(#"Removed duplicate appIds", "PowerApps Type", each if [subType] = "AppModule" then "Model Driven" else if [subType] = "CustomCanvasPage" then "Custom Page" else if [subType] = "ClassicCanvasApp" then "Canvas" else if [subType] = "AppComponentLibrary" then "Component Library" else null), {{"PowerApps Type", type text}}),
  // If the app has been created by SYSTEM user, replace in the SYSTEM user id
  #"Replaced SYSTEM createdBy" = Table.ReplaceValue(#"Add Power Apps Type", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"createdPrincipalId"}),
  // If the app has been modified by the SYSTEM user, replace in the SYSTEM user id from the maker table.
  #"Replaced SYSTEM modifiedBY" = Table.ReplaceValue(#"Replaced SYSTEM createdBy", null, Table.SelectRows(admin_makers, each [admin_displayname] = "SYSTEM"){0}[admin_makerid], Replacer.ReplaceValue, {"lastModifiedPrincipalId"}),
  #"Expanded admin_apps" = Table.ExpandTableColumn(#"Replaced SYSTEM modifiedBY", "admin_apps", {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}, {"admin_appid", "_admin_appenvironment_value", "admin_appownermakerid"}),
  // Merge apps with maker to get more maker details, such as maker display name, department, is orphaned etc for furhter logic (is the app orphaned)
  #"Merged with makers" = Table.NestedJoin(#"Expanded admin_apps", {"admin_appownermakerid"}, admin_makers, {"admin_makerid"}, "admin_makers", JoinKind.LeftOuter),
  #"Expanded maker table" = Table.ExpandTableColumn(#"Merged with makers", "admin_makers", {"admin_makerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}, {"admin_ownermakerid", "admin_userprincipalname", "admin_makerisorphaned", "admin_department", "admin_displayname", "admin_company", "admin_city", "admin_country"}),
  // The table may hold stale data, add utcNow as queried on to be able to identify when the dataflow last queried a record.
  #"Add QueriedOn" = Table.TransformColumnTypes(Table.AddColumn(#"Expanded maker table", "QueriedOn", each DateTimeZone.UtcNow()), {{"QueriedOn", type datetimezone}}),
  #"Inserted conditional column" = Table.AddColumn(#"Add QueriedOn", "markDeleted", each if [appId] <> null then false else true),
  #"Inserted conditional column 1" = Table.AddColumn(#"Inserted conditional column", "markDeletedDate", each if [markDeleted] = "Yes" then [QueriedOn] else null),
  // Add yes/no text field to identify if the app is owned by a maker that no longer exists in the org
  #"Add IsOrphan" = Table.AddColumn(#"Inserted conditional column 1", "IsOrphanText", each if [admin_makerisorphaned] = false then "No" else if [admin_makerisorphaned] = true then "Yes" else null, type text),
  // Merge environment and app id into one text field as the most unique identifier for an app in a tenant
  #"Merge environmentid and appid" = Table.AddColumn(#"Add IsOrphan", "AppUniqueName", each Text.Combine({[EnvironmentNameFromTable], "_", [appId]}), type text),
  #"Removed Content binary column" = Table.RemoveColumns(#"Merge environmentid and appid", {"Content"}),
  #"Changed column type" = Table.TransformColumnTypes(#"Removed Content binary column", {{"EnvironmentNameFromTable", type text}, {"admin_environmentid", type text}, {"admin_environmentdisplayname", type text}, {"appId", type text}, {"environmentId", type text}, {"subType", type text}, {"DocumentVersion", type text}, {"name", type text}, {"description", type text}, {"uri", type text}, {"tenantId", type text}, {"lifecycleState", type text}, {"owner", type text}, {"createdTime", type datetime}, {"createdPrincipalId", type text}, {"lastModifiedTime", type datetime}, {"lastModifiedPrincipalId", type text}, {"lastEnabledTime", type datetime}, {"lastEnabledPrincipalId", type text}, {"sharedUsers", Int64.Type}, {"sharedGroups", Int64.Type}, {"PowerApps Type", type text}, {"admin_appownermakerid", type text}, {"admin_ownermakerid", type text}, {"admin_userprincipalname", type text}, {"admin_makerisorphaned", type logical}, {"admin_department", type text}, {"admin_displayname", type text}, {"admin_company", type text}, {"admin_city", type any}, {"admin_country", type any}, {"QueriedOn", type datetimezone}, {"IsOrphanText", type text}, {"AppUniqueName", type text}, {"markDeleted", type logical}, {"markDeletedDate", type datetimezone}, {"usesCustomApi", type text}, {"usesPremiumApi", type text}, {"usesOnPremiseGateway", type text}}),
  #"Remove MDAs" = Table.SelectRows(#"Changed column type", each [subType] <> "AppModule")
in
    #"Remove MDAs"
nasilvae commented 8 months ago

Hey Jenefer,

We get the following error at the following step: image Expression.Error: There weren't enough elements in the enumeration to complete the operation. Details Reason = Expression.Error Detail = #table({"admin_jobtitle", "admin_displayname", "admin_userprincipalname", "admin_stateorprovince", "admin_makerid", "admin_useremail", "admin_accountenabled", "admin_recordguidasstring", "admin_userisserviceprinciple", "admin_office", "admin_preferredlanguage", "admin_department", "admin_city", "admin_company", "admin_country", "admin_makerisorphaned"}, {})

When we try the modified script it works but this one doesnt. I believe this error is because of the null values from the maker table which isnt complete because the cloud flow hasnt finished running. The modified query accounts for this issue.

nasilvae commented 8 months ago

Hey Jenefer,

We made a copy of the apps dataflow modified query and changed the exclude mda filter condition to now exclude everything except mdas. We then used the dataflow id of that new dataflow and replaced the mda dataflow id environment variable so its part of the refresh workflow.

We now have mda apps inventoried. All dataflows are working except for the original mda. Our Maker cloud flow is still running (1.5 days running) account is not used in any other solutions. We noticed that we have multiple apps that do not have the owner information in our reports but we suspect its because of the maker byodl cloudflow not complete.

image

Jenefer-Monroe commented 8 months ago

Chatted offline. Appears this is really about the on-complete flow for the Maker dataflow is running forever and so we dont get the needed makers here. Put this filter (admin_makerisorphaned eq true or admin_displayname eq 'Unknown') and createdon gt '@{formatDateTime(addDays(utcNow(), -1), 'yyyy-MM-dd')}'

Here image

nasilvae commented 8 months ago

Happy to report CoE is running on our side image

We still suspect that there might be some incomplete data given all of the extra customizations that we did on our side but we are hoping the cleanup flows cleans it up. Next week we will try to run the dataflows with the regular scripts. Will let you know if anything changes. Thank you again.

nasilvae commented 8 months ago

We noticed that the data is not being cleaned up. https://github.com/microsoft/coe-starter-kit/issues/7740