tayganr / purviewdemo

Azure Purview Demo Generator
https://aka.ms/pvdemo
MIT License

Classifiable Sample Data #8

Open JWStarkie opened 3 years ago

JWStarkie commented 3 years ago

Current State

Limitations

Request

tayganr commented 3 years ago

@JWStarkie - The AdventureWorks sample used in the Azure SQL Database and the Bing CoronavirusQuerySet used for the ADLS Gen2 account should result in some classified assets. Is your deployment not reflecting the same? [image]

JWStarkie commented 3 years ago

@tayganr - It didn't come up when testing yesterday evening. I suppose it needs more time to process. I've got the following assets that have been classified. [image] Perhaps I can adjust this issue to increase the number of data assets instead?

tayganr commented 3 years ago

+1 on needing better/more sample data (particularly for the data lake). Preferably, if there are open data sets available in other GitHub repositories, we can simply reference them (as opposed to copying data across to keep this repo light).

Below is the sample code currently used in the postDeploymentScript.ps1 file to load the Microsoft Bing Coronavirus data set.

# 9. Load Storage Account with Sample Data
$containerName = "bing"
$storageAccount = Get-AzStorageAccount -ResourceGroupName $resource_group -Name $storage_account_name
# Download the repository as a zip archive via the GitHub API
$RepoUrl = 'https://api.github.com/repos/microsoft/BingCoronavirusQuerySet/zipball/master'
Invoke-RestMethod -Uri $RepoUrl -OutFile "${containerName}.zip"
# Extract into a folder named after the archive (the default destination), then upload every file
Expand-Archive -Path "${containerName}.zip"
Set-Location -Path "${containerName}"
Get-ChildItem -File -Recurse | Set-AzStorageBlobContent -Container ${containerName} -Context $storageAccount.Context

FYI: URL pattern for downloading a GitHub repo as a zip file (replace master with another branch, tag, or commit if needed): https://api.github.com/repos/{repoOwner}/{repoName}/zipball/master
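
If more open data sets end up being referenced this way, the URL pattern above could be wrapped in a small helper. This is only a sketch: `Get-GitHubZipballUrl` and its parameter names are hypothetical and not part of postDeploymentScript.ps1, and the default ref of `master` is an assumption carried over from the pattern above.

```powershell
# Hypothetical helper (not in postDeploymentScript.ps1): builds the zipball
# download URL for a GitHub repository from its owner, name, and ref.
function Get-GitHubZipballUrl {
    param(
        [Parameter(Mandatory)] [string] $RepoOwner,
        [Parameter(Mandatory)] [string] $RepoName,
        # Assumption: 'master' is the branch to download; pass another branch, tag, or commit if needed.
        [string] $Ref = 'master'
    )
    "https://api.github.com/repos/${RepoOwner}/${RepoName}/zipball/${Ref}"
}

# Reproduces the URL hard-coded in the snippet above.
$RepoUrl = Get-GitHubZipballUrl -RepoOwner 'microsoft' -RepoName 'BingCoronavirusQuerySet'
```

The resulting `$RepoUrl` can be passed to `Invoke-RestMethod -Uri $RepoUrl -OutFile ...` exactly as in the existing script, so adding another data set would only require a different owner and repo name.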