PlagueHO / CosmosDB

PowerShell Module for working with Azure Cosmos DB databases, collections, documents, attachments, offers, users, permissions, triggers, stored procedures and user defined functions.
http://dscottraynsford.com
MIT License
154 stars 46 forks source link

Getting all document data #382

Closed tamusjroyce closed 1 year ago

tamusjroyce commented 4 years ago

Issue

Following documentation and getting all documents in a collection, each document is missing fields (until appending)

Proposed Solution

Update documentation to show examples of using ... | ConvertTo-Json | ConvertFrom-Json where appropriate

Below has a few examples of using ... | ConvertTo-Json | ConvertFrom-Json

For reference only: Too much code to be on the readme:

$url = Read-Host -Prompt "Enter the Cosmos URL"
$port = Read-Host -Prompt "Enter the Cosmos Port" 
$key = Read-Host -Prompt "Enter the Cosmos key" -AsSecureString
$databaseName = Read-Host -Prompt "Enter the database name"
$containerName = Read-Host -Prompt "Enter the container name"

$backoffPolicy = New-CosmosDbBackoffPolicy -MaxRetries 20
$primaryKey = ConvertTo-SecureString -String $key.Trim() -AsPlainText -Force
if ($usesEmulator -eq $true) {
    $cosmosDbContext = New-CosmosDbContext -Emulator -Database $databaseName -URI $url -PORT $port -Key $primaryKey -BackoffPolicy $backoffPolicy
} else {
    $sourceCosmosDbContext = New-CosmosDbContext -Database $databaseName -URI $url -PORT $port -Key $sourceKey -BackoffPolicy $backoffPolicy
}

$collection = Get-CosmosDbCollection -Context $cosmosDbContext -Id $containerName
$collectionId = Get-CosmosDbCollectionResourcePath -Database $databaseName -Id $containerName
$partitionKey = ($collection | ConvertTo-Json | ConvertFrom-Json | % { $_.partitionKey.paths[0] }).TrimStart('/')

$documentsPerRequest = 20
$continuationToken = $null
$allDocumentHandles = $null

do {
    $responseHeader = $null
    $getCosmosDbDocumentParameters = @{
        Context = $cosmosDbContext
        CollectionId = 'MyNewCollection'
        MaxItemCount = $documentsPerRequest
        ResponseHeader = ([ref] $responseHeader)
    }

    if ($continuationToken) {
        $getCosmosDbDocumentParameters.ContinuationToken = $continuationToken
    }

    $documentHandles = Get-CosmosDbDocument @getCosmosDbDocumentParameters
    $allDocumentHandles += $documentHandles

    $documentHandles | Foreach-Object {
        # Format of output is different - both formats are helpful for different uses (please keep $_ same format)
        Write-Output $_
        Write-Output '----------------------------'
        $document = ($_ | ConvertTo-Json | ConvertFrom-Json)
        Write-Output $document

        # Now if you want to query all documents in the same partition, you have partition value to do so!
        Write-Output 'Partition Value'
        $partitionValue = (Invoke-Expression "`$document.$partitionKey")
        Write-Output $partitionValue
    }

    $continuationToken = Get-CosmosDbContinuationToken -ResponseHeader $responseHeader
} while (-not [System.String]::IsNullOrEmpty($continuationToken))

For reference only: Too much code to be on the readme

Environment

Before submitting your issue for the CosmosDB project, please take a moment to provide the following details:

Call Out

Last issue from me, I promise!

Just documentation. But it was something I was briefly stuck on. And found the solution through reading open issues.

tamusjroyce commented 4 years ago

Full Sample

Sample that copies between collections, allowing you to programatically change the partitionKey value

It encourages copying from Azure to the Emulator, and only copying to empty collections

https://docs.microsoft.com/en-us/azure/cosmos-db/synthetic-partition-keys explains why you would want to change the partitionKey values

if (Get-Module -Name AzureRM -ListAvailable) {
    Write-Warning -Message 'Az module not installed. Having both the AzureRM and Az modules installed at the same time is not supported.'
    foreach ($module in (Get-Module -ListAvailable AzureRM*).Name |Get-Unique) {
       write-host "Removing Module $module"
       Uninstall-module $module
    }
}

if (-Not (Get-Module -Name Az) -And -Not (Get-Module -Name AzureRm -ListAvailable)) {
    Write-Output 'If the below does not work, try enabling scripting or downloading Az module manually'
    Write-Output 'https://docs.microsoft.com/en-us/powershell/azure/install-az-ps-msi?view=azps-4.1.0'

    Install-Module -Name PowerShellGet -Force
    Install-Module -Name Az -AllowClobber #-Scope CurrentUser
    Uninstall-AzureRM
    Enable-AzureRmAlias -Scope LocalMachine
}

if (-Not (Get-MOdule -Name CosmosDB)) {
    Install-Module -Name CosmosDB -RequiredVersion 4.1.0
}

Import-Module Az
Import-Module CosmosDB

### - The below is only useful if you are programatically creating databases and containers - Arm Templates seems like a better tool - ########################
## There is no need to choose between subscriptions, however above is needed since CosmosDB module depends on Az                                              #
#try {                                                                                                                                                        #
#    $subscriptions = Get-AzSubscription                                                                                                                      #
#} catch {                                                                                                                                                    #
#    Connect-AzAccount                                                                                                                                        #
#    $subscriptions = Get-AzSubscription                                                                                                                      #
#}                                                                                                                                                            #
#                                                                                                                                                             #
#$options = $subscriptions | % { New-Object System.Management.Automation.Host.ChoiceDescription ('&' + $_.Name), ($_.Name + ' (' + $_.SubscriptionId + ')') } #
#                                                                                                                                                             #
#$title = 'Azure Subscription'                                                                                                                                #
#$message = 'Which Azure Subscription?'                                                                                                                       #
#$index = $host.ui.PromptForChoice($title, $message, $options, 0)                                                                                             #
#                                                                                                                                                             #
#$selectedSubscriptionId = $subscriptions[$index] | % { $_.SubscriptionId }                                                                                   #
#$selectedSubscriptionContext = Get-AzSubscription -SubscriptionId $selectedSubscriptionId                                                                    #
#Set-AzContext -SubscriptionId "$selectedSubscriptionId"                                                                                                      #
#$defaultSubscriptionContext = Get-AzSubscription -Default                                                                                                    #
#                                                                                                                                                             #
#$cosmosResourceGroup = Read-Host -Prompt 'Please enter the Cosmos resource group name: '                                                                     #
#                                                                                                                                                             #
#Get-AzCosmosDBAccount -ResourceGroupName $cosmosResourceGroup                                                                                                #
#Get-AzCosmosDBAccountKey -ResourceGroupName $cosmosResourceGroup -Type "ConnectionStrings"                                                                   #
###############################################################################################################################################################

$sourceUsesEmulator = ((Read-Host -Prompt "Use Azure Cosmos Emulator for source? [y/N]") -like "y")
if ($sourceUsesEmulator) {
    $cosmosEmulatorPath = Get-Package | Where-Object{ $_.Name -like "Azure Cosmos DB Emulator" } | % { $_.Source }
    if ([string]::IsNullOrEmpty($cosmosEmulatorPath)) {
        Start "https://aka.ms/cosmosdb-emulator"
    }

    while ([string]::IsNullOrEmpty($cosmosEmulatorPath)) {
        if ((Read-Host -Prompt "Has Azure Cosmos Emulator been downloaded and installed? [Press enter to check or y to skip]") -like "y") {
            break
        }
        $cosmosEmulatorPath = Get-Package | Where-Object{ $_.Name -like "*Cosmos*" } | % { $_.Source }
    }

    Start -FilePath (Join-Path -Path $cosmosEmulatorPath -ChildPath "Microsoft.Azure.Cosmos.Emulator.exe")
    Start "https://localhost:8081/_explorer/index.html"
    Write-Output "CosmosDB should be open with the key showing"
}
if (-not $sourceUsesEmulator) {
    Start "https://portal.azure.com/#blade/HubsExtension/BrowseResource/resourceType/Microsoft.DocumentDb%2FdatabaseAccounts"
    Write-Output "Please navigate to the CosmosDB resource and open Keys under settings and select Read-only keys"
}
$sourceUrl = Read-Host -Prompt "Enter the source Cosmos URL"
$sourceKey = Read-Host -Prompt "Enter the source Cosmos key" -AsSecureString
$sourceDatabaseName = Read-Host -Prompt "Enter the source database name"
$sourceContainerName = Read-Host -Prompt "Enter the source container name"

# Work-around
$sourceUrl = $sourceUrl.Replace("https://", "")
$sourcePort = $sourceUrl.split(':')[1].Replace('/', '')
$sourceUrl = $sourceUrl.Replace("https://", "").split(':')[0]

Write-Output ''

$destinationUsesEmulator = (-not ((Read-Host -Prompt "Use Azure Cosmos Emulator for destination? [Y/n]") -like "y"))
if ($sourceUsesEmulator -ne $destinationUsesEmulator) {
    if ($destinationUsesEmulator) {
        $cosmosEmulatorPath = Get-Package | Where-Object{ $_.Name -like "Azure Cosmos DB Emulator" } | % { $_.Source }
        if ([string]::IsNullOrEmpty($cosmosEmulatorPath)) {
            Start "https://aka.ms/cosmosdb-emulator"
        }

        while ([string]::IsNullOrEmpty($cosmosEmulatorPath)) {
            if ((Read-Host -Prompt "Has Azure Cosmos Emulator been downloaded and installed? [Press enter to check or y to skip]") -like "y") {
                break
            }
            $cosmosEmulatorPath = Get-Package | Where-Object{ $_.Name -like "*Cosmos*" } | % { $_.Source }
        }

        Start -FilePath (Join-Path -Path $cosmosEmulatorPath -ChildPath "Microsoft.Azure.Cosmos.Emulator.exe")
        Start "https://localhost:8081/_explorer/index.html"
        Write-Output "CosmosDB should be open with the key showing"
    }
    if (-not $destinationUsesEmulator) {
        Start "https://portal.azure.com/#blade/HubsExtension/BrowseResource/resourceType/Microsoft.DocumentDb%2FdatabaseAccounts"
        Write-Output "Please navigate to the CosmosDB resource and open Keys under settings and select Read-only keys"
    }
}
if ($sourceUsesEmulator -eq $destinationUsesEmulator) {
    $destinationUrl = Read-Host -Prompt "Enter the destination Cosmos URL [leave blank to use source URL - $sourceUrl]"
    if ([string]::IsNullOrEmpty($destinationUrl)) {
        $destinationUrl = $sourceUrl
    }
    if ($sourceUrl -eq $destinationUrl) {
        $destinationKey = Read-Host -Prompt "Enter the destination Cosmos key [leave blank to use source key - ********]" -AsSecureString
        if ([string]::IsNullOrEmpty($destinationKey)) {
            $destinationKey = $sourceKey
        }
    } else {
        $destinationKey = Read-Host -Prompt "Enter the destination Cosmos key" -AsSecureString
    }
} else {
    $destinationUrl = Read-Host -Prompt "Enter the destination Cosmos URL"
    $destinationKey = Read-Host -Prompt "Enter the destination Cosmos key" -AsSecureString
}
$destinationDatabaseName = Read-Host -Prompt "Enter the destination database name"
$destinationContainerName = Read-Host -Prompt "Enter the destination container name"

# Work-around
$destinationUrl = $destinationUrl.Replace("https://", "")
$destinationPort = $destinationUrl.split(':')[1].Replace('/', '')
$destinationUrl = $destinationUrl.Replace("https://", "").split(':')[0]

Write-Output ''

$overwritePartitionKey = Read-Host -Prompt "Should the partition keys in the destination collection be overwritten? [y/N]"
$overwritePartitionKeyIfMissing = $null
if ($overwritePartitionKey -ne 'y') {
    $overwritePartitionKeyIfMissing = Read-Host -Prompt "Should the partition keys in the destination collection be written when missing? [y/N]"
}
$migrateMissingPartitionKeyAlgorithm = $null
if ($overwritePartitionKey -eq 'y' -Or $overwritePartitionKeyIfMissing -eq 'y') {
    Write-Output 'Valid values include: '
    Write-Output '  o $doc.fieldA                                     - typical path that drills into objects'
    Write-Output ''
    Write-Output '  o $id                                             - id of document (you normally dont want this)'
    Write-Output '  o $sourcePartitionValue                           - represents the source tables partition key - value'
    Write-Output '  o $destinationPartitionValue                      - represents the destination tables partition key - value'
    Write-Output ''
    Write-Output '  o [string]$doc.fieldA + "." + [string]$doc.fieldB - concatinates two objects with a period between them'
    Write-Output '  o $doc.fieldA.fieldArrayB[5].fieldC               - complex path getting value within object'
    Write-Output ''
    Write-Output 'Note: If the syntax fails, `@sourcePartitionKey + @destinationPartitionKey` will be used'
    Write-Output ''
    Write-Output 'Please list the fields (remember the $doc.var prefix) to be concatinated together'
    $migrateMissingPartitionKeyAlgorithm = Read-Host -Prompt "  for populating the partition key [leave blank to use @sourcePartitionKey]"
    if ([string]::IsNullOrEmpty($migrateMissingPartionKeyAlgorithm)) {
        $migrateMissingPartionKeyAlgorithmn = '@sourcePartitionKey'
    }
}

Write-Output ''

$backoffPolicy = New-CosmosDbBackoffPolicy -MaxRetries 20
try {
    $primarySourceKey = ConvertTo-SecureString -String $sourceKey.Trim() -AsPlainText -Force
} catch {
    $primarySourceKey = $sourceKey
}
if ($sourceUsesEmulator -eq $true) {
    $sourceCosmosDbContext = New-CosmosDbContext -Emulator -Database $sourceDatabaseName -URI $sourceUrl -PORT $sourcePort -Key $primarySourceKey -BackoffPolicy $backoffPolicy
} else {
    $sourceCosmosDbContext = New-CosmosDbContext -Database $sourceDatabaseName -URI $sourceUrl -PORT $sourcePort -Key $primarySourceKey -BackoffPolicy $backoffPolicy
}

$sourceCollection = Get-CosmosDbCollection -Context $sourceCosmosDbContext -Id $sourceContainerName
$sourceCollectionId = Get-CosmosDbCollectionResourcePath -Database $sourceDatabaseName -Id $sourceContainerName
$sourcePartitionKey = ($sourceCollection | ConvertTo-Json | ConvertFrom-Json | % { $_.partitionKey.paths[0] }).TrimStart('/')

Write-Output ''

try {
    $primaryDestinationKey = ConvertTo-SecureString -String $destinationKey.Trim() -AsPlainText -Force
} catch {
    $primaryDestinationKey = $destinationKey
}
if ($destinationUsesEmulator -eq $true) {
    $destinationCosmosDbContext = New-CosmosDbContext -Emulator -Database $destinationDatabaseName -URI $destinationUrl -PORT $destinationPort -Key $primaryDestinationKey -BackoffPolicy $backoffPolicy
} else {
    $destinationCosmosDbContext = New-CosmosDbContext -Database $destinationDatabaseName -URI $destinationUrl -PORT $destinationPort -Key $primaryDestinationKey -BackoffPolicy $backoffPolicy
}

$destinationCollection = Get-CosmosDbCollection -Context $destinationCosmosDbContext -Id $destinationContainerName
$destinationCollectionId = Get-CosmosDbCollectionResourcePath -Database $destinationDatabaseName -Id $destinationContainerName
$destinationPartitionKey = ($destinationCollection | ConvertTo-Json | ConvertFrom-Json | % { $_.partitionKey.paths[0] }).TrimStart('/')

Write-Output ''

$testDestinationDocuments = Get-CosmosDbDocument -Context $sourceCosmosDbContext -CollectionId $sourceContainerName -MaxItemCount 3 -ResponseHeader ([ref] $ResponseHeader)
$isDestinationContainerEmpty = (-Not $documents -Or $documents.Length -lt 2)
$overrideWriteDestinationNonEmptyContainer = $false

if (-not $isDestinationContainerEmpty) {
    Write-Output "Destination container is not empty!"
    Write-Output "  Writing to a non-empty container is forbidden..."
    Write-Output ""
    Write-Output "Note:  You may override this by entering: Append_To_NonEmpty_$destinationContainerName"
    Write-Output ""
    Write-Output "If you continue, writing to the destination will not occur."
    $watchLog = Read-Host -Prompt "  Would you like to continue and watch console logging of the source documents? [y/N]"
    if ($watchLog -like "Append_To_NonEmpty_$destinationContainerName") {
        $overrideWriteDestinationNonEmptyContainer = $true
    } elseif ($watchLog -like 'y') {
        exit
    }
}}

$documentsPerRequest = 20
$continuationToken = $null
$documents = $null

do {
    $responseHeader = $null
    $getCosmosDbDocumentParameters = @{
        Context = $sourceCosmosDbContext
        CollectionId = $sourceContainerName
        MaxItemCount = $documentsPerRequest
        ResponseHeader = ([ref] $responseHeader)
    }

    if ($continuationToken) {
        $getCosmosDbDocumentParameters.ContinuationToken = $continuationToken
    }

    $documents = Get-CosmosDbDocument @getCosmosDbDocumentParameters
    $continuationToken = Get-CosmosDbContinuationToken -ResponseHeader $responseHeader

    $documents | Foreach-Object {
        $docProperties = $_ | Get-Member -MemberType NoteProperty | ForEach-Object { $_.Name }
        $doc = $_ | Select-Object -Property $docProperties | ConvertTo-Json | ConvertFrom-Json

        $id = $doc.id
        $sourcePartitionValue = Invoke-Expression "`$doc.$sourcePartitionKey"
        $destinationPartitionValue = Invoke-Expression "`$doc.$destinationPartitionKey"
        $newPartitonValue = $destinationPartitionValue

        if ($overwritePartitionKey -eq 'y' -Or (-Not $partitonValue -And $overwritePartitionKeyIfMissing -eq 'y')) {
            try {
                $newPartitonValue = Invoke-Expression "$migrateMissingPartitionKeyAlgorithm"
            } catch {
                $newPartitonValue = [string]$sourcePartitionValue + [string]$destinationPartitionValue
                Write-Output "'" + $migrateMissingPartitionKeyAlgorithm + "' did not work... Error: " + $_
            }
            if (!$doc.PSobject.Properties.Name.Contains($destinationPartitionKey)) {
                $doc | Add-Member -MemberType NoteProperty -Name $destinationPartitionKey -Value $null
            }
            Invoke-Expression "`$doc.$destinationPartitionKey = `$newPartitonValue"
        }

        if ($isDestinationContainerEmpty) {
            New-CosmosDbDocument -Context $destinationCosmosDbContext -CollectionId $destinationContainerName -DocumentBody ($doc | ConvertTo-Json) -PartitionKey $newPartitonValue
        } elseif ($overrideWriteDestinationNonEmptyContainer) {
            # upcert incase the document already exists and just needs updated
            New-CosmosDbDocument -Context $destinationCosmosDbContext -CollectionId $destinationContainerName -DocumentBody ($doc | ConvertTo-Json) -PartitionKey $newPartitonValue -Upsert
        } else {
            Write-Output 'destination collection is not empty - no write performed'
        }
        Write-Output $doc
    }
} while (-not [System.String]::IsNullOrEmpty($continuationToken))
PlagueHO commented 4 years ago

Awesome stuff. Should probably create a samples folder and add these as .ps1 files. Can then link to them in the README.MD.

tamusjroyce commented 4 years ago

Can do. I will try to put both https://github.com/PlagueHO/CosmosDB/issues/275 and this into the same PR when I get the time. And the example could definitely use some cleaning up. Thank you for the approval!

tamusjroyce commented 4 years ago

Just an FYI that I am still planning on getting this done in a future weekend

    $documents = Get-CosmosDbDocument @getCosmosDbDocumentParameters
    $continuationToken = Get-CosmosDbContinuationToken -ResponseHeader $responseHeader

    $documents | Foreach-Object {
        $docProperties = $_ | Get-Member -MemberType NoteProperty | ForEach-Object { $_.Name }
        $doc = $_ | Select-Object -Property $docProperties | ConvertTo-Json | ConvertFrom-Json

I finally found the missing piece! The key is to only consider NoteProperty type members, which filters methods, aliases, and internal fields. I couldn't figure out why extra fields were being appended to my JSON object. Powershell does not make this intuitive. But finally got it!

v-karbovnichy commented 2 years ago

It is wise to note that ConvertTo-Json should also include Depth parameter, otherwise you can easily lose some nested properties that are deeper that 2 levels (right now in my sample doc I have 5).