Azure / azure-functions-durable-powershell

PowerShell SDK for writing Durable Functions apps
MIT License
7 stars 3 forks source link

Invoke-WebRequest hangs inside of Activity Function #70

Open phill-holbrook opened 9 months ago

phill-holbrook commented 9 months ago

I have a function that uses Invoke-WebRequest in order to pull data from an external API. Here's a sanitzed version of the function:

function Get-APIResource {
    param(
        [Parameter (Mandatory=$true)]$Resource,
        [Parameter (Mandatory=$true)]$SearchQuery
    )

    $BaseURL = ""

    $headers = @{
        ApiCode = $env:APISecret
        UserName = $env:PIUser
        Secret = $env:APIKey
        'Content-Type' = 'application/json'
    }

    $ResourceURL = $BaseURL + $Resource + '/query?search=' + $SearchQuery

    $i = 1

    try {

        do {

            Write-Warning "Getting page $i of $Resource"
            $Response = Invoke-WebRequest -Method GET -Uri $ResourceURL -Headers $headers -TimeoutSec 30 -ErrorAction Stop
            $Response = $Response.Content | ConvertFrom-Json

            If ($Response.items){

                Write-Warning "($i) Retrieved $($Response.items.count) items"

                $Response.items | ForEach-Object{
                    $_
                    #$ReturnList.Add($_)
                }
            }

            If ($Response.item){

                Write-Warning "($i) Retrieved $($Response.item.count) ITEM"

                $Response.item | ForEach-Object{
                    $_
                    #$ReturnList.Add($_)
                }
            }

            $ResourceURL = $Response.pagedetails.nextPageUrl

            If ($null -eq $ResourceURL){
                Write-Warning "($i) ResourceURL is null"
            } Else {
                Write-Warning "($i) Next page URL: $ResourceURL"
            }

            $i++

        } while ($null -ne $ResourceURL)

    }
    catch {
        Write-Error "Connecting to API Failed."
        throw "API Error: $($_.Exception.Message)"    
    }

}

I'm using it in an Activity function, like so:

param($Context)

# Set DataType
$DataType = 'Products'

# Initialize default return object
$returnObject = [PSCustomObject]@{
    $DataType = "Problem accessing data."
}

# Get data
try{
    $filter = @{} | ConvertTo-JSON
    $Data = Get-APIResource -Resource Products -SearchQuery $filter
}
catch{
    Write-Warning "Error retrieving $DataType."
    Write-Error "ERROR: $($Error[0].Exception.Message)"
    If ($Error[0].InnerException){
        Write-Error "INNER ERROR: $($Error[0].Exception.InnerException)"
    }
    throw $returnObject
}

If (-not $Data){
    Write-Warning "`$Data variable is blank. Failed to get $DataType."
    return $returnObject
}

# Return
[PSCustomObject]@{
    $DataType = $Data
} | ConvertTo-Json -Depth 100

And this is being called by an Orchestrator function:

param($Context)

Write-Output "Starting Orchestrator."

$RetryOptions = New-DurableRetryOptions -FirstRetryInterval (New-TimeSpan -Seconds 5) -MaxNumberOfAttempts 3

Write-Output "Retrieving Config Items from Autotask"
$APIData = Invoke-DurableActivity -FunctionName "Func-RetrieveAPIData" -RetryOptions $RetryOptions

# Build object (add metadata) to send to data lake
$SendObject = @{
    APIData = $APIData
    sourceName = "API"
}

# Convert to JSON to prep for sending
$SendJson = $SendObject | ConvertTo-Json -Depth 100

# Send to the data lake function
Write-Output "Sending Data to Data Lake function"
Invoke-DurableActivity -FunctionName "Func-SendtoDataLake" -Input $SendJson

When I look at Log Stream in the Azure Function, I can see the progress of Invoke-WebRequest. When it runs normally, it looks like this:

PROGRESS: Reading response stream... (Number of bytes read: 0)
PROGRESS: Reading response stream... (Number of bytes read: 3276)
PROGRESS: Reading response stream... (Number of bytes read: 13276)
PROGRESS: Reading response stream... (Number of bytes read: 23276)
PROGRESS: Reading response stream... (Number of bytes read: 27864)
...
...
PROGRESS: Reading response stream... (Number of bytes read: 256662)
PROGRESS: Reading web response completed. (Number of bytes read: 256662)

However, when it has a problem, it will halt after a random number of bytes read. Verbose logs will continue to output from normal operation of the Azure Function. The Activity Function times out after 10 minutes, at which point the Orchestrator finishes.

Normally, the API request takes under 30 seconds to complete -- usually it completes in under 10 seconds. But since the web stream from Invoke-WebRequest is hanging the Activity Function up, I cannot implement Durable Retry or Durable Wait in the Orchestrator function. I've tried wrapping each Invoke-WebRequest in Start-ThreadJob within Get-APIResource and using Wait-Job with a 30 second timeout. I'm able to properly throw an error from the API function to the Activity function, and the Orchestrator function catches it, but the Activity Function still sits there and waits until 10 minutes to time out because the thread job is locked up thanks to the Invoke-WebRequest stream.

I cannot reproduce this problem locally (via manual execution of Get-APIResource or via Azure Function debugging in VS Code) or on an Azure VM. I can fairly reliably reproduce this problem within Azure Automation Accounts / Runbooks and Azure Functions.

I've been chasing this error down since at least September 2023. I'm happy to share any other required information to help track this down.

phill-holbrook commented 8 months ago

I think I've found a solution for this by removing Invoke-WebRequest and using System.Net.Http.HttpClient directly.

function Get-APIResource {
    param(
        [Parameter (Mandatory=$true)]$Resource,
        [Parameter (Mandatory=$true)]$SearchQuery
    )

    $BaseURL = ""

    $httpClient = New-Object System.Net.Http.HttpClient
    $httpClient.DefaultRequestHeaders.Add("ApiCode", $env:ApiCode)
    $httpClient.DefaultRequestHeaders.Add("UserName", $env:APIUser)
    $httpClient.DefaultRequestHeaders.Add("APIKey", $env:APIKey)

    $ResourceURL = $BaseURL + $Resource + '/query?search=' + $SearchQuery

    $i = 1

    try {

        do {

            Write-Warning "Getting page $i of $Resource"
            $responseStream = $httpClient.GetStreamAsync($ResourceURL).Result
            $streamReader = [System.IO.StreamReader]::new($responseStream)
            $responseContent = $streamReader.ReadToEnd()
            $Response = $ResponseContent | ConvertFrom-Json

            If ($Response.items){

                Write-Warning "($i) Retrieved $($Response.items.count) items"

                $Response.items | ForEach-Object{
                    $_
                    #$ReturnList.Add($_)
                }
            }

            If ($Response.item){

                Write-Warning "($i) Retrieved $($Response.item.count) ITEM"

                $Response.item | ForEach-Object{
                    $_
                    #$ReturnList.Add($_)
                }
            }

            $ResourceURL = $Response.pagedetails.nextPageUrl

            If ($null -eq $ResourceURL){
                Write-Warning "($i) ResourceURL is null"
            } Else {
                Write-Warning "($i) Next page URL: $ResourceURL"
            }

            $i++

        } while ($null -ne $ResourceURL)

    }
    catch {
        Write-Error "Connecting to API Failed."
        throw "API Error: $($_.Exception.Message)"    
    }

}

I have no idea why I'm unable to reproduce the error calling the class directly. My guess is it's some perfect storm between the language runtime, Invoke-WebRequest, and the remote API in question. Hope this helps someone else.