I attempted to reproduce this, without success. I created a similar folder structure and used the following pipeline (a modified version of the example pipeline):
resource_types:
- name: azure-blobstore
  type: docker-image
  source:
    repository: pcfabr/azure-blobstore-resource

resources:
- name: azure-blobstore-resource
  type: git
  source:
    uri: https://github.com/pivotal-cf/azure-blobstore-resource.git
    branch: master

- name: configuration
  type: azure-blobstore
  source:
    storage_account_name: ((storage_account_name))
    storage_account_key: ((storage_account_key))
    container: ((container))
    regexp: file_in_root_a-(.*).txt

jobs:
- name: print-config
  plan:
  - in_parallel:
    - get: azure-blobstore-resource
    - get: configuration
  - task: print-config
    file: azure-blobstore-resource/example/tasks/print-config/task.yml
    params:
      CONFIGURATION_FILENAME: file_in_root_a-*

- name: write-config
  plan:
  - in_parallel:
    - get: azure-blobstore-resource
  - task: write-config
    file: azure-blobstore-resource/example/tasks/write-config/task.yml
    params:
      CONFIGURATION_FILENAME: my-file
      APPEND_TIMESTAMP_ON_FILENAME: 1
  - put: configuration
    params:
      CONFIGURATION_FILENAME: file_in_root_a-*
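For reference, the capture group in the regexp above is what the resource treats as a blob's version. A quick way to sanity-check which names match (a standalone sketch with hypothetical blob names, not taken from this thread):

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Same pattern as the pipeline's regexp: setting; the capture
	// group becomes the version.
	re := regexp.MustCompile(`file_in_root_a-(.*).txt`)

	// Hypothetical blob names, for illustration only.
	names := []string{
		"file_in_root_a-1.0.0.txt",
		"file_in_root_a-1.1.0.txt",
		"some_folder/nested_file.txt",
	}

	for _, name := range names {
		if m := re.FindStringSubmatch(name); m != nil {
			fmt.Printf("%s -> version %s\n", name, m[1])
		} else {
			fmt.Printf("%s -> no match\n", name)
		}
	}
}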
The check ran successfully and so did the job. Let me know if my setup looks off.
Is this failing consistently for you, or is it intermittent? Did it work at some point, or has it never worked? I'm trying to narrow down the possibilities.
Indeed, it seems the subfolder assumption was a red herring. The issue occurred again today after we cleaned up the structure to be flat. My current assumption is that it has to do with overwriting files, based on the following observation:

For now I have paused the download pipeline, since I have to be unblocked in order to unblock the stories that were blocked by it. Can you check whether you can reproduce the above behavior in your lab? Thank you!
I changed my pipeline to overwrite the file, e.g.:
- name: write-config
  plan:
  - in_parallel:
    - get: azure-blobstore-resource
  - task: write-config
    file: azure-blobstore-resource/example/tasks/write-config/task.yml
    params:
      CONFIGURATION_FILENAME: file_in_root_a-1.2.3.txt
  - put: configuration
    params:
      file: configuration/file_in_root_a-1.2.3.txt
Still haven't been able to reproduce the issue.
A couple questions to help potentially narrow this down. In the meantime, here is a standalone program that runs the same blob-listing loop as the resource, which you can point at the affected container:
package main

import (
	"encoding/json"
	"fmt"
	"os"

	"github.com/Azure/azure-sdk-for-go/storage"
	"github.com/pivotal-cf/azure-blobstore-resource/azure"
)

func main() {
	storageAccountName := os.Args[1]
	storageAccountKey := os.Args[2]
	container := os.Args[3]

	azureClient := azure.NewClient(
		storage.DefaultBaseURL,
		storageAccountName,
		storageAccountKey,
		container,
	)

	blobs := []storage.Blob{}
	marker := ""
	for {
		blobListResponse, err := azureClient.ListBlobs(storage.ListBlobsParameters{
			Include: &storage.IncludeBlobDataset{
				Snapshots: true,
				Copy:      true,
			},
			Marker: marker,
		})
		if err != nil {
			panic(err)
		}

		blobs = append(blobs, blobListResponse.Blobs...)

		marker = blobListResponse.NextMarker
		// Same exit condition the resource's check uses.
		if marker == "" || len(blobListResponse.Blobs) == 0 {
			break
		}
	}

	blobsJson, err := json.Marshal(blobs)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(blobsJson))
}
To run:
go run main.go <STORAGE_ACCOUNT_NAME> <STORAGE_ACCOUNT_KEY> <CONTAINER>
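On a healthy container this prints a JSON array with one entry per blob. If it prints an empty array ([]) for a container that definitely holds files, the listing loop itself returned nothing, which would separate a listing/API problem from a regexp-matching problem.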
It seems this is indeed not going to be easy to reproduce. Last night the download pipeline overwrote the files and no problems occurred. Regarding the code snippet above: in my initial troubleshooting session I recompiled the original Go resource with debug messages to trace the return value of the call, hijacked the resource, and noticed that when the regexp check failed, the returned JSON array was empty; that is why I initially wrote that this looks to me like an issue in the API call and/or the Go library. At the moment I cannot reproduce the issue, but I will update this issue if/when it occurs again.
We tracked down the issue with the customer today. It looks like Azure responds with an empty result set while still setting a next marker.

We edited the check.go script to print the variable contents. The "uncheckable" container shows the following output on the first iteration of the check loop:
blobListResponse: {{ EnumerationResults} 2!124!MDAwMDQ4IVtlbGFzdGljLXJ1bnRpbWUsMi44LjNdY2YtMi44LjMtYnVpbGQuMjAucGl2b3RhbCEwMDAwMjghMTYwMS0wMS0wMVQwMDowMDowMC4wMDAwMDAwWiE- 0 [] [] }
error: <nil>
nextMarker: 2!124!MDAwMDQ4IVtlbGFzdGljLXJ1bnRpbWUsMi44LjNdY2YtMi44LjMtYnVpbGQuMjAucGl2b3RhbCEwMDAwMjghMTYwMS0wMS0wMVQwMDowMDowMC4wMDAwMDAwWiE-
2020/02/14 13:16:27 failed to get latest version from regexp: no matching blob found for regexp: platform-automation-image-(.*).zip
The "checkable" container returns a populated blob list response on the first loop iteration. With the "uncheckable" container, the first response carries an empty Blobs slice even though NextMarker is set, so VersionsSinceRegexp exits before the marker is ever followed, on:

if marker == "" || len(blobListResponse.Blobs) == 0 {
	break
}
To overcome the issue we rewrote the for loop in the function as follows:
func (c Check) VersionsSinceRegexp(expr, currentVersion string) ([]Version, error) {
	blobs := []storage.Blob{}
	marker := ""
	firstRun := true
	for {
		blobListResponse, err := c.azureClient.ListBlobs(storage.ListBlobsParameters{
			Include: &storage.IncludeBlobDataset{
				Snapshots: true,
				Copy:      true,
			},
			Marker: marker,
		})
		if err != nil {
			return []Version{}, err
		}

		blobs = append(blobs, blobListResponse.Blobs...)

		marker = blobListResponse.NextMarker
		// Keep going when the first page is empty but a continuation marker
		// is set; bail out on an empty page in any later iteration.
		if marker == "" ||
			(len(blobListResponse.Blobs) == 0 && marker == "" && firstRun) ||
			(len(blobListResponse.Blobs) == 0 && marker != "" && !firstRun) {
			break
		}
		firstRun = false
	}
	...
We verified this with both a working and a non-working container, and also set MaxResults to 5 to confirm the logic holds up across multiple NextMarkers. It would be nice to integrate this.
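For what it's worth, since the marker == "" clause already covers the second term, the compound condition above reduces to an equivalent, simpler form (a sketch against the same loop variables; the released fix may be shaped differently):

// Equivalent: stop when Azure stops returning a continuation marker,
// or when an empty page arrives on any iteration after the first.
if marker == "" || (len(blobListResponse.Blobs) == 0 && !firstRun) {
	break
}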
We did not manage to find a simple way to reproduce this problem, but it shows up intermittently when we run our download pipeline against an Azure container.
Good find, that is interesting. Thanks for taking the time to track this issue down. I'll work on a fix.
Fixed in https://github.com/pivotal-cf/azure-blobstore-resource/releases/tag/v0.10.0. Feel free to reopen if the issue persists.
The way to reproduce: create an Azure container whose structure contains a folder with more than one file inside it, in addition to files located in the root of the container. Ex:
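(hypothetical file and folder names, chosen to match the description above)

file_in_root_a-1.0.0.txt
file_in_root_b-1.0.0.txt
some_folder/
    nested_file_1.txt
    nested_file_2.txt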