Open rafagsiqueira opened 1 year ago
A workaround is to download the blob using Bearer Token authentication and a different HTTP client:
public downloadBlob(container: string, blobName: string): Observable<any> {
  return of(this.blobServiceClient.getContainerClient(container)).pipe(
    map((containerClient: ContainerClient) => containerClient.getBlockBlobClient(blobName.toLowerCase())),
    map((blobClient: BlobClient) => blobClient.url),
    concatMap((url: string) => from(this.http.get(url, {responseType: 'blob', headers: {'x-ms-version': '2021-08-06'}}))),
    map((blob) => ({blob, name: blobName, type: blob.type})),
    catchError((err) => {
      console.error(err);
      return of(null);
    })
  );
}
The x-ms-version header has to be present for Bearer Token authentication to work. The bearer token was injected into the HttpClient. This sample is from Angular code.
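For readers not using Angular, here is a minimal sketch of the two headers the workaround relies on. The helper name buildBlobDownloadHeaders is hypothetical, and the Authorization header is an assumption about what the injected token ends up as on the wire; only the x-ms-version value comes from the sample above.

```typescript
// Hypothetical helper, not part of the Azure SDK or Angular.
// Builds the headers for a direct GET against a blob URL using an AD access token.
function buildBlobDownloadHeaders(accessToken: string): Record<string, string> {
  return {
    // Assumption: the token is sent as a standard OAuth 2.0 bearer token.
    Authorization: `Bearer ${accessToken}`,
    // Taken from the sample above; the comment states it must be present
    // for Bearer Token authentication to work.
    "x-ms-version": "2021-08-06",
  };
}
```

The returned object can be passed as the headers option of whatever HTTP client issues the request, so the URL is requested without any extra query parameter.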
@rafagsiqueira if I am correct in understanding you, you are seeing the download URL have a query parameter appended to it named _ that has a timestamp value?
This sounds like the logic here: https://github.com/Azure/azure-sdk-for-js/blob/28b2aa281227b2f1b50f68cb1ac332901c353219/sdk/storage/storage-blob/src/policies/StorageBrowserPolicy.ts#L27
Can you give some more details about how you are generating the download URL? Is everything happening in the browser or are you using the SDK server-side as well?
@xirzec you are correct. The download was originally done by the BlobClient download method (blobClient.download()). I was not generating the download URL myself. After realizing the files were never retrieved from cache, I inspected the requests in my browser's developer console and saw that the URL had that timestamp appended, which was preventing the browser from retrieving from cache, regardless of the Cache-Control header. Everything is happening in the browser; nothing runs server-side.
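To illustrate why the timestamp defeats the cache, here is a small sketch. It is not the SDK's implementation: the function names and the example URL are made up, and only the parameter name _ and its timestamp value come from this thread. Browsers key cached responses on the full URL, so two requests for the same blob differ as soon as the timestamp does.

```typescript
// Illustrative only; function names and the URL below are hypothetical.
const NO_CACHE_PARAM = "_"; // query parameter name discussed in this thread

// Returns a copy of `url` with the given timestamp in the "_" query parameter.
function appendCacheBuster(url: string, nowMs: number): string {
  const parsed = new URL(url);
  parsed.searchParams.set(NO_CACHE_PARAM, String(nowMs));
  return parsed.toString();
}

// Returns a copy of `url` with the "_" query parameter removed.
function stripCacheBuster(url: string): string {
  const parsed = new URL(url);
  parsed.searchParams.delete(NO_CACHE_PARAM);
  return parsed.toString();
}

const blobUrl = "https://example.blob.core.windows.net/container/photo.png";
// Two downloads one millisecond apart produce two distinct URLs.
const first = appendCacheBuster(blobUrl, 1700000000000);
const second = appendCacheBuster(blobUrl, 1700000000001);
```

Here first and second differ only in the _ value, so a cache keyed on the URL treats them as separate resources, while stripCacheBuster maps both back to the same blob URL.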
@rafagsiqueira I think you could manage this by tweaking the pipeline a bit to remove the cache-busting policy, perhaps something like:
const pipeline = newPipeline(credential);
const policyIndex = pipeline.factories.findIndex(factory => factory instanceof StorageBrowserPolicyFactory);
if (policyIndex > -1) {
  pipeline.factories.splice(policyIndex, 1);
}
const blobServiceClient = new BlobServiceClient(
  `https://${account}.blob.core.windows.net`,
  pipeline
);
@EmmaZhu perhaps we should have an option in StoragePipelineOptions to disable the cache busting for browsers?
@xirzec What about this parameter URLConstants.Parameters.FORCE_BROWSER_NO_CACHE? Is the default value true? https://github.com/Azure/azure-sdk-for-js/blob/28b2aa281227b2f1b50f68cb1ac332901c353219/sdk/storage/storage-blob/src/policies/StorageBrowserPolicy.ts#L52
@rafagsiqueira it is just a constant string "_", the name of the query parameter that will have the timestamp value
https://github.com/Azure/azure-sdk-for-js/blob/48c3ad1014d04b6cd17d8d3464046430e1118a92/sdk/storage/storage-blob/src/utils/constants.ts#L22
The naming of the constant indicates the _ querystring parameter was created precisely to force the browser not to cache. I agree with your suggestion that there should be a way to disable this querystring parameter.
@xirzec I echo your suggestion. It seems we'd need to add an option to disable the default behavior.
For a non-public storage account, we are using the Azure Storage SDK to download blobs with an AD access token credential. However, every time a call is made to blobClient.download(), a _= query parameter is appended to the URL of the blob. This prevents the browser from retrieving the blob from its cache.
Is there an option to avoid appending this to the query string? Perhaps sending it as a header?
I have tried setting CacheControl on the blobs, and also retrieving blobs by specific versionId, but that also results in a URL that is not cached by the browser.