exasol / azure-data-lake-storage-gen2-document-files-virtual-schema

Virtual Schema for document files on Azure Data Lake Storage Gen2
MIT License

Does the existing azure blob storage schema work on azure-data-lake-storage-gen2 and is it adequate? #1

Closed pj-spoelders closed 2 years ago

pj-spoelders commented 2 years ago

Try running the Azure Blob Storage virtual schema (and its tests) on a storage account with Azure Data Lake Storage Gen2 enabled. Also see https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-known-issues#blob-storage-apis, especially the first point.

pj-spoelders commented 2 years ago

The only thing that had to be changed for the existing tests to work was the container clean() function. The next thing to test is whether I can access a file uploaded via the Azure Data Lake Storage Gen2 API; the documentation is unclear on that. It is mentioned here that it should be possible: https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-known-issues#blob-storage-apis I think the first point hints at gradual block/part uploads. Nonetheless, I've got to try it out.

pj-spoelders commented 2 years ago

It appears it's not adequate. Further tests showed that we run into issues with what we're trying to do (asynchronously accessing parts of files). There are 'visibility' issues when using the Blob Storage API, whether we upload a file manually or via the Data Lake API. Downloading the whole file seems to work, but in this case that's not enough.
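
For readers unfamiliar with the access pattern mentioned above (asynchronously accessing parts of files), here is a minimal sketch of the idea. It is not the virtual schema's actual code: the names `split_into_ranges`, `read_in_parts` and `make_in_memory_fetcher` are hypothetical, and an in-memory stand-in replaces the storage client so that the example runs without an Azure account.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

# Hypothetical signature for a ranged read: fetch(offset, length) -> bytes.
RangeFetcher = Callable[[int, int], Awaitable[bytes]]


def split_into_ranges(file_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that together cover file_size bytes."""
    if file_size < 0 or part_size <= 0:
        raise ValueError("file_size must be >= 0 and part_size must be > 0")
    return [
        (offset, min(part_size, file_size - offset))
        for offset in range(0, file_size, part_size)
    ]


async def read_in_parts(fetch: RangeFetcher, file_size: int, part_size: int) -> bytes:
    """Request all parts concurrently and reassemble them in order."""
    ranges = split_into_ranges(file_size, part_size)
    # asyncio.gather returns results in the order the awaitables were passed.
    parts = await asyncio.gather(*(fetch(offset, length) for offset, length in ranges))
    return b"".join(parts)


def make_in_memory_fetcher(data: bytes) -> RangeFetcher:
    """Assumed stand-in for a storage client; serves byte ranges from memory."""
    async def fetch(offset: int, length: int) -> bytes:
        await asyncio.sleep(0)  # yield control, as a network call would
        return data[offset:offset + length]
    return fetch


if __name__ == "__main__":
    payload = bytes(range(256)) * 10
    fetcher = make_in_memory_fetcher(payload)
    result = asyncio.run(read_in_parts(fetcher, len(payload), 1000))
    # Compare the reassembled parts with the whole file.
    print(result == payload)
```

Against a real storage account, the in-memory fetcher would be replaced by a ranged-read call, and the final comparison against a whole-file download is the kind of check that would have to pass for the partial-access approach to be usable.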