Hi,
I found the following two ways to list files in Azure Data Lake Storage Gen2 (an Azure Storage account with hierarchical namespace enabled):
SELECT file FROM glob("abfss://<container>@<storage_Account>.dfs.core.windows.net/*/*/*/_delta_log/_last_checkpoint");
SELECT size, filename, last_modified FROM read_blob('abfss://<container>@<storage_Account>.dfs.core.windows.net/*/*/*/_delta_log/_last_checkpoint');
I was wondering if there is a way to list folders.
What I'm trying to achieve:
I'm trying to find all Delta Lake tables in my ADLS Gen2 account. A folder named _delta_log indicates that its parent folder is a Delta Lake table.
My workaround (see above) is to look for the _last_checkpoint files, but these are not always present. Looking for _delta_log/*.json works reliably, but it is very slow when there are many Delta Lake tables with many versions.
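To make the goal concrete: whichever marker file the glob ends up matching, the matched paths still have to be collapsed to one entry per table, i.e. the parent of each _delta_log folder. Here is a minimal sketch of that step in plain Python, assuming the file paths have already been fetched into a list. The helper name `delta_table_roots` and the sample paths (container, account, and folder names) are made up for illustration.

```python
from typing import Iterable, List

DELTA_LOG = "_delta_log"


def delta_table_roots(paths: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated parent folders of every _delta_log folder.

    Paths that contain no _delta_log segment are ignored.
    """
    roots = set()
    for path in paths:
        segments = path.rstrip("/").split("/")
        # Walk backwards so the innermost _delta_log segment wins.
        for i in range(len(segments) - 1, 0, -1):
            if segments[i] == DELTA_LOG:
                roots.add("/".join(segments[:i]))
                break
    return sorted(roots)


if __name__ == "__main__":
    # Hypothetical paths, shaped like the results of the glob queries above.
    base = "abfss://mycontainer@myaccount.dfs.core.windows.net"
    sample = [
        f"{base}/sales/eu/orders/_delta_log/_last_checkpoint",
        f"{base}/sales/eu/orders/_delta_log/00000000000000000000.json",
        f"{base}/sales/us/orders/_delta_log/00000000000000000001.json",
        f"{base}/sales/us/readme.txt",
    ]
    for root in delta_table_roots(sample):
        print(root)
```

This only removes duplicates after the listing. It does not make the listing itself faster, which is the part I am actually asking about.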