Bertverbeek4PS / bc2adls

Exporting data from Dynamics 365 Business Central to Azure data lake storage or MS Fabric lakehouse
MIT License

Use of Directory Names (Hierarchical Namespace) for Azure Data Lake #163

Closed. benmarriott-TVT closed this issue 1 month ago.

benmarriott-TVT commented 1 month ago

Is it possible to use Directory Names (hierarchical namespace) in addition to Containers, or is there a reason why this hasn't been implemented? https://learn.microsoft.com/en-us/rest/api/storageservices/Naming-and-Referencing-Containers--Blobs--and-Metadata#directory-names

So e.g. Container = BC2ADLS, Directory Name = Raw, Result = BC2ADLS/Raw
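
To illustrate what I mean, here is a rough sketch using the azure-storage-file-datalake Python SDK. The account URL, credential and file name are just placeholders, not anything bc2adls currently does:

```python
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder account URL and credential.
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key-or-token-credential>",
)

container = service.get_file_system_client("bc2adls")  # Container = BC2ADLS
raw_dir = container.create_directory("Raw")            # Directory = Raw
                                                       # Result    = BC2ADLS/Raw

# Files written under the directory end up at BC2ADLS/Raw/<name>.
example = raw_dir.create_file("example.parquet")
example.upload_data(b"...", overwrite=True)
```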

Thanks

Ben

Bertverbeek4PS commented 1 month ago

Hi @benmarriott-TVT Why do you want hierarchical namespaces? The .parquet and .csv files are already organized in folders. And if you import the data through the CDM structure there is no need for that. Consuming the processed files through the CDM structure was the main purpose.

benmarriott-TVT commented 1 month ago

Hi @Bertverbeek4PS A client (part of a larger group) has their ADLS set up with a 'raw/company' structure (e.g. raw/companyA, raw/companyB, etc.). I'm not very familiar with ADLS but they've posed the question. Thanks

Bertverbeek4PS commented 1 month ago

Well, with bc2adls the company is inside the table export, so you can combine or split the data that way. Also, some tables are multi-company, so you would have to export those for each company. So I'm not sure why they want it, but it is by design.

benmarriott-TVT commented 1 month ago

Thanks. It's actually that their data lake isn't just BC company data. So raw/companyA could be from BC, but raw/companyB could be from SAP or another system. They have a 'raw' lake split by source companies/systems.

Bertverbeek4PS commented 1 month ago

Ah ok. Well, what you can do is export it as-is and then move it to the right folders in the pipelines, because you need to combine the deltas from BC and the data lake anyway. At the end you can move the result to another folder in the data lake with a modification of the Synapse pipelines.
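
As a reference, that final move step could also be done outside Synapse with the azure-storage-file-datalake Python SDK on a storage account that has hierarchical namespace enabled. The account, container and folder paths ('deltas/companyA', 'raw/companyA') below are placeholders for whatever layout your lake actually uses, not something bc2adls creates for you:

```python
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder names: replace the account URL, credential, container and
# paths with the values from your own data lake layout.
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key-or-token-credential>",
)
filesystem = service.get_file_system_client("bc2adls")  # the container

# Move the processed output for one BC company into the group's
# 'raw/<source>' layout. rename_directory expects the new path to be
# prefixed with the filesystem (container) name.
source_dir = filesystem.get_directory_client("deltas/companyA")
source_dir.rename_directory("bc2adls/raw/companyA")
```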