Closed sa1sen closed 12 months ago
We don't have support for this in the Python SDK; however, we have other tools that might satisfy your case.
One of these tools offers a way to mass-import files from storage (up to 10,000 files) through a friendly interface.
We also have the LightIngest tool - https://github.com/Azure/Kusto-Lightingest - which can be used to ingest from folders and storage containers. The Data Explorer web app can also generate a LightIngest command line for you (pick "historical data").
For now I will close the issue, as it doesn't relate to the Python SDK specifically.
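Since folder-level ingestion isn't supported by the SDK, the usual client-side workaround is to enumerate the blobs yourself and queue each file individually. Below is a minimal sketch, assuming azure-storage-blob for listing and azure-kusto-ingest for queuing; the function names, the `.parquet` filtering, and all URLs/credentials are illustrative placeholders, not anything from this issue:

```python
# Rough sketch, not an official pattern. Assumes the azure-storage-blob and
# azure-kusto-ingest packages; their calls are shown but not exercised here.

def parquet_blobs_under(blob_names, folder_prefix):
    """Pure helper: keep only the .parquet blobs under one parent folder."""
    prefix = folder_prefix.rstrip("/") + "/"
    return [n for n in blob_names if n.startswith(prefix) and n.endswith(".parquet")]

def ingest_folder(container_url, sas_token, folder_prefix, ingest_client, props):
    """List blobs under folder_prefix and queue each file for ingestion.
    Needs real credentials and a QueuedIngestClient, so it is only a sketch."""
    from azure.kusto.ingest import BlobDescriptor      # azure-kusto-ingest
    from azure.storage.blob import ContainerClient     # azure-storage-blob

    container = ContainerClient.from_container_url(f"{container_url}?{sas_token}")
    sizes = {b.name: b.size for b in container.list_blobs(name_starts_with=folder_prefix)}
    for name in parquet_blobs_under(sizes, folder_prefix):
        blob_url = f"{container_url}/{name}?{sas_token}"
        ingest_client.ingest_from_blob(
            BlobDescriptor(blob_url, size=sizes[name]),
            ingestion_properties=props,
        )
```

LightIngest does essentially this enumeration for you, which is why it is the recommended route for bulk folder ingestion.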
I have been going through the Python SDK and the ingest command. It looks like you can only ingest at the file level.
I have a problem where I have a parent folder in ADLS Gen 2 with n files (no fixed number) that all follow the same schema, e.g.:
```
ADLS Gen2
└── container-curated
    ├── parent1
    │   ├── file1.parquet
    │   ├── file2.parquet
    │   └── file3.parquet
    └── parent2
        ├── file4.parquet
        ├── file5.parquet
        └── file6.parquet
```
At the moment I need to use the Azure Storage SDK to iterate through the folder and trigger an ingest command per file. The most important thing for me is being able to monitor success and failure at the parent-folder level. I do not have any mapping stored between a parent folder and its individual files. Is there any other way to monitor progress at the parent-folder level?
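One way to get folder-level monitoring without a stored mapping is to do the bookkeeping client-side: give each file a source id when you queue it, remember which parent folder it came from, and roll the per-file outcomes up. A sketch of that bookkeeping (the class and method names are hypothetical, not SDK API):

```python
import uuid
from collections import defaultdict

class FolderIngestTracker:
    """Pure bookkeeping sketch: map each ingestion source id to its parent
    folder, then aggregate per-file outcomes into a folder-level status."""

    def __init__(self):
        self._folder_of = {}
        self._pending = defaultdict(set)
        self._failed = defaultdict(list)

    def register(self, folder, blob_name):
        """Call before queuing a file; returns the source id to tag it with."""
        source_id = uuid.uuid4()
        self._folder_of[source_id] = folder
        self._pending[folder].add(source_id)
        return source_id

    def record(self, source_id, succeeded):
        """Call once the per-file outcome is known (e.g. from status queues)."""
        folder = self._folder_of[source_id]
        self._pending[folder].discard(source_id)
        if not succeeded:
            self._failed[folder].append(source_id)

    def status(self, folder):
        if self._pending[folder]:
            return "pending"
        return "failed" if self._failed[folder] else "succeeded"
```

To feed `record`, azure-kusto-ingest lets you pass a `source_id` on each `BlobDescriptor` and (with `report_level=ReportLevel.FailuresAndSuccesses`) poll the ingestion status queues, which echo that id back; those names are from the SDK but worth verifying against the version in use.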
I have also tried replicating the above in ADX itself using the .ingest into command (ingest from storage).
Again, the documentation is mostly at the file level; however, for the 'SourceDataLocator' part of this command, the documentation says you can specify the storage connection string as something like the following (Storage Connection):
https://StorageAccountName.dfs.core.windows.net/Filesystem[/PathToDirectoryOrFile]
But I am unable to execute this .ingest into command pointing at a directory/folder; at the file level it works...
Any suggestions/help?
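Since the directory-level locator is rejected, one workaround (a sketch, not a documented pattern) is to expand the directory into explicit file URLs client-side and emit a single .ingest command listing them all, since the .ingest into syntax accepts a comma-separated list of source locators. The helper below is hypothetical; the table name and URLs are placeholders, and each URL is assumed to carry its own SAS token:

```python
def build_ingest_command(table, blob_urls, data_format="parquet"):
    """Build one .ingest into control command that lists each file explicitly,
    as a workaround for directory-level SourceDataLocators being rejected.
    Uses h'...' (obfuscated string literals) so SAS tokens are not logged."""
    sources = ",\n  ".join(f"h'{url}'" for url in blob_urls)
    return (f".ingest into table {table} (\n  {sources}\n) "
            f"with (format='{data_format}')")
```

The resulting string can be run as a management command, e.g. with azure-kusto-data's `KustoClient.execute_mgmt(database, command)`; for large folders, batching a bounded number of URLs per command would be safer than one giant command.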