Nike-Inc / brickflow

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
https://engineering.nike.com/brickflow/
Apache License 2.0

[FEATURE] Box Operator to load data from Box to Volumes and Volumes to Box #137

Closed by madhusudan3 1 month ago

madhusudan3 commented 3 months ago

Is your feature request related to a problem? Please describe.

The Box Operator needs to provide functionality for downloading files from a Box folder to a Unity Catalog volume and uploading files from a Unity Catalog volume to a Box folder.

Cloud Information

Describe the solution you'd like

The solution should include an operator that can perform the following actions:

1. Download files from a specified Box folder and store them in a designated Unity Catalog volume.
2. Upload files from a specified Unity Catalog volume to a designated Box folder.
3. Support authentication and authorization mechanisms to ensure secure data transfer.
4. Provide logging and error handling to monitor the transfer process and handle failures gracefully.
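To make the requested behavior concrete, here is a minimal sketch of the two transfer directions with per-file logging and error handling. It is not the operator that was eventually merged: `BoxFolderClient`, `InMemoryBox`, `box_to_volume`, and `volume_to_box` are hypothetical names invented for illustration, and the sketch assumes the Unity Catalog volume is reachable as an ordinary filesystem path. Authentication is deliberately left to whatever real client implements the interface.

```python
import logging
from pathlib import Path
from typing import Dict, List, Protocol

logger = logging.getLogger("box_volume_transfer")


class BoxFolderClient(Protocol):
    """Hypothetical minimal interface; a real one would wrap an authenticated Box SDK client."""

    def list_files(self, folder_id: str) -> List[str]: ...
    def read_file(self, folder_id: str, name: str) -> bytes: ...
    def write_file(self, folder_id: str, name: str, data: bytes) -> None: ...


class InMemoryBox:
    """Stand-in for Box used only to exercise the sketch without network access."""

    def __init__(self, folders: Dict[str, Dict[str, bytes]]):
        self.folders = folders

    def list_files(self, folder_id: str) -> List[str]:
        return sorted(self.folders[folder_id])

    def read_file(self, folder_id: str, name: str) -> bytes:
        return self.folders[folder_id][name]

    def write_file(self, folder_id: str, name: str, data: bytes) -> None:
        self.folders.setdefault(folder_id, {})[name] = data


def box_to_volume(client: BoxFolderClient, folder_id: str, volume_path: str) -> Dict[str, List[str]]:
    """Copy every file in a Box folder into a directory under the volume path."""
    target = Path(volume_path)
    target.mkdir(parents=True, exist_ok=True)
    result: Dict[str, List[str]] = {"ok": [], "failed": []}
    for name in client.list_files(folder_id):
        try:
            (target / name).write_bytes(client.read_file(folder_id, name))
            result["ok"].append(name)
            logger.info("downloaded %s from Box folder %s", name, folder_id)
        except Exception:
            # One bad file should not abort the rest of the transfer.
            logger.exception("failed to download %s", name)
            result["failed"].append(name)
    return result


def volume_to_box(client: BoxFolderClient, volume_path: str, folder_id: str) -> Dict[str, List[str]]:
    """Upload every regular file directly under the volume path to a Box folder."""
    result: Dict[str, List[str]] = {"ok": [], "failed": []}
    for path in sorted(Path(volume_path).iterdir()):
        if not path.is_file():
            continue  # subdirectories are skipped in this sketch
        try:
            client.write_file(folder_id, path.name, path.read_bytes())
            result["ok"].append(path.name)
            logger.info("uploaded %s to Box folder %s", path.name, folder_id)
        except Exception:
            logger.exception("failed to upload %s", path.name)
            result["failed"].append(path.name)
    return result
```

On Databricks, `volume_path` would typically look like `/Volumes/<catalog>/<schema>/<volume>/<subdir>`. Returning the `ok` and `failed` lists lets the calling task decide whether a partial transfer should fail the workflow run.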

Describe alternatives you've considered

1. Manual download and upload of files between Box and Unity Catalog, which is time-consuming and error-prone.
2. Using third-party integration tools, which may not offer the required flexibility and control over the data transfer process.

Additional context

This feature will streamline the data transfer process between Box and Unity Catalog, reducing manual effort and minimizing the risk of errors. It will be particularly useful for teams that frequently need to move data between these platforms for analysis and storage purposes.

pariksheet commented 3 months ago

The primary goal of brickflow is orchestration. Box integration, like any other such integration, is part of koheesio.

e.g. https://github.com/Nike-Inc/koheesio/blob/main/src/koheesio/integrations/box.py

madhusudan3 commented 2 months ago

Hi @pariksheet, this is not a Box integration. We are trying to load files or folders from Box to Databricks Unity Catalog volumes and vice versa. I am almost through testing and will submit the PR early next week.

madhusudan3 commented 1 month ago

Merged the PR; closing this issue. https://github.com/Nike-Inc/brickflow/pull/144