NVIDIA / spark-rapids-examples

A repo for all Spark examples using the RAPIDS Accelerator, including ETL, ML/DL, etc.
Apache License 2.0

Databricks notebooks should translate `dbfs:/` paths to `/dbfs/` formats automatically #404

Closed (parthosa closed this issue 4 months ago)

parthosa commented 4 months ago

Since the Databricks notebooks only support event logs in the file API format (i.e. `/dbfs/`), we should automatically convert `dbfs:/` paths to `/dbfs/` for a better user experience.

tgravescs commented 4 months ago

Which Databricks notebook is this, and what is the context here?

parthosa commented 4 months ago

We have Databricks notebooks in this repo for running the qualification/profiling tools - Link

Although the notebook mentions the required format, some users might pass the path as `dbfs:/<path-to-eventlogs>` instead of `/dbfs/<path-to-eventlogs>` (for example, after copying the path from somewhere else).

Hence this feature was requested, so that users can run the notebook even with `dbfs:/<path-to-eventlogs>`.
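
For illustration, the requested conversion could look roughly like the sketch below. This is only a minimal example under stated assumptions, not the code used in the notebooks: the function name `to_file_api_path` is hypothetical, and it assumes that a `dbfs:/` prefix maps directly onto the `/dbfs/` mount point while any other path is left untouched.

```python
def to_file_api_path(path: str) -> str:
    """Convert a dbfs:/ URI to the /dbfs/ file API form.

    Hypothetical helper for illustration; paths that do not start
    with dbfs:/ are returned unchanged.
    """
    scheme = "dbfs:/"
    if path.startswith(scheme):
        # Drop the scheme, then any extra leading slashes (e.g. dbfs:///logs),
        # so the result always has exactly one slash after /dbfs.
        return "/dbfs/" + path[len(scheme):].lstrip("/")
    return path


# Example usage with made-up event log locations
print(to_file_api_path("dbfs:/logs/eventlog"))   # /dbfs/logs/eventlog
print(to_file_api_path("/dbfs/logs/eventlog"))   # /dbfs/logs/eventlog
```

Applying the conversion only when the `dbfs:/` prefix is present keeps the behavior unchanged for users who already pass `/dbfs/` paths.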