Background
The _spark_metadata folder contains absolute paths to parquet files. Therefore, when the folder is moved (e.g. to S3), these paths become invalid.
A log file can look like this
Proposed solution
A tool should be written (possibly independent of Hyperdrive) that fixes the paths in the _spark_metadata folder based on the folder's current location. I.e. in the above example, if the folder sits at s3://some_bucket/folderA/_spark_metadata, then the log file should be changed to
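
A minimal sketch of the rewriting step, in Python, might look like the following. It rests on assumptions that this issue does not confirm: that each log file is a version header line followed by one JSON object per line with a "path" field, that the data files sit in the parent of the _spark_metadata folder, and that the old base location is known and passed in explicitly. The function name rewrite_log and its parameters are hypothetical.

```python
import json


def rewrite_log(text: str, old_base: str, metadata_dir: str) -> str:
    """Return the log text with every "path" re-rooted under the new location.

    Hypothetical helper: old_base is the previous data folder, metadata_dir
    is the current location of the _spark_metadata folder.
    """
    # Assumption: the data files live in the parent of _spark_metadata.
    new_base = metadata_dir.rstrip("/").rsplit("/", 1)[0]
    old_base = old_base.rstrip("/")
    out = []
    for line in text.splitlines():
        # Pass through anything that is not a JSON object (e.g. a version header).
        if not line.startswith("{"):
            out.append(line)
            continue
        entry = json.loads(line)
        path = entry.get("path", "")
        # Only touch paths under the old base; keep any sub-folders below it.
        if path.startswith(old_base + "/"):
            entry["path"] = new_base + path[len(old_base):]
        out.append(json.dumps(entry, separators=(",", ":")))
    return "\n".join(out)
```

Replacing the prefix rather than keeping only the file name preserves partition sub-folders below the base. Inferring the old base from the log entries themselves, instead of passing it in, would be a possible refinement.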