API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation
ISSUE
While it may not be best practice, Databricks does allow column names that contain a ".". Resampling with interpolation on a TSDF that has such columns fails with a "Cannot resolve column name" error.
How to reproduce
Create a TSDF with columns that include a "."
Attempt to resample and interpolate the TSDF
An error is produced
AnalysisException: Cannot resolve column name "Bundler.Status.CurMachSpeed" among (site, line, ts, Bundler.Status.CurMachSpeed, Bundler.Status.MachSpeed, agg_key); did you mean to quote the Bundler.Status.CurMachSpeed column?
Workaround
Rename the columns so they no longer contain a "." before resampling and interpolating.
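A minimal sketch of that rename step, in case it helps anyone hitting the same error. The helper names `dotless_names` and `rename_dotted_columns` are hypothetical and are not part of the library. The only DataFrame members used are `columns` and `toDF(*names)`, both of which exist on PySpark DataFrames. Replacing "." with "_" is an assumption; any separator that does not collide with an existing column name should work.

```python
from typing import Dict, List, Tuple


def dotless_names(columns: List[str], sep: str = "_") -> Dict[str, str]:
    """Map each column name to a copy with every "." replaced by `sep`."""
    new_names = [name.replace(".", sep) for name in columns]
    # Refuse to continue if the result would contain duplicate column names.
    if len(set(new_names)) != len(new_names):
        raise ValueError("renaming would produce duplicate column names")
    return dict(zip(columns, new_names))


def rename_dotted_columns(df, sep: str = "_") -> Tuple[object, Dict[str, str]]:
    """Return (renamed_df, mapping) for a Spark-style DataFrame.

    Hypothetical helper: it relies only on `df.columns` and `df.toDF(*names)`.
    toDF renames positionally, so the dotted names never have to be resolved.
    """
    mapping = dotless_names(list(df.columns), sep)
    renamed = df.toDF(*[mapping[name] for name in df.columns])
    return renamed, mapping
```

The renamed DataFrame can then be wrapped in a TSDF and resampled and interpolated as usual. The returned mapping makes it possible to restore the original names afterwards if downstream consumers expect them.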