Closed pthatte1-bb closed 3 weeks ago
In-memory DataFrames can be created successfully, but cannot be used in dataframe joins.
Snippet of supported feature:
from pyspark.sql.functions import col

df_customer = get_customer_database(spark_session)
df_temp = spark_session.createDataFrame([(131074, 'Alice'), (131075, 'Bob')], ['join_custkey', 'name'])
# Join on an explicit null-safe condition, then drop the helper key column.
df_temp.join(df_customer, on=col("c_custkey").eqNullSafe(col("join_custkey"))).drop("join_custkey").show()
Snippet of requested feature:
df_customer = get_customer_database(spark_session)
df_temp = spark_session.createDataFrame([(131074, 'Alice'), (131075, 'Bob')], ['c_custkey', 'name'])
df_temp.join(df_customer, on="c_custkey").show()
It turns out the missing feature has nothing to do with virtual tables: joining with a column name passed as a string (on="c_custkey") was simply not implemented.
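For anyone wondering what the string form of on is expected to do, here is a rough pure-Python sketch of the semantics PySpark documents for it: an inner equi-join on a column that exists on both sides, with the key column appearing only once in the result. This is only an illustration, not how this project implements it; join_on_name and the c_nationkey values are made up for the example.

```python
from typing import Any

Row = dict[str, Any]


def join_on_name(left: list[Row], right: list[Row], on: str) -> list[Row]:
    """Inner equi-join of two row lists on a column both sides share.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Index the right side by key so each left row is matched in O(1).
    index: dict[Any, list[Row]] = {}
    for right_row in right:
        index.setdefault(right_row[on], []).append(right_row)

    joined: list[Row] = []
    for left_row in left:
        key = left_row[on]
        if key is None:
            # Plain equality never matches NULL keys (unlike eqNullSafe).
            continue
        for right_row in index.get(key, []):
            # Emit the key column once, then the remaining left and right columns.
            row = {on: key}
            row.update({k: v for k, v in left_row.items() if k != on})
            row.update({k: v for k, v in right_row.items() if k != on})
            joined.append(row)
    return joined


df_temp = [
    {"c_custkey": 131074, "name": "Alice"},
    {"c_custkey": 131075, "name": "Bob"},
]
# Made-up customer rows; the real table comes from get_customer_database().
df_customer = [
    {"c_custkey": 131074, "c_nationkey": 3},
    {"c_custkey": 999, "c_nationkey": 7},
]

result = join_on_name(df_temp, df_customer, on="c_custkey")
print(result)
```

Note that this differs from the supported snippet in one respect: eqNullSafe treats two NULL keys as equal, while a join on a column name uses ordinary equality. The sketch also ignores name clashes between non-key columns.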