Open prrao87 opened 4 days ago
In the COPY pipeline, because we already know the number of columns in the input prior to importing the data, can we implicitly infer the column names so that we can use the much simpler DDL command below?
I think implicitly inferring column names could introduce some confusing behaviours: what if the column names don't exactly match, or there is no header information in the source? In that sense, I think `COPY Product(name, price) FROM 'source'` is much less prone to confusion.
But let's survey how other systems handle such cases before we jump to a conclusion.
API: Other
Description
I have this scenario where I have a CSV/Parquet file with just two columns:
In my DDL, I want to add a nullable column `historical_price` as follows:
I want to initialize the node table with all values in the `historical_sales` column as nulls, and add them via `MERGE` at a later time when they become available.
Issue
Because my input file has just 2 columns, and my DDL specifies 3 columns, I cannot use `COPY Product FROM 'product.parquet'` directly.
I instead have to do this:
Feature request
In the `COPY` pipeline, because we already know the number of columns in the input prior to importing the data, can we implicitly infer the column names so that we can use the much simpler DDL command below? It would reduce the mental burden on the user, as it's expected that the columns that are absent in the input file would have to be filled with nulls.
@ray6080 I think this seems like a reasonable feature, but if you think it's infeasible, feel free to close.
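To make the request more concrete, here is a minimal Python sketch of what name-based inference might look like. It is not how the `COPY` pipeline is actually implemented. The helper name `infer_copy_columns` and its error messages are hypothetical, and it assumes the input exposes column names (a Parquet schema or a CSV header). Table columns missing from the input map to `None`, meaning "fill with null". The ambiguous cases raised in the comment above (no header, or names that don't match) are rejected rather than guessed.

```python
from typing import Optional, Sequence


def infer_copy_columns(
    table_columns: Sequence[str],
    file_columns: Optional[Sequence[str]],
) -> dict[str, Optional[int]]:
    """Map each table column to its index in the input file.

    Hypothetical helper for illustration only. A value of None means the
    column is absent from the input and would be filled with nulls.
    """
    # No header or schema: there is nothing to match on, so refuse to guess.
    if file_columns is None:
        raise ValueError("input has no column names; list the columns explicitly")

    # Input columns that the table does not define are treated as an error.
    unknown = [c for c in file_columns if c not in table_columns]
    if unknown:
        raise ValueError(f"input columns not found in table: {unknown}")

    positions = {name: i for i, name in enumerate(file_columns)}
    return {col: positions.get(col) for col in table_columns}


mapping = infer_copy_columns(
    ["name", "price", "historical_price"],  # columns declared in the DDL
    ["name", "price"],                      # columns found in the input file
)
print(mapping)
```

With the two-column input from the description, `historical_price` maps to `None`, while a headerless input or a mismatched column name raises an error instead of being silently mapped.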