Closed OnlyImmutable closed 4 months ago
To be clear, clickhouse-connect doesn't do any string parsing of dates during insert. It looks like you are trying to insert a row that contains a string in a date column, and that's what is causing the TypeError.
From your code above, that seems like it's happening in this block of code:
    try:
        # Try to parse the string as a date
        parsed_date = parser.parse(value).date()
        escaped_row.append(parsed_date.strftime("%Y-%m-%d"))
    except ValueError:
        # If parsing fails, you can handle it as needed.
        # Here, we are appending the original string.
        escaped_row.append(str(value))
As the comment says, "you can handle . . . as needed" the case where parsing the string value into a date fails. What's needed in this case is adding either a valid date value or None, not just adding the bad string anyway.
The simplest fix is to change the code above to escaped_row.append(None) instead of escaped_row.append(str(value)), but you may want to log the bad string that's coming in.
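A minimal sketch of that fix, assuming the incoming strings are ISO-formatted (YYYY-MM-DD). The helper name to_date_or_none is hypothetical, and the standard library's datetime.date.fromisoformat stands in here for the dateutil parser used in the code above:

```python
import datetime
import logging

logger = logging.getLogger(__name__)


def to_date_or_none(value):
    """Return a datetime.date for value, or None if it cannot be converted.

    Hypothetical helper: it assumes date strings are ISO formatted.
    """
    if value is None:
        return None
    # datetime.datetime is a subclass of datetime.date, so check it first.
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    if isinstance(value, str):
        try:
            return datetime.date.fromisoformat(value.strip())
        except ValueError:
            # Log the bad string instead of passing it on to the insert.
            logger.warning("Unparseable date string replaced with None: %r", value)
            return None
    logger.warning("Unexpected type for a date column: %r", type(value))
    return None
```

With this, the row builder would append to_date_or_none(value) for date columns, so the insert only ever sees a date object or None there. Note that None is only acceptable if the ClickHouse column is Nullable.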
Then what format should it be in? I've tried inserting a string and a date, in multiple formats.
The column is a date and I want to insert a date as expected, so what do I do to solve it? I can't just enter nothing.
As you can see here, I switched it to insert a date object instead and I get this exception, so clearly there is something going on. The point of the try/except is purely to assess whether a date comes in as a string for whatever reason, which is fairly likely. If it's not a date, we want to insert it as a string. I just need help getting dates to insert properly.
async def insert_data_into_clickhouse(db, table_name, column_names, select_batch_size=10000):
    # Get the total number of rows
    total_rows_query = f"SELECT COUNT(*) FROM {table_name.lower()};"
    total_rows = (await db.execute(text(total_rows_query))).scalar()
    for offset in range(0, total_rows, select_batch_size):
        select_query = f"SELECT * FROM {table_name.lower()} OFFSET {offset} LIMIT {select_batch_size};"
        results = (await db.execute(text(select_query))).fetchall()
        print("Offset:", offset, "Batch Size:", len(results))
        # Prepare the data for insertion
        data_values = []
        for row in results:
            # Handle None values for nullable columns and convert date objects to strings
            escaped_row = []
            for value in row:
                if value is None:
                    escaped_row.append(None)
                elif isinstance(value, (datetime.datetime, datetime.date)):
                    escaped_row.append(value.strftime("%Y-%m-%d"))
                elif isinstance(value, str) and value:
                    # Handle date that is passed as string
                    try:
                        # Try to parse the string as a date
                        parsed_date = parser.parse(value).date()
                        escaped_row.append(parsed_date)
                    except ValueError:
                        # If parsing fails, you can handle it as needed.
                        # Here, we are appending the original string.
                        escaped_row.append(str(value))
                else:
                    escaped_row.append(value)
            # print("Original Row:", row)
            # print("Escaped Row:", escaped_row)
            data_values.append(escaped_row)
        # Print information for debugging
        # print("Column Names:", column_names)
        # print("Data Values:", data_values)
        # Use the insert method with column_names parameter
        client.insert(table=table_name, data=data_values, column_names=column_names)
        print(f"Inserted into {table_name}... batch offset: {offset}... batch size: {len(results)}")
        print()
    print("Data insertion completed.")
Error: object of type 'datetime.date' has no len()
Process finished with exit code 0
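One possible reading of this error, offered as an assumption since the table definitions are not shown, is that a date object ended up in a non-date column, for example because a string value happened to parse as a date. The sketch below converts only the columns known to be dates and leaves every other value untouched. The names convert_row and date_columns are hypothetical, and the set of date columns would have to come from the real table schema:

```python
import datetime


def convert_row(row, column_names, date_columns):
    """Convert values to datetime.date only for columns listed in date_columns.

    Hypothetical helper: assumes date strings are ISO formatted (YYYY-MM-DD).
    """
    converted = []
    for name, value in zip(column_names, row):
        if name in date_columns and value is not None:
            if isinstance(value, datetime.datetime):
                # Drop the time part for a date column.
                value = value.date()
            elif isinstance(value, str):
                try:
                    value = datetime.date.fromisoformat(value)
                except ValueError:
                    # Bad date string: insert None rather than the string.
                    value = None
        converted.append(value)
    return converted
```

For example, convert_row(("2024-01-15", "2024-01-15"), ["created", "note"], {"created"}) turns the first value into a date object and keeps the second as a plain string.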
Describe the bug
I have written a script to copy all my data from Postgres to ClickHouse. Most data inserts fine, but when I try to insert a date, I get an exception within the clickhouse-connect library in Python.
Expected behaviour
The data inserts fine without the library throwing an exception...
Code example
Here is an example of the data_values before insert...
clickhouse-connect and/or ClickHouse server logs
Configuration
Environment
ClickHouse server
CREATE TABLE statements for tables involved: