apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] Unexpected exception: submit task failed, queue size is full: Mysql Load #37503

Open wulishann33 opened 4 months ago

wulishann33 commented 4 months ago

Search before asking

Version

Doris version: 2.0 beta
Python version: 3.10

What's Wrong?

I am using the LOAD DATA INFILE method to import local CSV files, looping through the files in a local directory and importing them one by one. During the import, once the table's data volume reaches 45 GB, the following error occurs:

Traceback (most recent call last):
  File "C:\weather-data-processing\venv\lib\site-packages\mysql\connector\connection_cext.py", line 705, in cmd_query
    self._cmysql.query(
_mysql_connector.MySQLInterfaceError: Unexpected exception: submit task failed, queue size is full: Mysql Load

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\weather-data-processing\getWeatherSanple\process\import_data_test.py", line 48, in <module>
    cursor.execute(sql)
  File "C:\weather-data-processing\venv\lib\site-packages\mysql\connector\cursor_cext.py", line 357, in execute
    result = self._connection.cmd_query(
  File "C:\weather-data-processing\venv\lib\site-packages\mysql\connector\opentelemetry\context_propagation.py", line 97, in wrapper
    return method(cnx, *args, **kwargs)
  File "C:\weather-data-processing\venv\lib\site-packages\mysql\connector\connection_cext.py", line 713, in cmd_query
    raise get_mysql_exception(
mysql.connector.errors.DatabaseError: 1105 (HY000): Unexpected exception: submit task failed, queue size is full: Mysql Load

I searched all existing issues and did not find any similar problems.

What You Expected?

I want to know why the queue is full and how to fix it. I manually close every connection I create, and the queue's default expiration time should be 8 hours, so even if the queue had been blocked, its entries should have expired by now. The problem has persisted for three days and I have not yet found the cause.

How to Reproduce?

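import logging

# doris_table, reconnect_if_needed, preprocess_dataframe and save_progress
# are part of the same script and are not shown in this report.
def insert_dataframe_to_doris(df, connection, file_path):
    connection = reconnect_if_needed(connection)
    cursor = connection.cursor()
    df = preprocess_dataframe(df)
    # Write the preprocessed data to a local CSV file for LOAD DATA to read.
    df.to_csv(file_path, index=False)
    try:
        # Double the backslashes so the Windows path survives inside the SQL string literal.
        file_path = file_path.replace('\\', '\\\\')
        # Raw f-string: '\n' reaches the server as a literal backslash-n escape.
        sql = rf"""
                LOAD DATA LOCAL
                INFILE '{file_path}'
                INTO TABLE {doris_table}
                COLUMNS TERMINATED BY ','
                LINES TERMINATED BY '\n'
                IGNORE 1 LINES
                """
        cursor.execute(sql)
    except Exception as e:
        logging.error(f"Error inserting row {file_path}: {e}")
        save_progress(file_path, 0)
        return False
    connection.commit()
    return True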
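
For context, the function above is called once per file while looping over a local directory. The actual driver loop is not included in this report, so the following is only a minimal sketch of that pattern. The names `import_directory` and `load_one` are hypothetical; `load_one` stands in for a wrapper around `insert_dataframe_to_doris` that returns True on success and False on failure.

```python
import os
from typing import Callable, List


def import_directory(directory: str, load_one: Callable[[str], bool]) -> List[str]:
    """Call load_one on every .csv file in directory, one file at a time.

    Returns the paths for which load_one reported failure (returned False).
    """
    failed = []
    # Sort the names so the import order is deterministic between runs.
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(".csv"):
            continue  # skip anything that is not a CSV file
        path = os.path.join(directory, name)
        if not load_one(path):
            failed.append(path)
    return failed
```

Each file is loaded sequentially, so at most one LOAD DATA statement from this client is in flight at any time.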

Anything Else?

No response

Are you willing to submit PR?

Code of Conduct

wulishann33 commented 4 months ago

if connection.is_connected():
    connection.close()
    logging.info("Doris connection closed")
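
The same close logic can be wrapped in a small helper and called from a `finally` block, so the connection is released even when a load raises. This is only a sketch: `close_quietly` is a hypothetical name that is not part of the original script, and it assumes a connection object exposing `is_connected()` and `close()`, as the mysql.connector connection used above does.

```python
import logging


def close_quietly(connection) -> bool:
    """Close the connection if it is still open.

    Returns True if close() was called, False if there was nothing to close.
    """
    if connection is not None and connection.is_connected():
        connection.close()
        logging.info("Doris connection closed")
        return True
    return False


# Usage (names other than close_quietly come from the script above):
#
#     try:
#         insert_dataframe_to_doris(df, connection, file_path)
#     finally:
#         close_quietly(connection)
```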