zilliztech / milvus-backup

Backup and restore tool for Milvus
Apache License 2.0

[Bug]: Restore failed with error `workerpool: execute job bulk insert fail, info: fail to initialize in-memory segment data for shard id 0` and then the following task gets stuck #235

Open zhuwenxing opened 11 months ago

zhuwenxing commented 11 months ago

Current Behavior

[2023-11-07 03:34:56 - INFO - ci_test]: *********************************** setup *********************************** (client_base.py:34)
[2023-11-07 03:34:56 - INFO - ci_test]: [setup_method] Start setup test case test_milvus_restore_back_with_array_datatype. (client_base.py:35)
-------------------------------- live log call ---------------------------------
[2023-11-07 03:34:56 - INFO - ci_test]: [test][2023-11-07T03:34:56Z] [0.24393133s] restore_backup_kaWuRgLz insert -> (insert count: 3000, delete count: 0, upsert count: 0, timestamp: 445468664633491458, success count: 3000, err count: 0) (wrapper.py:30)
[2023-11-07 03:34:59 - INFO - ci_test]: create_backup {'requestId': '9f4716e0-7d1e-11ee-802c-6045bd4a2b6e', 'msg': 'success', 'data': {'id': '9f4716e0-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'start_time': 1699328096980, 'end_time': 1699328099939, 'name': 'backup_G2bZwon0', 'backup_timestamp': 1699328096980, 'collection_backups': [{'collection_id': 445468109023529292, 'db_name': 'default', 'collection_name': 'restore_backup_kaWuRgLz', 'schema': {'name': 'restore_backup_kaWuRgLz', 'fields': [{'fieldID': 100, 'name': 'int64', 'is_primary_key': True, 'data_type': 5}, {'fieldID': 101, 'name': 'key', 'data_type': 5, 'is_partition_key': True}, {'fieldID': 102, 'name': 'var_array', 'data_type': 22, 'type_params': [{'key': 'max_length', 'value': '1500'}, {'key': 'max_capacity', 'value': '2000'}], 'element_type': 21}, {'fieldID': 103, 'name': 'int_array', 'data_type': 22, 'type_params': [{'key': 'max_length', 'value': '1500'}, {'key': 'max_capacity', 'value': '2000'}], 'element_type': 5}, {'fieldID': 104, 'name': 'float_vector', 'data_type': 101, 'type_params': [{'key': 'dim', 'value': '128'}]}], 'enable_dynamic_field': True}, 'backup_timestamp': 445468664397824, 'size': 54421, 'has_index': False, 'load_state': 'NotLoad', 'backup_physical_timestamp': 1699328096}], 'size': 54421, 'milvus_version': 'c41df18b-dev'}} (test_restore_backup.py:361)
[2023-11-07 03:34:59 - INFO - ci_test]: list_backup {'requestId': 'a10cb05d-7d1e-11ee-802c-6045bd4a2b6e', 'msg': 'success', 'data': [{'id': '6ecb9faa-7d1d-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_1DWbFWhW', 'backup_timestamp': 1699327586143, 'size': 0}, {'id': '987fc642-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_3qHDx4Pq', 'backup_timestamp': 1699327226614, 'size': 0}, {'id': 'cf793531-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_49dSBBui', 'backup_timestamp': 1699326889349, 'size': 0}, {'id': 'f84196f3-7d19-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_4dVIYtm0', 'backup_timestamp': 1699326098777, 'size': 0}, {'id': '90e55f41-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_5BrbDtOJ', 'backup_timestamp': 1699326784361, 'size': 0}, {'id': '2d2dc4ed-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_BItHq9yb', 'backup_timestamp': 1699327905554, 'size': 0}, {'id': 'f3a9456f-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_C9mVnx7D', 'backup_timestamp': 1699326950062, 'size': 0}, {'id': '30f7a5ef-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_CsHLI9gi', 'backup_timestamp': 1699326193923, 'size': 0}, {'id': 'a33b3385-7d1d-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_DHsjZ9iF', 'backup_timestamp': 1699327674116, 'size': 0}, {'id': '69c75706-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_DqKVx4nr', 'backup_timestamp': 1699328007224, 'size': 0}, {'id': '276a0ee2-7d1d-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_FIA480ux', 'backup_timestamp': 1699327466386, 'size': 0}, {'id': 'bdd4337b-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_FwGsxp05', 'backup_timestamp': 1699326430249, 'size': 0}, {'id': '9f4716e0-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_G2bZwon0', 'backup_timestamp': 1699328096980, 'size': 0}, {'id': '9b1081e9-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 
'backup_GuC5snfw', 'backup_timestamp': 1699326371924, 'size': 0}, {'id': '187c0e99-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_Hzxzrt0j', 'backup_timestamp': 1699326152848, 'size': 0}, {'id': 'b188b7fc-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_I0AppdB8', 'backup_timestamp': 1699327268616, 'size': 0}, {'id': '776f1ab2-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_LYFUyzEA', 'backup_timestamp': 1699328030133, 'size': 0}, {'id': '144ec2d5-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_MbMBYdml', 'backup_timestamp': 1699326575337, 'size': 0}, {'id': '1778bcad-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_PXojKxDr', 'backup_timestamp': 1699327010142, 'size': 0}, {'id': '4930524d-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_R9lF6IN7', 'backup_timestamp': 1699326234560, 'size': 0}, {'id': '3ac3af8a-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_Rm0J2ONz', 'backup_timestamp': 1699327928347, 'size': 0}, {'id': '4b41c0f4-7d1d-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_TI5gWCZe', 'backup_timestamp': 1699327526520, 'size': 0}, {'id': '5e3d23c4-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_Tzca7eK5', 'backup_timestamp': 1699326699372, 'size': 0}, {'id': '35ac996f-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_UXNCpvYk', 'backup_timestamp': 1699326631316, 'size': 0}, {'id': '03bbbd94-7d1d-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_W50JBbxh', 'backup_timestamp': 1699327406523, 'size': 0}, {'id': '45cb543e-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_XPxPkTdF', 'backup_timestamp': 1699326658361, 'size': 0}, {'id': 'ca541193-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_YrYw4dfF', 'backup_timestamp': 1699327310213, 'size': 0}, {'id': '0866524d-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_Zh7e0s9t', 'backup_timestamp': 1699326125862, 'size': 0}, 
{'id': 'e1fcbb38-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_bCxhaqqr', 'backup_timestamp': 1699326490913, 'size': 0}, {'id': 'd78903a0-7d19-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_clnegESI', 'backup_timestamp': 1699326043881, 'size': 0}, {'id': '4d6382aa-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_dUQhnREM', 'backup_timestamp': 1699327100600, 'size': 0}, {'id': '602cac92-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_eMy5AQTO', 'backup_timestamp': 1699326273123, 'size': 0}, {'id': '7ea31965-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_iIkxV3Qe', 'backup_timestamp': 1699327183225, 'size': 0}, {'id': '250159ce-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_iM4UT3iv', 'backup_timestamp': 1699326603351, 'size': 0}, {'id': '03ed2b6c-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_lySTWJsQ', 'backup_timestamp': 1699326547853, 'size': 0}, {'id': '779c56ef-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_n9fqvilM', 'backup_timestamp': 1699326741939, 'size': 0}, {'id': '3c863810-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_nHuTaGRX', 'backup_timestamp': 1699327072306, 'size': 0}, {'id': '781ac3f4-7d1a-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_qNXEQAxn', 'backup_timestamp': 1699326313271, 'size': 0}, {'id': 'e243a5d3-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_sASodnFO', 'backup_timestamp': 1699327350371, 'size': 0}, {'id': 'e776bb9c-7d19-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_sVbBUjxB', 'backup_timestamp': 1699326070605, 'size': 0}, {'id': '6e89bab8-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_vKp2Ugpv', 'backup_timestamp': 1699327156215, 'size': 0}, {'id': 'aa84fdc9-7d1b-11ee-802c-6045bd4a2b6e', 'state_code': 2, 'name': 'backup_wZcWbDn5', 'backup_timestamp': 1699326827350, 'size': 0}, {'id': '5e11a8d0-7d1c-11ee-802c-6045bd4a2b6e', 'state_code': 2, 
'name': 'backup_wzvHoGAw', 'backup_timestamp': 1699327128584, 'size': 0}]} (test_restore_backup.py:363)
[2023-11-07 03:35:05 - INFO - ci_test]: restore_backup: {'requestId': 'a10ffb51-7d1e-11ee-802c-6045bd4a2b6e', 'code': 3, 'msg': 'workerpool: execute job bulk insert fail, info: fail to initialize in-memory segment data for shard id 0', 'data': {'id': 'a1100a80-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 1, 'start_time': 1699328099, 'collection_restore_tasks': [{'id': 'a1109715-7d1e-11ee-802c-6045bd4a2b6e', 'state_code': 1, 'start_time': 1699328099, 'target_collection_name': 'restore_backup_kaWuRgLz_bak', 'restored_size': 0, 'to_restore_size': 54421, 'target_db_name': 'default'}], 'restored_size': 0, 'to_restore_size': 0}} (test_restore_backup.py:375)
FAILED
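
The `restore_backup` response above reports a top-level error (`'code': 3`) while the collection restore task inside it still shows `'state_code': 1`, which matches the "stuck" symptom in the title. A minimal sketch of how a test could flag that mismatch follows. The helper name `stuck_restore_tasks` is hypothetical, and two things are assumptions rather than documented behavior: that a missing or zero `code` means success, and that `state_code` 1 means the task is still executing. The response dict is a trimmed copy of the one in the log.

```python
def stuck_restore_tasks(response):
    """Return ids of collection restore tasks that still look in progress
    even though the top-level response reports an error."""
    # Assumption: a missing or zero 'code' means the call succeeded.
    if response.get("code", 0) == 0:
        return []
    tasks = response.get("data", {}).get("collection_restore_tasks", [])
    # Assumption: state_code 1 means "executing" (not finished, not failed).
    return [t["id"] for t in tasks if t.get("state_code") == 1]


# Trimmed copy of the restore_backup response from the log above.
restore_response = {
    "requestId": "a10ffb51-7d1e-11ee-802c-6045bd4a2b6e",
    "code": 3,
    "msg": "workerpool: execute job bulk insert fail, info: "
           "fail to initialize in-memory segment data for shard id 0",
    "data": {
        "id": "a1100a80-7d1e-11ee-802c-6045bd4a2b6e",
        "state_code": 1,
        "collection_restore_tasks": [
            {
                "id": "a1109715-7d1e-11ee-802c-6045bd4a2b6e",
                "state_code": 1,
                "target_collection_name": "restore_backup_kaWuRgLz_bak",
                "restored_size": 0,
                "to_restore_size": 54421,
            }
        ],
    },
}

print(stuck_restore_tasks(restore_response))
```

A non-empty result means the request failed but at least one task was never moved out of the in-progress state.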

Expected Behavior

No response

Steps To Reproduce

No response

Environment

No response

Anything else?

failed job: https://github.com/zilliztech/milvus-backup/actions/runs/6779416414/job/18426449374?pr=231

log: https://github.com/zilliztech/milvus-backup/suites/17962670186/artifacts/1032844586

zhuwenxing commented 11 months ago

Milvus version: master-latest

zhuwenxing commented 11 months ago

/assign @wayblink

wayblink commented 11 months ago

image

wayblink commented 11 months ago

Bulk insert does not support the array type yet.
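
Until array support lands, a backup can be checked up front for array fields so the restore is not attempted on affected collections. The sketch below scans one `collection_backups` entry for fields whose `data_type` is 22, the value the schema in the log uses for `var_array` and `int_array`. The helper name `array_fields` is hypothetical, and the input is a trimmed copy of the schema from the `create_backup` log.

```python
# DataType value used for array fields in the backup schema ('data_type': 22).
ARRAY_DATA_TYPE = 22


def array_fields(collection_backup):
    """Return the names of array-typed fields in one 'collection_backups' entry."""
    fields = collection_backup.get("schema", {}).get("fields", [])
    return [f["name"] for f in fields if f.get("data_type") == ARRAY_DATA_TYPE]


# Trimmed copy of the schema from the create_backup log in this issue.
collection_backup = {
    "collection_name": "restore_backup_kaWuRgLz",
    "schema": {
        "fields": [
            {"fieldID": 100, "name": "int64", "is_primary_key": True, "data_type": 5},
            {"fieldID": 101, "name": "key", "data_type": 5, "is_partition_key": True},
            {"fieldID": 102, "name": "var_array", "data_type": 22, "element_type": 21},
            {"fieldID": 103, "name": "int_array", "data_type": 22, "element_type": 5},
            {"fieldID": 104, "name": "float_vector", "data_type": 101},
        ]
    },
}

print(array_fields(collection_backup))  # ['var_array', 'int_array']
```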

mkotsalainen commented 10 months ago

Do you have an idea of when the array type will be supported? We can't use that feature until we can back it up and restore it.

wayblink commented 4 months ago

/assign

wayblink commented 4 months ago

@zhuwenxing Please confirm: as I recall, the array type is already supported and is covered in CI as well, right?

zhuwenxing commented 4 months ago

> @zhuwenxing Please confirm: as I recall, the array type is already supported and is covered in CI as well, right?

Yes, the array type should already be supported in master, 2.4-latest, and 2.3-latest, and it is also covered by the CI test cases.