MilvusDM (Milvus Data Migration) is a data migration tool for Milvus. It supports importing Faiss and HDF5 data files into Milvus, migrating data between Milvus instances, and backing up Milvus data in batches to local files. MilvusDM helps developers work more efficiently and reduces operation and maintenance costs.
Operating system | Supported versions |
---|---|
CentOS | 7.5 or higher |
Ubuntu LTS | 18.04 or higher |
Software | Version |
---|---|
Milvus | 0.10.x or 1.x or 2.x |
Python3 | 3.7 or higher |
pip3 | Corresponds to python version. |
Add the following two lines to the ~/.bashrc file:
export MILVUSDM_PATH='/home/$user/milvusdm'
export LOGS_NUM=0
- MILVUSDM_PATH: This parameter defines the working path of MilvusDM. Logs and data generated by MilvusDM are stored in this path. The default value is /home/$user/milvusdm.
- LOGS_NUM: MilvusDM generates one log file per day. This parameter defines the number of log files to keep. The default value is 0, which means all log files are kept.
Make the configured environment variables take effect:
$ source ~/.bashrc
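To make the two variables more concrete, the following is a minimal Python sketch of how a tool could read them and apply the retention rule described above. The function names resolve_settings and logs_to_delete, the fallback to a milvusdm directory under the home directory, and the date-stamped log file names are assumptions made for illustration; this is not MilvusDM's actual implementation.

```python
import os
from pathlib import Path


def resolve_settings(env=None):
    """Return (working_dir, logs_num) from environment-style settings."""
    env = os.environ if env is None else env
    # Assumed fallback: a 'milvusdm' directory under the user's home.
    default_dir = str(Path.home() / "milvusdm")
    working_dir = env.get("MILVUSDM_PATH", default_dir)
    logs_num = int(env.get("LOGS_NUM", "0"))
    return working_dir, logs_num


def logs_to_delete(log_names, logs_num):
    """Pick the oldest log files to remove so that only logs_num remain.

    Assumes one file per day with a sortable date in its name.
    logs_num == 0 means every log file is kept.
    """
    if logs_num <= 0:
        return []
    ordered = sorted(log_names)
    # Everything except the newest logs_num entries is removable.
    return ordered[:-logs_num]
```

With LOGS_NUM set to 2 and three daily log files present, only the oldest file would be selected for removal.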
$ pip3 install pymilvusdm==2.0
pymilvusdm 2.0 is used to migrate data from Milvus (0.10.x or 1.x) to Milvus 2.x.
Export one Faiss index file to Milvus in a specified collection or partition.
In the current version, only flat and ivf_flat indexes for floating-point data are supported.
$ wget https://raw.githubusercontent.com/milvus-io/milvus-tools/main/yamls/F2M.yaml
F2M:
  milvus_version: 1.x
  data_path: '/home/data/faiss1.index'
  dest_host: '127.0.0.1'
  dest_port: 19530
  mode: 'append'
  dest_collection_name: 'test'
  dest_partition_name: ''
  collection_parameter:
    dimension: 256
    index_file_size: 1024
    metric_type: 'L2'
- Optional parameters: dest_partition_name
- The mode parameter can be set to append, skip, or overwrite. It takes effect only when the specified collection name already exists in Milvus.
  - append: Append data to the existing collection.
  - skip: Skip the existing collection and do not perform any operations.
  - overwrite: Delete the old collection, create a new collection with the same name, and then import the data.
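The three modes can be read as a small decision table. The sketch below expresses that table in Python; the function name plan_import and the step labels are hypothetical. The behavior when the collection does not yet exist (create it, then import) is an assumption, since the text only states that mode has no effect in that case.

```python
VALID_MODES = ("append", "skip", "overwrite")


def plan_import(mode, collection_exists):
    """Return the ordered list of steps for one import task."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not collection_exists:
        # Assumption: with no existing collection, mode is ignored and
        # the collection is created before the data is imported.
        return ["create_collection", "insert_data"]
    if mode == "append":
        return ["insert_data"]
    if mode == "skip":
        return []
    # mode == "overwrite": replace the old collection entirely.
    return ["drop_collection", "create_collection", "insert_data"]
```

Note that overwrite is the only mode that removes existing data, so it is worth double-checking the collection name before using it.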
parameter | description | example |
---|---|---|
F2M | Task: Export data in Faiss to Milvus. | |
milvus_version | Version of Milvus. | 0.10.5 |
data_path | Path to the Faiss index file. | '/home/user/data/faiss.index' |
dest_host | Milvus server address. | '127.0.0.1' |
dest_port | Milvus server port. | 19530 |
mode | Mode of migration. | 'append' |
dest_collection_name | Name of the collection to import data to. | 'test' |
dest_partition_name | Name of the partition to import data to. (Optional) | 'partition' |
collection_parameter | Collection-specific information such as vector dimension, index file size, and similarity metric. | dimension: 256 index_file_size: 1024 metric_type: 'L2' |
$ milvusdm --yaml F2M.yaml
Export one or more HDF5 files to Milvus in a specified collection or partition.
We provide HDF5 examples of float vectors (dim-100) and binary vectors (dim-512), together with their corresponding IDs.
$ wget https://raw.githubusercontent.com/milvus-io/milvus-tools/main/yamls/H2M.yaml
H2M:
  milvus_version: 1.x
  data_path:
    - /Users/zilliz/float_1.h5
    - /Users/zilliz/float_2.h5
  data_dir:
  dest_host: '127.0.0.1'
  dest_port: 19530
  mode: 'overwrite' # 'skip/append/overwrite'
  dest_collection_name: 'test_float'
  dest_partition_name: 'partition_1'
  collection_parameter:
    dimension: 128
    index_file_size: 1024
    metric_type: 'L2'
or
H2M:
  milvus_version: 1.x
  data_path:
  data_dir: '/Users/zilliz/HDF5_data'
  dest_host: '127.0.0.1'
  dest_port: 19530
  mode: 'append' # 'skip/append/overwrite'
  dest_collection_name: 'test_binary'
  dest_partition_name:
  collection_parameter:
    dimension: 512
    index_file_size: 1024
    metric_type: 'HAMMING'
- Optional parameters: dest_partition_name
- Configure either data_path or data_dir, and leave the other one empty (None).
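The either/or rule for data_path and data_dir can be sketched as follows. The function name resolve_hdf5_files is hypothetical, and the choice to pick up files ending in .h5 or .hdf5 from the directory is an assumption; MilvusDM may select files differently.

```python
import os


def resolve_hdf5_files(data_path=None, data_dir=None):
    """Return the HDF5 files named by exactly one of the two options."""
    # Exactly one of the two settings must be filled in.
    if bool(data_path) == bool(data_dir):
        raise ValueError("set exactly one of data_path or data_dir")
    if data_path:
        return list(data_path)
    # Assumption: every .h5 or .hdf5 file in the directory is imported.
    names = sorted(os.listdir(data_dir))
    return [os.path.join(data_dir, name) for name in names
            if name.endswith((".h5", ".hdf5"))]
```

Checking this before starting a migration surfaces a misconfigured file early, rather than partway through an import.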
parameter | description | example |
---|---|---|
H2M | Task: Export data in HDF5 to Milvus. | |
milvus_version | Version of Milvus. | 0.10.5 |
data_path | Paths to the HDF5 files. | - /Users/zilliz/float_1.h5 - /Users/zilliz/float_2.h5 |
data_dir | Directory of the HDF5 files. | /Users/zilliz/Desktop/HDF5_data |
dest_host | Milvus server address. | '127.0.0.1' |
dest_port | Milvus server port. | 19530 |
mode | Mode of migration. | 'append' |
dest_collection_name | Name of the collection to import data to. | 'test_float' |
dest_partition_name | Name of the partition to import data to. (Optional) | 'partition_1' |
collection_parameter | Collection-specific information such as vector dimension, index file size, and similarity metric. | dimension: 512 index_file_size: 1024 metric_type: 'HAMMING' |
$ milvusdm --yaml H2M.yaml
MilvusDM does not support migrating data from Milvus 2.0 standalone to Milvus 2.0 cluster.
Copy a collection of source_milvus or multiple partitions of a collection into the corresponding collection or partition in dest_milvus.
$ wget https://raw.githubusercontent.com/milvus-io/milvus-tools/main/yamls/M2M.yaml
M2M:
  milvus_version: 1.x
  source_milvus_path: '/home/user/milvus'
  mysql_parameter:
    host: '127.0.0.1'
    user: 'root'
    port: 3306
    password: '123456'
    database: 'milvus'
  source_collection: # specify the 'partition_1' and 'partition_2' partitions of the 'test' collection.
    test:
      - 'partition_1'
      - 'partition_2'
  dest_host: '127.0.0.1'
  dest_port: 19530
  mode: 'skip' # 'skip/append/overwrite'
Or
M2M:
  milvus_version: 1.x
  source_milvus_path: '/home/user/milvus'
  mysql_parameter:
  source_collection: # specify the collection named 'test'
    test:
  dest_host: '127.0.0.1'
  dest_port: 19530
  mode: 'skip' # 'skip/append/overwrite'
- Required parameters: source_milvus_path, source_collection, dest_host, dest_port, and mode.
- If you are using MySQL to manage source_milvus metadata, configure the mysql_parameter parameter. Leave it empty if you are using SQLite.
- The source_collection parameter must specify a collection name. The partition names under it are optional, and multiple partitions can be listed.
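The shape of source_collection (a collection name mapped to an optional list of partitions) can be illustrated with a short sketch. It assumes the YAML has already been loaded into a Python dict, where an empty value such as test: typically becomes None. The function name expand_source_collection is hypothetical.

```python
def expand_source_collection(source_collection):
    """Turn the source_collection mapping into (collection, partition) pairs.

    A partition of None stands for "the whole collection".
    """
    if not source_collection:
        raise ValueError("source_collection must name at least one collection")
    targets = []
    for collection, partitions in source_collection.items():
        if not partitions:
            # No partitions listed: migrate the collection as a whole.
            targets.append((collection, None))
        else:
            targets.extend((collection, p) for p in partitions)
    return targets
```

For the first configuration above this yields two pairs, one per partition of test; for the second it yields a single pair covering the whole test collection.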
parameter | description | example |
---|---|---|
M2M | Task: Copy the data from Milvus to the same version of Milvus. | |
milvus_version | The dest-milvus version. | 0.10.5 |
source_milvus_path | Working directory of the source Milvus. | '/home/user/milvus' |
mysql_parameter | MySQL settings for the source Milvus, including the host, user, port, password, and database parameters. | host: '127.0.0.1' user: 'root' port: 3306 password: '123456' database: 'milvus' |
source_collection | Names of the collection and its partitions in the source Milvus. | test: - 'partition_1' - 'partition_2' |
dest_host | Target Milvus server address. | '127.0.0.1' |
dest_port | Target Milvus server port. | 19530 |
mode | Mode of migration | 'skip' |
Usage
It copies the source_milvus collection data to dest_milvus.
$ milvusdm --yaml M2M.yaml
Export a Milvus collection or multiple partitions of a collection to a local HDF5 format file.
$ wget https://raw.githubusercontent.com/milvus-io/milvus-tools/main/yamls/M2H.yaml
M2H:
  milvus_version: 1.x
  source_milvus_path: '/home/user/milvus'
  mysql_parameter:
    host: '127.0.0.1'
    user: 'root'
    port: 3306
    password: '123456'
    database: 'milvus'
  source_collection: # specify the 'partition_1' and 'partition_2' partitions of the 'test' collection.
    test:
      - 'partition_1'
      - 'partition_2'
  data_dir: '/home/user/data'
Or
M2H:
  milvus_version: 1.x
  source_milvus_path: '/home/user/milvus'
  mysql_parameter:
  source_collection: # specify the collection named 'test'
    test:
  data_dir: '/home/user/data'
- The source_milvus_path, source_collection, and data_dir parameters are required.
- If you are using MySQL to manage source_milvus metadata, configure the mysql_parameter parameter. Leave it empty if you are using SQLite.
- The source_collection parameter must specify a collection name. The partition names under it are optional, and multiple partitions can be listed.
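A quick pre-flight check of these rules might look like the sketch below. It operates on the contents of the M2H block after it has been loaded into a dict; the function name check_m2h and the returned backend labels are hypothetical and not part of MilvusDM.

```python
M2H_REQUIRED = ("source_milvus_path", "source_collection", "data_dir")


def check_m2h(config):
    """Return (missing_keys, metadata_backend) for an M2H settings dict."""
    # A key counts as missing when it is absent or left empty.
    missing = [key for key in M2H_REQUIRED if not config.get(key)]
    # An empty mysql_parameter means the metadata lives in SQLite.
    backend = "mysql" if config.get("mysql_parameter") else "sqlite"
    return missing, backend
```

An empty list of missing keys means the three required parameters are present; the second value only reports which metadata store the settings imply.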
parameter | description | example |
---|---|---|
M2H | Task: Export Milvus data to local HDF5 format files. | |
milvus_version | The source-milvus version. | 0.10.5 |
source_milvus_path | Working directory of Milvus. | '/home/user/milvus' |
mysql_parameter | MySQL settings for Milvus, including the host, user, port, password, and database parameters. | host: '127.0.0.1' user: 'root' port: 3306 password: '123456' database: 'milvus' |
source_collection | Names of the collection and its partitions in Milvus. | test: - 'partition_1' - 'partition_2' |
data_dir | Directory to save HDF5 files. | '/home/user/data' |
Usage
It generates the corresponding HDF5 format files and an H2M configuration file in the data_dir directory.
$ milvusdm --yaml M2H.yaml
If you would like to contribute code to this project, you can find out more about our code structure:
debug/info/error logs during runtime.
source_collection='*', all Milvus data is exported.