migtools / pvc-migrate

Standalone PVC migration

Test and document incremental staging #120

Closed fbladilo closed 4 years ago

fbladilo commented 4 years ago

This PR does the following:

Results and observations for incremental staging are below:

Test case: single ns, 1 PVC, 1024 files per PV, 4 GB total data size, ansible profile_tasks callback enabled for timers
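
For reference, the per-task timers in the output below come from the stock profile_tasks callback; a minimal ansible.cfg sketch to enable it (using the Ansible 2.9-era config key) looks like:

```ini
# ansible.cfg -- enable per-task timing summaries in playbook output
[defaults]
callback_whitelist = profile_tasks
```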

# Initial sync

Tuesday 21 July 2020  12:20:07 -0500 (0:00:00.345)       0:03:41.822 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ------------------------------------------------------ 72.25s

# Re-run #1, source data unchanged (unrealistic scenario)

Tuesday 21 July 2020  12:22:58 -0500 (0:00:00.351)       0:02:00.342 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ------------------------------------------------------- 3.71s

# Re-run #2, 30% data change

Tuesday 21 July 2020  12:27:47 -0500 (0:00:00.338)       0:02:36.734 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ------------------------------------------------------- 26.78s

# Re-run #3, 60% data change

Tuesday 21 July 2020  12:33:38 -0500 (0:00:00.318)       0:02:47.649 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ------------------------------------------------------- 50.38s

# Re-run #4, 60% data change (using --inplace)

Tuesday 21 July 2020  12:39:18 -0500 (0:00:00.318)       0:02:44.731 ********** 
=============================================================================== 
Synchronizing files. This may take a while... -------------------------------------------------------- 48.36s

Observations:

* 63% reduction in sync time vs. the initial sync on workloads with 30% data churn
* 30% reduction in sync time vs. the initial sync on workloads with 60% data churn
* --inplace shows performance similar to the rsync defaults for this type of workload (see the sketch below)
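
For context, the staging re-runs above only change rsync options; a hypothetical invocation illustrating the --inplace variant (paths are placeholders, not the literal command the sync tasks generate) would be:

```sh
# Incremental re-sync: the delta-transfer algorithm skips unchanged files,
# and --inplace writes updated data directly into the destination file
# instead of building a temporary copy first (less space/IO overhead on the
# transfer pod, at the cost of the destination being inconsistent mid-transfer).
rsync -avz --inplace /source/pvc-data/ rsync://transfer-pod/pvc-data/
```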

# Large file workload

Test case: single ns, 1 PVC, 1 file, 4 GB total data size, ansible profile_tasks callback enabled for timers

# Initial sync

Tuesday 21 July 2020  15:00:57 -0500 (0:00:00.341)       0:03:42.119 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ---------------------------------------------------- 89.08s

# Re-run #1, source data unchanged

Tuesday 21 July 2020  15:04:09 -0500 (0:00:00.318)       0:02:13.899 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ---------------------------------------------------- 2.76s

# Re-run #2, data mangled

Tuesday 21 July 2020  15:13:06 -0500 (0:00:00.321)       0:04:44.559 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ---------------------------------------------------- 169.44s

# Re-run #3, data mangled (using --inplace)

Tuesday 21 July 2020  15:19:19 -0500 (0:00:00.329)       0:04:21.568 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ----------------------------------------------------- 146.75s

# Re-run #4, data mangled (using --whole-file)

Tuesday 21 July 2020  15:26:06 -0500 (0:00:00.312)       0:03:28.888 ********** 
=============================================================================== 
Synchronizing files. This may take a while... ---------------------------------------------------- 94.40s

# Re-run #5, data mangled (using --inplace and --whole-file)

Tuesday 21 July 2020  15:37:08 -0500 (0:00:00.339)       0:03:18.884 ********** 
=============================================================================== 
Synchronizing files. This may take a while... --------------------------------------------------- 84.17s

Observations for large files:

* With rsync defaults, a re-sync takes almost double the initial-sync time when the data needs updating: the delta-transfer algorithm is active and temporary destination copies are being made, urgh
* --inplace yields a marginal performance increase and less overhead on the transfer pod, since no temporary copy is made
* --whole-file sync performance is almost on par with the initial sync; it seems we have a faster network than backend I/O bandwidth (glusterfs)
* The --whole-file and --inplace combo is the best performing option once the initial sync has completed (see the sketch below)
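
Based on the numbers above, a sketch of the best-performing re-sync combination for large files (again with placeholder paths, not the exact command used by the playbooks) would be:

```sh
# --whole-file disables the delta-transfer algorithm (send full files; cheaper
# when the network outpaces backend storage I/O), and --inplace skips the
# temporary destination copy, so a re-sync of mangled large files lands close
# to initial-sync time.
rsync -av --whole-file --inplace /source/pvc-data/ rsync://transfer-pod/pvc-data/
```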