EMCECS / ecs-sync

ecs-sync is a bulk copy utility that can move data between various systems in parallel
Apache License 2.0

single thread vs. multiple threads for large number of files? #54

Closed DEO5294 closed 5 years ago

DEO5294 commented 5 years ago

I am currently using the ecs-sync tool to copy 10.5 million files using CAS from Atmos to ECS with 16 threads. Is it better to use multiple threads or a single thread? This is running on a virtual machine with 16 cores and 16 GB of RAM. CPU utilization is currently between 9% and 11%, and bandwidth is running at 2.1Ms/s. I am looking at ways to increase performance. I have 4 more data transfers of this type. Any suggestions?

DEO5294 commented 5 years ago

image

Here is a screen shot of what I am seeing

twincitiesguy commented 5 years ago

@DEO5294 You should always use multiple threads; that is the primary strength of ecs-sync (parallel processing). This seems very strange. Are you providing a clip list for these migrations? It would seem that you are not, since I don't see any way this could happen if you were.

Typically, CAS migrations should always use a clip list provided by the application database. This ensures you copy only the clips the application cares about (no garbage) and that you verify all of them (there are no missing clips in the source pool).

The alternative is to use a CAS query to enumerate the clips in the source, but this is not very reliable, especially on Atmos. If the query process fails, you basically have to start over (no data is copied twice, but enumeration takes quite a while).

You are also using estimation (enabled by default under advanced options). This spins up a second thread pool just for estimating the size and quantity of objects. For most types of migrations this is good because it gives you a more accurate ETA. However, for CAS, it can put undue strain on the source cluster, so we recommend turning it off. Also, if the estimation query thread gets an error, you won't see it in the UI (it will only appear somewhere in the log), and you end up with your situation above, where the total copied exceeds the estimated counts.

Hopefully this explains your situation, but if not, please provide more detail.

DEO5294 commented 5 years ago

Thanks for the reply. I am using a clip file for the migration. I am not sure what else I could have done to improve performance. I watched the ecs-sync appliance closely for performance, and I also watched the ESXi performance tools for this VM and did not notice anything that could have slowed it down. Everything went smoothly. I used 16 threads for all the copying. I copied 85 million objects from Atmos to ECS, broken down into batches of 5 million objects. I even used the schedule feature, which was helpful. I wish the STATUS would display the source file list. Maybe this should be a future enhancement.
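For anyone who wants to batch a large clip list the same way, here is a minimal Python sketch of the splitting step. It is not part of ecs-sync. It assumes the clip list is a plain text file with one entry per line, and the function names, the output file naming, and the default batch size of 5 million are all illustrative choices.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def batched(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size non-empty entries, in input order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[str] = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        batch.append(entry)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def split_clip_list(source: Path, out_dir: Path,
                    batch_size: int = 5_000_000) -> List[Path]:
    """Split a one-entry-per-line list file (assumed format) into batch files.

    Output names (<stem>.partNNN.lst) are a hypothetical convention.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with source.open("r", encoding="utf-8") as handle:
        for index, batch in enumerate(batched(handle, batch_size), start=1):
            target = out_dir / f"{source.stem}.part{index:03d}.lst"
            target.write_text("\n".join(batch) + "\n", encoding="utf-8")
            written.append(target)
    return written
```

Calling `split_clip_list(Path("clips.lst"), Path("batches"))` would write `batches/clips.part001.lst`, `batches/clips.part002.lst`, and so on, each of which could then be used as the source list for one job. Each batch is held in memory before it is written, which keeps the sketch short at the cost of RAM for very large batches.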

twincitiesguy commented 5 years ago

As long as all 85 million were copied and verified (assuming you enabled verification and there were no errors), you should be good to go. The only thing I would change next time would be to turn off estimation.

twincitiesguy commented 5 years ago

Incidentally, you can get an all-object report for auditing from each job in the UI before you archive it. Then combine all the reports into one spreadsheet. Alternatively, if you already archived them but named your database table, you can query MySQL directly and pull the report from there. Instructions are on the wiki.
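Combining the per-job reports can be scripted. The following is a rough Python sketch, assuming each downloaded report is a CSV file with an identical header row; that format is an assumption, so check it against an actual report first. The function name and file names are hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable, List, Optional


def combine_reports(report_paths: Iterable[Path], combined_path: Path) -> int:
    """Concatenate CSV reports into one file, keeping a single header row.

    Returns the number of data rows written. Assumes every report shares
    the same header; raises ValueError if one differs.
    """
    rows_written = 0
    header: Optional[List[str]] = None
    with combined_path.open("w", newline="", encoding="utf-8") as out_handle:
        writer = csv.writer(out_handle)
        for report_path in report_paths:
            with report_path.open("r", newline="", encoding="utf-8") as in_handle:
                reader = csv.reader(in_handle)
                report_header = next(reader, None)
                if report_header is None:
                    continue  # empty report file, nothing to copy
                if header is None:
                    header = report_header
                    writer.writerow(header)  # write the header only once
                elif report_header != header:
                    raise ValueError(f"header mismatch in {report_path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `combine_reports(sorted(Path("reports").glob("*.csv")), Path("all_objects.csv"))` would merge every report in a `reports` directory. Rows are streamed one at a time, so memory use stays flat even with tens of millions of objects, although a file that large may exceed what a spreadsheet application can open.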