networktocode / diffsync

A utility library for comparing and synchronizing different datasets.
https://diffsync.readthedocs.io/

Parallel Processing #202

Open itdependsnetworks opened 1 year ago

itdependsnetworks commented 1 year ago

Environment

Proposed Functionality

The ability to process in parallel the parts of a sync that do not affect each other.

This likely requires a facility through which developers can describe what can run in parallel, as well as potentially detecting specific scenarios where it can happen automagically.
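One way such a facility could look, as a rough sketch only: developers declare which model types depend on which, and the batches of types that can safely run side by side are derived from that. The `DEPENDS_ON` mapping, the model names, and `parallel_batches` are all hypothetical and not part of the DiffSync API; the ordering uses the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical declaration: each model type maps to the types it depends on.
# None of these names come from DiffSync itself.
DEPENDS_ON = {
    "site": set(),
    "vlan": {"site"},
    "device": {"site"},
    "interface": {"device", "vlan"},
}


def parallel_batches(depends_on):
    """Group model types into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the declaration has a cycle
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies already done,
        # so the types in one batch could be synced in parallel.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(DEPENDS_ON)
print(batches)
```

With the declaration above, `site` ends up alone in the first batch, `device` and `vlan` share the second, and `interface` comes last. Anything that cannot be expressed this way would still need to run sequentially.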

Use Case

Use cases that would benefit from this include:

chadell commented 1 year ago

Isn't this something you could infer from the children relationships? I mean, all the objects of one type, e.g. site, should be able to be processed in parallel. Obviously, you may find shared children, but that conflict can be postponed to a final merge. I don't know if it makes sense to assume, by default, that the objects of parent models can run in parallel. Then, if the library offers a final remediation step using the already supported Redis storage, you could run multiprocessing. Does that make sense to you? Or maybe I am missing some point?
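To make the idea concrete, here is a minimal sketch under stated assumptions: the `SITES` data, `split_shared`, `sync_parent`, and `parallel_sync` are hypothetical stand-ins and not DiffSync calls. It uses a thread pool rather than multiprocessing to keep the example self-contained, and it handles shared children in a single step after the parallel phase instead of through Redis.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: parent objects of one type ("site") and their children.
# "vlan-100" is shared between two sites.
SITES = {
    "nyc": ["rtr-nyc-1", "vlan-100"],
    "sfo": ["rtr-sfo-1", "vlan-100"],
    "ams": ["rtr-ams-1"],
}


def split_shared(parents):
    """Separate children owned by one parent from children shared by several."""
    counts = Counter(child for children in parents.values() for child in children)
    shared = sorted(child for child, n in counts.items() if n > 1)
    exclusive = {
        parent: [child for child in children if counts[child] == 1]
        for parent, children in parents.items()
    }
    return exclusive, shared


def sync_parent(name, children):
    """Stand-in for syncing one parent and its exclusive children."""
    return [name, *children]


def parallel_sync(parents, max_workers=4):
    exclusive, shared = split_shared(parents)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each parent is independent once shared children are set aside.
        results = list(
            pool.map(lambda item: sync_parent(*item), sorted(exclusive.items()))
        )
    processed = [obj for result in results for obj in result]
    # Final merge step: shared children are handled once, after the parallel phase.
    processed.extend(shared)
    return processed


processed = parallel_sync(SITES)
print(processed)
```

Each site is processed independently, and `vlan-100` is touched exactly once at the end, which is the "postpone the conflict to a final merge" part of the proposal.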

itdependsnetworks commented 1 year ago

Some can be inferred, some may not be able to be inferred, but yes, that is the point.