Fix bug where for larger datasets the metrics are incorrect

Summary

Fix for issue https://github.com/sambanova/generative_data_prep/issues/53

For small datasets the metrics were correct and matched the output dataset, but for larger datasets the reported metrics were far larger than the true values.

After a worker finished, we kept adding its metrics to the running totals for as long as any other worker was still running. With large datasets the workers rarely finish at the same time, so the metrics of workers that finished early were summed repeatedly and the totals exploded in size.
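The failure mode described above can be reproduced with a small, self-contained sketch. This is not the repository's code: `aggregate_buggy`, `aggregate_fixed`, `finish_times`, and the polling loop are hypothetical names and a simplified model, assuming the parent process polls the workers once per tick and each worker reports one dictionary of metrics.

```python
from collections import Counter


def aggregate_buggy(finish_times, worker_metrics):
    """Hypothetical model of the bug: on every poll, re-add the metrics of
    every worker that has already finished, until the last worker is done."""
    total = Counter()
    for tick in range(1, max(finish_times) + 1):
        for finished_at, metrics in zip(finish_times, worker_metrics):
            if finished_at <= tick:
                # A worker that finished early is counted again on each later poll.
                total.update(metrics)
    return total


def aggregate_fixed(finish_times, worker_metrics):
    """Hypothetical model of the fix: add each worker's metrics exactly once,
    the first time it is seen as finished."""
    total = Counter()
    collected = set()  # indices of workers whose metrics are already summed
    for tick in range(1, max(finish_times) + 1):
        for idx, (finished_at, metrics) in enumerate(zip(finish_times, worker_metrics)):
            if finished_at <= tick and idx not in collected:
                total.update(metrics)
                collected.add(idx)
    return total


if __name__ == "__main__":
    metrics = [{"tokens": 10}, {"tokens": 20}]
    # Workers finish at different polls: the buggy total is inflated.
    print(aggregate_buggy([1, 3], metrics), aggregate_fixed([1, 3], metrics))
    # Workers finish on the same poll: both versions agree.
    print(aggregate_buggy([2, 2], metrics), aggregate_fixed([2, 2], metrics))
```

In this model, when both workers finish on the same poll the two versions agree, which is consistent with small datasets appearing correct; the totals diverge only when finish times differ.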

PR Checklist

[x] My PR is less than 500 lines of code
[x] I have added sufficient comments as docstrings in my code
[x] I have made corresponding changes to the documentation

Fix bug where for larger datasets the metrics are incorrect #79
