Closed s-udhaya closed 1 year ago
@selitvin I am building end-to-end DDL demos with big datasets, petastorm, and horovod + DDP strategies. Many thanks for the great library. I ran into an OOM issue along the way, and this fix will help me alleviate it. Could you please have a look whenever you have time?
Base: 86.25% // Head: 86.25% // No change to project coverage :thumbsup:
Coverage data is based on head (7340ed6) compared to base (170b22a). Patch coverage: 100.00% of modified lines in pull request are covered.
@selitvin Many thanks for the review. I will update the PR with your suggestions.
@selitvin It took a while for me to update the PR 😃 Please have a look at it whenever you have time. Thanks
@selitvin Thanks for the approval. What is the process now to get this PR merged into the main branch?
Merged. I need to cut a release now that will include your change.
I am pushing a release candidate v0.12.1rc0 now. Will release tomorrow.
The default results_queue_size used in make_batch_reader is 50. This can lead to an OOM exception when loading big datasets with the thread reader pool type. Exposing the results_queue_size parameter via the make_batch_reader API lets the user control how much data is prefetched, which in turn helps alleviate the OOM issue.
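A minimal sketch of how the exposed parameter might be used, assuming a petastorm version that includes this change. The helper names `reader_kwargs` and `count_rows`, the dataset URL in the usage comment, and the value 10 are hypothetical and chosen only for illustration; a suitable queue size depends on row-group size and available memory.

```python
def reader_kwargs(results_queue_size=10, workers_count=4):
    """Build keyword arguments for petastorm's make_batch_reader.

    Hypothetical helper: a smaller results_queue_size means fewer
    prefetched results are held in memory at once (the default is 50).
    """
    if results_queue_size < 1:
        raise ValueError("results_queue_size must be at least 1")
    return {
        "reader_pool_type": "thread",
        "workers_count": workers_count,
        "results_queue_size": results_queue_size,
    }


def count_rows(dataset_url, **kwargs):
    """Iterate once over a Parquet dataset and count the rows read."""
    # Imported lazily so reader_kwargs can be used without petastorm installed.
    from petastorm import make_batch_reader

    total = 0
    with make_batch_reader(dataset_url, num_epochs=1, **kwargs) as reader:
        for batch in reader:
            # Each batch is a namedtuple of column arrays; the length of
            # the first column is the number of rows in the batch.
            total += len(batch[0])
    return total


# Example call (hypothetical path, requires petastorm and a Parquet dataset):
# count_rows("file:///tmp/example_parquet", **reader_kwargs(results_queue_size=10))
```

Lowering the queue size trades some prefetching throughput for a smaller memory footprint, so it is worth measuring both before settling on a value.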