The convert_texts_to_vector() function has been updated to support batching. The batch_size parameter, set to 50 by default, determines the number of items per batch. An additional parameter, max_batches, controls the maximum number of allowed batches, with a default value of 2000. When batch_size is set to 0, batching is disabled.
Batching lets the function process large datasets in fixed-size chunks rather than in a single pass, which makes it easier to scale to large inputs.
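The batching rules described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `split_into_batches` is hypothetical, and raising `ValueError` when `max_batches` is exceeded is an assumption, since the description only says that the parameter limits the number of batches.

```python
from typing import List, Sequence


def split_into_batches(
    texts: Sequence[str],
    batch_size: int = 50,
    max_batches: int = 2000,
) -> List[List[str]]:
    """Hypothetical helper: split texts into batches of at most batch_size items."""
    if batch_size < 0:
        raise ValueError("batch_size must be >= 0")
    if batch_size == 0:
        # Batching disabled: all items are handled as one group.
        return [list(texts)] if texts else []
    batches = [
        list(texts[start:start + batch_size])
        for start in range(0, len(texts), batch_size)
    ]
    if len(batches) > max_batches:
        # Assumption: going over the limit is treated as an error.
        raise ValueError(
            f"{len(batches)} batches needed, but max_batches is {max_batches}"
        )
    return batches
```

With the defaults, up to 50 × 2000 = 100,000 items fit within the batch limit.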
Fixes #56
Type of change
[ ] New feature (non-breaking change which adds functionality)
Testing
A unit test, test_batch(), has been added to validate the batching functionality. It:
Sends 10 items to convert_texts_to_vector().
Sets batch_size to 5.
Verifies that the items are divided into 2 batches of 5 each.
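For reviewers, a test of this shape could look roughly like the sketch below. It is not the actual `test_batch()` from this PR. `convert_texts_to_vector_stub` and its `embed_batch` callback are hypothetical stand-ins so the example runs on its own; the real function's signature may differ.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def convert_texts_to_vector_stub(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 50,
) -> List[Vector]:
    """Hypothetical stand-in: calls embed_batch once per batch of texts."""
    if batch_size == 0:
        # Batching disabled: embed everything in a single call.
        return embed_batch(list(texts))
    vectors: List[Vector] = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(list(texts[start:start + batch_size])))
    return vectors


def test_batch_sketch() -> None:
    seen_batch_sizes: List[int] = []

    def fake_embed(batch: List[str]) -> List[Vector]:
        # Record the size of each batch instead of computing real embeddings.
        seen_batch_sizes.append(len(batch))
        return [[float(len(text))] for text in batch]

    texts = [f"item {i}" for i in range(10)]
    vectors = convert_texts_to_vector_stub(texts, fake_embed, batch_size=5)

    # 10 items with batch_size=5 should produce 2 batches of 5 each.
    assert seen_batch_sizes == [5, 5]
    assert len(vectors) == 10
```

Recording batch sizes in a fake embedding callback keeps the test independent of any real model or network call.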
Test Configuration
Library version:
OS: Windows 11
Toolchain: 3.10.0
Checklist
[ ] My code follows the style guidelines of this project
[ ] I have performed a self-review of my own code
[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[ ] My changes generate no new warnings
[ ] I have added tests that prove my fix is effective or that my feature works
[ ] New and existing unit tests pass locally with my changes
[ ] Any dependent changes have been merged and published in downstream modules
[ ] I have checked my code and corrected any misspellings