harmonydata / harmony

The Harmony Python library: a research tool for psychologists to harmonise data and questionnaire items. Open source.
https://harmonydata.ac.uk
MIT License

Allow batching of items when sent to LLM #66

Closed makrianast closed 4 days ago

makrianast commented 4 days ago

Description

The convert_texts_to_vector() function has been updated to support batching. The batch_size parameter, set to 50 by default, determines the number of items per batch. An additional parameter, max_batches, controls the maximum number of allowed batches, with a default value of 2000. When batch_size is set to 0, batching is disabled.

Processing the input in fixed-size batches bounds the number of items sent in any single call, which should make large datasets easier to handle.
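
As an illustration of the batching behaviour described above, here is a minimal sketch. It is not the code from this PR: the names `convert_texts_in_batches` and `toy_vectoriser` are hypothetical, and the actual signature of `convert_texts_to_vector()` may differ. Only `batch_size` (default 50, with 0 disabling batching) and `max_batches` (default 2000) come from the description. The description does not say what happens once `max_batches` is reached, so the sketch assumes processing simply stops there.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def convert_texts_in_batches(
    texts: Sequence[str],
    vectorisation_function: Callable[[List[str]], List[Vector]],
    batch_size: int = 50,
    max_batches: int = 2000,
) -> List[Vector]:
    """Hypothetical sketch: vectorise texts in batches of batch_size items."""
    if batch_size < 0:
        raise ValueError("batch_size must be 0 (no batching) or a positive integer")

    texts = list(texts)

    # batch_size == 0 disables batching: everything goes in a single call.
    if batch_size == 0:
        return vectorisation_function(texts)

    vectors: List[Vector] = []
    for batch_index, start in enumerate(range(0, len(texts), batch_size)):
        # ASSUMPTION: stop once max_batches batches have been processed.
        # The PR description does not specify this behaviour.
        if batch_index >= max_batches:
            break
        batch = texts[start:start + batch_size]
        vectors.extend(vectorisation_function(batch))
    return vectors


# Toy stand-in for a real embedding call; it records the size of each batch.
calls: List[int] = []


def toy_vectoriser(batch: List[str]) -> List[Vector]:
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]


items = [f"item {i}" for i in range(120)]
vectors = convert_texts_in_batches(items, toy_vectoriser, batch_size=50)
print(calls)         # sizes of the batches passed to the vectoriser
print(len(vectors))  # one vector per input item
```

With 120 items and `batch_size=50`, the vectoriser is called three times, with 50, 50 and 20 items, and the results are concatenated in input order.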

Fixes #56

Testing

A unit test, test_batch(), has been added to validate the batching functionality.
