Open naufalso opened 3 days ago
@naufalso Thanks for your input. Please see the following two discussions to understand our take on this:
Thank you @cau-git for the clarification and for pointing me to the relevant discussions.
I now see that this feature has already been considered and defined in the roadmap. I truly appreciate the team's great work on this project, and I look forward to the upcoming updates.
Please don't hesitate to reach out if there's any way I can contribute further.
Keep up the fantastic work!
Requested feature
I propose adding a parallelization option to the `convert_all()` function by introducing an additional parameter, such as `num_worker`. This feature would allow users to specify the number of workers to process conversions concurrently, significantly improving performance for large datasets.

Currently, the `convert_all()` function processes documents sequentially by returning an iterator. This approach can be slow when dealing with a large number of documents. Parallelization would enable faster processing and better utilization of multi-core systems.

Proposed changes:

- Add a `num_worker` parameter to the `convert_all()` function.
- Use a parallel-execution library (e.g. `concurrent.futures` or `multiprocessing`) to handle multiple conversion tasks simultaneously.

Example usage:
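A minimal sketch of how such an option might behave, built only on the standard library. The helper `convert_all_parallel` and the `convert_one` callable are hypothetical stand-ins for illustration, not part of the library's API; only the `num_worker` parameter name comes from the proposal.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

S = TypeVar("S")  # source type, e.g. a file path
R = TypeVar("R")  # result type, e.g. a conversion result


def convert_all_parallel(
    convert_one: Callable[[S], R],
    sources: Iterable[S],
    num_worker: int = 4,
) -> Iterator[R]:
    """Hypothetical helper: run convert_one over sources with num_worker threads.

    Results are yielded in the same order as the inputs, so the caller
    still consumes an iterator, as with a sequential loop.
    """
    if num_worker < 1:
        raise ValueError("num_worker must be at least 1")
    with ThreadPoolExecutor(max_workers=num_worker) as pool:
        # Executor.map preserves input order even if tasks finish out of order.
        yield from pool.map(convert_one, sources)


if __name__ == "__main__":
    # Stand-in for a real single-document conversion (assumption).
    def fake_convert(path: str) -> str:
        return path.upper()

    for result in convert_all_parallel(fake_convert, ["a.pdf", "b.pdf"], num_worker=2):
        print(result)
```

Threads mainly help when the per-document work releases the GIL or waits on I/O. For CPU-bound conversion, `ProcessPoolExecutor` from the same module would be the closer fit, at the cost of requiring picklable inputs and results.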
Alternatives