Adding batching feature for openAI

MrTuanDao commented 1 month ago

I believe the batching feature would be extremely beneficial, as we often crawl a large number of websites, which can lead to rate limits. Implementing batching would help manage this issue more effectively and improve overall efficiency.

unclecode commented 1 month ago

Hi, thank you so much for your suggestion. I think you're very right. It also brings down the cost as well. However, at the same time, we don't want to be overly dependent on a single specific AI provider. We aim to keep the library neutral or, essentially, LLM provider agnostic. To address this, one can create an create a class that inherits from the current LLMExtractionStrategy class. In that class, you can use OpenAI batching. By doing it this way, you can support OpenAI batching without reducing the generalization level of the library. I appreciate your interest in this. If you're willing, you could fork the repository, apply the changes, and send the pull request. We would add your batching as one of the utility classes so users can benefit from it. Please let me know if you are interested.

MrTuanDao commented 1 month ago

Hi, I apologize for not having enough time to handle this at the moment, as I’m currently a bit busy. I'll leave the issues here, hoping someone else can take care of them. Thanks!

On Tue, Oct 8, 2024 at 6:05 PM UncleCode @.***> wrote:

Hi, thank you so much for your suggestion. I think you're very right. It also brings down the cost as well. However, at the same time, we don't want to be overly dependent on a single specific AI provider. We aim to keep the library neutral or, essentially, LLM provider agnostic. To address this, one can create an create a class that inherits from the current LLMExtractionStrategy class. In that class, you can use OpenAI batching. By doing it this way, you can support OpenAI batching without reducing the generalization level of the library. I appreciate your interest in this. If you're willing, you could fork the repository, apply the changes, and send the pull request. We would add your batching as one of the utility classes so users can benefit from it. Please let me know if you are interested.

— Reply to this email directly, view it on GitHub https://github.com/unclecode/crawl4ai/issues/140#issuecomment-2399540927, or unsubscribe https://github.com/notifications/unsubscribe-auth/AYVFPMMGUIHGZ3UINOXG7ZLZ2O375AVCNFSM6AAAAABPPYH23SVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOJZGU2DAOJSG4 . You are receiving this because you authored the thread.Message ID: @.***>

unclecode / crawl4ai

Adding batching feature for openAI #140