Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Multithreading for document upload #179

Open Chris4Sun opened 1 year ago

Chris4Sun commented 1 year ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

It would be great if the document processing in scripts\prepdocs.py could run in parallel, so that several files are handled at once. For example, with a process pool like the code below:

from multiprocessing.pool import Pool

# Per-file work, moved out of the original for-loop so a pool can call it
def task(filename):
    if args.verbose: print(f"Processing '{filename}'")
    if args.remove:
        remove_blobs(filename)
        remove_from_index(filename)
    elif args.removeall:
        remove_blobs(None)
        remove_from_index(None)
    else:
        if not args.skipblobs:
            upload_blobs(filename)
        page_map = get_document_text(filename)
        sections = create_sections(os.path.basename(filename), page_map)
        index_sections(os.path.basename(filename), sections)

if args.removeall:
    remove_blobs(None)
    remove_from_index(None)
else:
    if not args.remove:
        create_search_index()

print(f"Processing files...")
if __name__ == '__main__':
    with Pool(8) as pool:
        pool.map(task, glob.glob(args.files))
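
Since the title asks for multithreading while the snippet above uses a process pool, here is a minimal thread-based sketch of the same idea, for comparison. It assumes the per-file work is mostly I/O-bound (uploads and service calls), which is where threads tend to help. `process_file` and `process_all` are hypothetical names that are not part of the script, and whether the script's helper functions and shared clients are safe to call from several threads is not established here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_file(filename):
    """Hypothetical stand-in for the per-file work (upload, extract text, index)."""
    if not filename.endswith(".pdf"):
        raise ValueError(f"unsupported file type: {filename}")
    return f"indexed {filename}"


def process_all(filenames, worker, max_workers=8):
    """Run worker(filename) on a thread pool; return (results, errors) keyed by filename."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its filename so failures can be reported per file
        futures = {executor.submit(worker, name): name for name in filenames}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Record the failure and keep going, so one bad file does not stop the batch
                errors[name] = exc
    return results, errors


if __name__ == "__main__":
    results, errors = process_all(["a.pdf", "b.pdf", "notes.txt"], process_file)
    print(f"{len(results)} succeeded, {len(errors)} failed")
```

Collecting errors per file, instead of letting the first exception propagate as `pool.map` would, makes it easier to re-run only the files that failed.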

Thanks! We'll be in touch soon.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.