This pull request makes the chunker and embedder modules asynchronous. The main changes convert synchronous functions to coroutines and switch the imports to the asynchronous versions of the libraries. Because the I/O-bound calls (HTTP requests, secret lookups, embedding generation) no longer block while they wait, this should improve the throughput and scalability of the document chunking and embedding processes.
Conversion to Asynchronous Code:
chunker/chunk_documents_docint.py: Converted functions such as get_secret, analyze_document_rest, get_chunk, and chunk_document to asynchronous versions. Updated the file to use aiohttp for HTTP requests and asyncio for handling asynchronous operations. [1][2][3][4][5][6][7][8][9][10]
chunker/chunk_documents_raw.py: Converted the chunk_document function to an asynchronous version and updated the function to use asynchronous methods for generating chunks with embeddings. [1][2][3][4]
chunker/chunk_metadata_helper.py: Converted methods in ChunkEmbeddingHelper to asynchronous versions, including the generate_chunks_with_embedding and _generate_content_metadata methods. Added a class method create for asynchronous initialization.
embedder/text_embedder.py: Converted the embed_content method to an asynchronous version and added a class method create for asynchronous initialization. Updated the get_secret function to be asynchronous. [1][2][3]
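The create class methods listed above follow a common pattern: __init__ cannot await, so any asynchronous setup moves into an awaitable factory. A minimal sketch of that pattern follows. It assumes the secret is fetched during construction; the secret name, the get_secret stand-in, and the method bodies are hypothetical and are not taken from the actual diff.

```python
import asyncio


async def get_secret(name: str) -> str:
    """Hypothetical stand-in for the async secret lookup."""
    await asyncio.sleep(0)  # simulate non-blocking I/O
    return f"secret-for-{name}"


class TextEmbedder:
    """Illustrative only; the real class in embedder/text_embedder.py may differ."""

    def __init__(self, api_key: str) -> None:
        # __init__ cannot await, so it only stores already-resolved values.
        self.api_key = api_key

    @classmethod
    async def create(cls) -> "TextEmbedder":
        # Asynchronous setup happens here instead of in __init__.
        api_key = await get_secret("embedding-api-key")  # assumed secret name
        return cls(api_key)

    async def embed_content(self, text: str) -> list[float]:
        await asyncio.sleep(0)  # placeholder for the real embedding request
        return [float(len(text))]


async def main() -> list[float]:
    embedder = await TextEmbedder.create()
    return await embedder.embed_content("hello")
```

Callers then write embedder = await TextEmbedder.create() instead of TextEmbedder(), which keeps the constructor free of blocking calls.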
Minor Changes:
function_app.py: Updated the document_chunking route handler to be asynchronous.
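To illustrate why an asynchronous route handler matters, here is a rough sketch that uses only the standard library. It assumes the handler awaits the chunker; the request and response shapes, the fixed-width chunking, and the serve_many helper are hypothetical, and the real handler in function_app.py uses the function framework's own request and response types.

```python
import asyncio
import time


async def chunk_document(body: dict) -> list[str]:
    """Hypothetical stand-in for the async chunker; the sleep mimics network I/O."""
    await asyncio.sleep(0.1)
    text = body["text"]
    return [text[i:i + 4] for i in range(0, len(text), 4)]


async def document_chunking(body: dict) -> dict:
    """Sketch of an async handler: awaiting frees the event loop for other requests."""
    chunks = await chunk_document(body)
    return {"status": 200, "chunks": chunks}


async def serve_many(bodies: list[dict]) -> tuple[list[dict], float]:
    """Run several handler calls concurrently and report the elapsed time."""
    start = time.perf_counter()
    # The calls overlap while each one waits on its simulated I/O.
    responses = await asyncio.gather(*(document_chunking(b) for b in bodies))
    return list(responses), time.perf_counter() - start
```

In this sketch, five concurrent requests finish in roughly the time of one, because each handler yields control while it waits instead of blocking the worker.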