Split documents into text segments and upload them to Pinecone with this Python script.
Clone the repository:
git clone https://github.com/kirill-markin/split-documents-upload-to-pinecone.git
Install the required packages:
cd split-documents-upload-to-pinecone
pip install -r requirements.txt
Set up environment variables in a .env file in the project root, based on the .env.example file:
OPENAI_API_KEY=your_openai_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment
PINECONE_INDEX_NAME=your_index_name
Replace your_openai_api_key, your_pinecone_api_key, and your_pinecone_environment with your actual values.
your_pinecone_environment: the name of the Pinecone environment you want to use, for example us-central1-gcp.
your_index_name: the name of the index you want to create. You can use any name.
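Before running the script, it can help to confirm that none of the four variables is missing or still set to its placeholder. The following is a minimal sketch, not part of the repository: the helper name missing_env_vars is hypothetical, and it only inspects variables already present in the process environment (how main.py itself loads the .env file is not described here, so export the values or load them with a tool of your choice first).

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_ENVIRONMENT",
    "PINECONE_INDEX_NAME",
)


def missing_env_vars(env=None):
    """Return required variables that are unset, empty, or still placeholders.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # Assumption: placeholder values look like "your_..." as in .env.example.
        if not value or value.startswith("your_"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing or placeholder values:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Passing a plain dictionary instead of os.environ makes the check easy to try without touching your real environment.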
Add documents to the data folder. You can add a single file or multiple files in nested folders. The script processes every *.md file in the folder.
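To preview which files would be picked up, a recursive search for *.md files under the data folder can be sketched as below. This is an illustration under assumptions, not the repository's own code: the function name find_markdown_files is hypothetical, and the actual discovery logic in main.py may differ.

```python
from pathlib import Path


def find_markdown_files(data_dir="data"):
    """Return all *.md files under data_dir, including nested folders.

    Hypothetical helper for illustration; results are sorted by path.
    """
    root = Path(data_dir)
    if not root.is_dir():
        # No data folder yet: nothing to process.
        return []
    # rglob walks the folder recursively, so inner folders are included.
    return sorted(p for p in root.rglob("*.md") if p.is_file())


if __name__ == "__main__":
    for path in find_markdown_files():
        print(path)
```

Running this from the project root prints one path per Markdown file, which is a quick way to confirm your documents are in the right place.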
Run the script:
python3 main.py
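How the script divides each document is not described in this README. For intuition only, one common approach is fixed-size chunks with a small overlap, sketched below. The function name split_text and the chunk_size and overlap defaults are assumptions chosen for illustration; they are not taken from main.py.

```python
def split_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters so that content cut at a
    boundary still appears whole in one of them. Illustrative sketch only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

Each resulting segment would then be embedded and uploaded to the index as a separate record.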
This project is licensed under the MIT License.