This code can be used to convert PDFs into audio podcasts, lectures, summaries, and more. It uses OpenAI's GPT models for text generation and text-to-speech conversion. You can also iteratively edit a draft transcript, providing specific comments or overall directives on how it should be adapted or improved.
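The overall flow described above (draft a transcript from the PDF text with a GPT model, then fold user comments back into the draft, possibly several times, before speech synthesis) can be sketched as follows. The function names and prompts are illustrative, not the repository's actual API; `generate` stands in for a call to the text model.

```python
def draft_transcript(pdf_text, generate):
    """Ask the text model for a first-draft podcast transcript.
    `generate` is any callable that maps a prompt string to model output."""
    return generate(f"Write a podcast transcript based on this document:\n{pdf_text}")

def revise_transcript(transcript, comments, generate):
    """Fold user feedback back into the draft.
    May be called repeatedly, once per editing round."""
    return generate(
        f"Revise this transcript:\n{transcript}\n"
        f"Apply these comments or directives:\n{comments}"
    )
```

In the app, `generate` would wrap an OpenAI chat-completion call, and the revised transcript is what finally gets sent to text-to-speech.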
Follow these steps to set up PDF2Audio on your local machine using Conda:
Clone the repository:
git clone https://github.com/lamm-mit/PDF2Audio.git
cd PDF2Audio
Install Miniconda (if you haven't already), then verify the installation:
conda --version
Create a new Conda environment:
conda create -n pdf2audio python=3.9
Activate the Conda environment:
conda activate pdf2audio
Install the required dependencies:
pip install -r requirements.txt
Set up your OpenAI API key:
Create a .env file in the project root directory and add your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
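At startup, the app reads this key from the environment. A minimal sketch of how a .env file can be loaded is shown below; this is illustrative only, and the project may use a package such as python-dotenv for the same purpose.

```python
import os

def load_env(path=".env"):
    """Minimal .env loader (illustrative).
    Reads KEY=VALUE lines, skipping blanks and # comments,
    and exports them into the process environment."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ[key.strip()] = value.strip()
```

Once loaded, the key is available as `os.environ["OPENAI_API_KEY"]` for the OpenAI client.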
To run the PDF2Audio app:
Ensure you're in the project directory and your Conda environment is activated:
conda activate pdf2audio
Run the Python script that launches the Gradio interface:
python app.py
Open your web browser and go to the URL provided in the terminal (typically http://127.0.0.1:7860).
Use the Gradio interface to upload a PDF file and convert it to audio.
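For longer documents, the generated transcript typically has to be split before speech synthesis, since TTS endpoints cap input length (OpenAI's documented limit is 4096 characters, assumed here). A minimal chunking sketch, with an illustrative helper name:

```python
def chunk_text(text, limit=4096):
    """Split a transcript into pieces no longer than `limit` characters,
    breaking on sentence boundaries ('. ') where possible.
    The 4096 default reflects OpenAI's documented TTS input cap,
    but treat it as an assumption, not the app's actual setting."""
    chunks, current = [], ""
    for sentence in text.replace("\n", " ").split(". "):
        piece = sentence if sentence.endswith(".") else sentence + ". "
        if len(current) + len(piece) > limit and current:
            chunks.append(current.strip())
            current = ""
        current += piece
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio segments concatenated.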
This app requires an OpenAI API key to function.
This project was inspired by and based on the code available at https://github.com/knowsuchagency/pdf-to-podcast and https://github.com/knowsuchagency/promptic.
Please cite this work as:

@article{ghafarollahi2024sciagentsautomatingscientificdiscovery,
      title={SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning},
      author={Alireza Ghafarollahi and Markus J. Buehler},
      year={2024},
      eprint={2409.05556},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2409.05556},
}

@article{buehler2024graphreasoning,
      title={Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning},
      author={Markus J. Buehler},
      journal={Machine Learning: Science and Technology},
      year={2024},
      url={http://iopscience.iop.org/article/10.1088/2632-2153/ad7228},
}