This project builds a pipeline that takes detailed notes using ChatGPT and Whisper. Whisper runs locally (on the GPU if possible) to save costs.
YouTube URL --> video --> transcript --> several blocks of text --> several ChatGPT 3.5 summaries --> one large concatenated summary
Works for any media (audio, video), not only YouTube.
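The "several blocks of text --> several summaries --> one concatenated summary" steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names split_into_blocks and summarize_transcript, the word-based splitting, and the 500-word block size are all assumptions. The summarizer is passed in as a function so the sketch runs without any API call.

```python
from typing import Callable, List


def split_into_blocks(text: str, max_words: int = 500) -> List[str]:
    """Split a transcript into consecutive blocks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_transcript(
    text: str,
    summarize_block: Callable[[str], str],
    max_words: int = 500,
) -> str:
    """Summarize each block independently, then join the partial summaries."""
    blocks = split_into_blocks(text, max_words)
    return "\n\n".join(summarize_block(block) for block in blocks)


if __name__ == "__main__":
    # Stand-in summarizer (keeps the first 5 words of each block) so the
    # sketch runs offline; a real pipeline would call the chat model here.
    def fake_summary(block: str) -> str:
        return " ".join(block.split()[:5])

    transcript = " ".join(f"word{i}" for i in range(12))
    print(summarize_transcript(transcript, fake_summary, max_words=5))
```

Splitting on words is only a rough proxy for the model's token limit; a token-based splitter would track the limit more closely.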
In your .bashrc file:
export OPENAI_KEY="sk-************************************"
conda env create -f environment.yml
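For reference, here is one way a Python script could read the OPENAI_KEY variable exported above and fail early with a clear message if it is missing. The helper name get_openai_key is hypothetical and is not taken from the project's code.

```python
import os
from typing import Mapping


def get_openai_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key stored in OPENAI_KEY, or raise if it is not set."""
    key = env.get("OPENAI_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_KEY is not set; export it in your shell configuration first"
        )
    return key
```

Remember to open a new shell (or source your .bashrc) after adding the export line, otherwise the variable will not be visible to Python.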
Note that this pipeline accepts any video or audio file as input, not only YouTube videos.
Enter your URLs in the url_list variable of download_from_youtube.py, then run:
python download_from_youtube.py
python main.py --input_dir_or_file /path/to/video_or_audio --final_output_dir /path_to_save_markdown
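Since --input_dir_or_file accepts either a single file or a directory, the input could be resolved along these lines. This is only a sketch under assumptions: the function name collect_media_files and the list of extensions are hypothetical, and main.py may handle this differently.

```python
from pathlib import Path
from typing import List

# Assumed set of media extensions; adjust to what your setup supports.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mkv", ".webm"}


def collect_media_files(input_dir_or_file: str) -> List[Path]:
    """Return the media files designated by a file path or a directory path."""
    path = Path(input_dir_or_file)
    if path.is_file():
        # A single file is used as-is, whatever its extension.
        return [path]
    if path.is_dir():
        # Only direct children with a known media extension are kept.
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
        )
    raise FileNotFoundError(f"No such file or directory: {path}")
```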
Note that the ChatGPT prompts are in French; feel free to modify them in the file constants.py.
Book: 146,414 tokens
Balex, one hour: 6,000 tokens
David, one hour 30: 29,000 tokens
GPT-4 Turbo: 32k tokens max, but it doesn't work that well
0.01 cents per 1k input tokens
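To turn token counts like the ones noted above into a rough input cost, a small helper such as the following can be used. The function name estimate_input_cost is hypothetical, and the default price of 0.01 per 1k tokens simply reuses the figure from these notes; the result is in whatever currency unit that price is expressed in, so check current pricing before relying on it. Output tokens are not counted.

```python
def estimate_input_cost(n_tokens: int, price_per_1k: float = 0.01) -> float:
    """Estimate the input cost for n_tokens at a given price per 1,000 tokens."""
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    return n_tokens / 1000 * price_per_1k


if __name__ == "__main__":
    # Token counts taken from the notes above.
    examples = {"book": 146_414, "Balex (1 h)": 6_000, "David (1 h 30)": 29_000}
    for name, tokens in examples.items():
        print(f"{name}: {tokens} tokens, estimated cost {estimate_input_cost(tokens):.4f}")
```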