C2DH / jdh-notebook

A collection of Jupyter notebooks for the Journal of Digital History
https://journalofdigitalhistory.org
GNU Affero General Public License v3.0
4 stars 1 forks source link

Technical review Building an AI Research Assistant: Large Language Models as A Versatile Tool for Digital Historians #110

Closed eliselavy closed 11 months ago

eliselavy commented 1 year ago

Repo to create with pid JZx9gw7iwGxb. (here https://github.com/jdh-observer/JZx9gw7iwGxb) Wait for last version

Dr-Hutchinson commented 1 year ago

Final version submitted for technical review.

eliselavy commented 1 year ago

Problem to test it without paying an API key: https://platform.openai.com/account/usage

Screenshot 2023-05-23 at 16 23 34

https://openai.com/pricing

Multiple models, each with different capabilities and price points. Prices are per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens.

Screenshot 2023-05-23 at 16 35 21

eliselavy commented 1 year ago
eliselavy commented 1 year ago
eliselavy commented 1 year ago

Contact the author for a temporary API key - otherwise buy one

eliselavy commented 1 year ago

Purchase an API key, for using model: https://openai.com/waitlist/gpt-4-api

openai.ChatCompletion.create(
                      model="gpt-4",

Screenshot 2023-06-08 at 17 01 24

eliselavy commented 1 year ago

ModuleNotFoundError: No module named 'langchain.chat_models'

Dr-Hutchinson commented 1 year ago

There seems to be a version error causing this. For cell 24 change get rid of the install for 'openai[embeddings':

# Below are the modules and libraries needed for running the case studies below.
# Requires Python version 3.8

!pip install pandas
!pip install Pillow 
!pip install openai[embeddings] # eliminate this install
!pip install langchain
!pip install matplotlib seaborn

After that I was able to run the code without errors.

Additionally, I can send a temporary API key if needed for accessing GPT-4.

eliselavy commented 1 year ago

@Dr-Hutchinson fixed problem langchain by using specific version as initially provide in the requirements.txt Add only necessary package see here https://github.com/jdh-observer/JZx9gw7iwGxb/blob/gpt-35-turbo/requirements.txt Preference by running the install requirements.txt to be able to use nbconvert otherwise required a restart of the kernel

I run the analysis with gpt-3.5-turbo, see branch here: https://github.com/jdh-observer/JZx9gw7iwGxb/blob/gpt-35-turbo/article.ipynb One error dataframe contains only one element:
Exception: ValueError: 1 is not in range

Screenshot 2023-06-14 at 10 59 05 Linked to the gpt-3.5-turbo model? Still in waiting list for gpt-4 Maybe you can provide me your key ? (email: elisabeth.guerard@uni.lu)

eliselavy commented 1 year ago

From here: https://blog.finxter.com/game-changing-updates-from-openai-a-new-era-in-function-calling/?tl_inbound=1&tl_target_all=1&tl_form_type=1&tl_period_type=3

Screenshot 2023-06-15 at 09 06 15
eliselavy commented 1 year ago

Output will be saved.

eliselavy commented 11 months ago

Generation with appropriate key: ok Notebook needs to be trusted after running nbconvert to display correctly : Nbdiff between author execution (article) - nbconvert (article_executed) saved:nbdiff

eliselavy commented 11 months ago

sent to review @inactinique