Closed leventidis closed 1 year ago
Could you show me your code?
Sure, I've attached it below:
import time
import argparse
import pandas as pd
from UnlimitedGPT import ChatGPT
from pathlib import Path
from tqdm import tqdm

session_token = "my_session_token"
conversation_id = "my_conversation_id"
prompt='Find all entities and their associated wikipedia links (if possible) in the following text. Perform this process sentence by sentence. Output must be a JSON file as follows: {"entity":<entity>, "sentence_id":<sentence ID>, "link":<wikipedia link if possible otherwise null>}'

def get_chatbot():
    print("Initializing chatbot...")
    start=time.time()
    chatbot = ChatGPT(
        session_token,
        conversation_id=conversation_id,
        proxy=None,
        chrome_args=[],
        disable_moderation=False,
        verbose=False,
        headless=True
    )
    print("Finished initialization.")
    print("Elapsed Time:", time.time()-start, 'seconds.\n')
    return chatbot

def main(args):
    df = pd.read_pickle(args.input_df_path)
    chatbot=get_chatbot()
    print("Extracting entities...")
    for article_id in tqdm(df['article_id'].unique()):
        content=df[df['article_id']==article_id]['content_coreferenced'].values[0]
        # Extract entities using ChatGPT
        input_txt=prompt+content
        input_txt=input_txt.replace('\n', ' ')
        chatGPT_output = chatbot.send_message(
            input_txt,
            input_mode="INSTANT", input_delay=0.1, continue_generating=True
        )
        with open(args.output_dir+'raw_output/'+str(article_id)+'.txt', "w") as text_file:
            text_file.write(chatGPT_output.response)
        # Pause briefly between consecutive requests
        time.sleep(2)

if __name__ == "__main__":
    # -------------------------- Argparse Configuration -------------------------- #
    parser = argparse.ArgumentParser(description='Use ChatGPT to extract named entities from text')
    parser.add_argument('--output_dir', required=True,
        help='Path to the output directory where output files are stored. Path must terminate with backslash "\\"')
    parser.add_argument('--input_df_path', required=True,
        help='Path to the input dataframe file that contains the contents for each article.')
    # Parse the arguments
    args = parser.parse_args()
    print('\nOutput directory:', args.output_dir)
    print('Input dataframe path:', args.input_df_path)
    print('\n')
    # Create the output directories if they do not exist yet
    Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    Path(args.output_dir+'raw_output/').mkdir(parents=True, exist_ok=True)
    main(args)
The above is a simple script that uses ChatGPT to extract entities from text (some news articles). The input is a pandas dataframe that contains the text of the input articles.
The script seems to work for the first 4-5 iterations but then fails and returns the stack trace from the original post.
I am running it using Python 3.8.17 with the following package versions:
aiohttp==3.8.4
aiosignal==1.3.1
async-generator==1.10
async-timeout==4.0.2
attrs==23.1.0
beautifulsoup4==4.12.2
certifi==2023.5.7
charset-normalizer==3.1.0
exceptiongroup==1.1.1
frozenlist==1.3.3
h11==0.14.0
idna==3.4
markdownify==0.11.6
multidict==6.0.4
numpy==1.24.3
openai==0.27.8
outcome==1.2.0
pandas==2.0.2
PySocks==1.7.1
python-dateutil==2.8.2
pytz==2023.3
requests==2.31.0
selenium==4.10.0
six==1.16.0
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve==2.4.1
tqdm==4.65.0
trio==0.22.0
trio-websocket==0.10.3
tzdata==2023.3
undetected-chromedriver==3.5.0
UnlimitedGPT==0.1.6
urllib3==2.0.3
websockets==11.0.3
wsproto==1.2.0
yarl==1.9.2
Update to the new version, I believe it fixes this issue.
I updated to the new version but I am unfortunately receiving the same error.
It processes a few queries and then it outputs the same selenium.common.exceptions.TimeoutException error.
Can you watch the browser while it is typing and waiting for the reply, and see what happens at the moment the timeout exception is raised? If it works most of the time and only occasionally raises that exception, it's not a bug; it's just ChatGPT taking too long to respond, in which case the exception is expected. You can handle it with a try/except block, and you can also increase the timeout.
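A minimal sketch of the try/except handling suggested above, written as a generic retry wrapper. `send_with_retry` and all of its parameters are hypothetical names introduced for illustration; this is not part of UnlimitedGPT, and the attempt count and delays are assumptions rather than recommended values.

```python
import time


def send_with_retry(send, text, retry_on=(Exception,), max_attempts=3,
                    base_delay=5.0, sleep=time.sleep):
    """Call send(text); on a listed exception, wait and try again.

    Hypothetical helper: the delay doubles after every failed attempt, and
    the last failure is re-raised so the caller can log or skip the item.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(text)
        except retry_on:
            if attempt == max_attempts:
                raise
            # Back off before the next attempt: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In the script above, `send` could be something like `lambda t: chatbot.send_message(t, input_mode="INSTANT", input_delay=0.1, continue_generating=True)` and `retry_on` could be `(selenium.common.exceptions.TimeoutException,)`, so that only the timeout is retried and any other error still surfaces immediately.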
The same is happening to me; I hosted it on Linux.
Fixed in the new version. Update to 0.1.8
by running python3 -m pip install UnlimitedGPT -U
or pip install UnlimitedGPT -U
and try your code again, then let me know if you still face this issue. You shouldn't face it anymore, though.
I followed the documentation, and sending queries to ChatGPT via send_message() seems to work well. However, after sending just a few queries (about 4-5) in a loop, I am receiving a selenium TimeoutException. I've pasted my full stack trace below.
Is this failing because I am sending too many requests? If I want to repeatedly query ChatGPT with different queries (on the order of a couple thousand), how should I proceed?
Thanks!
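For a run on the order of a couple thousand queries, one option is to make the loop resumable, so that a TimeoutException partway through does not force a restart from the first article. The following is a minimal sketch, assuming outputs are written as `<article_id>.txt` under `raw_output/` as in the script in this thread; `pending_article_ids` is a hypothetical helper name, not part of UnlimitedGPT.

```python
from pathlib import Path


def pending_article_ids(article_ids, raw_output_dir):
    """Return the IDs that do not yet have a non-empty <id>.txt output file."""
    raw_output_dir = Path(raw_output_dir)
    pending = []
    for article_id in article_ids:
        out_file = raw_output_dir / f"{article_id}.txt"
        # Treat an empty file as unfinished, in case a previous run
        # was interrupted while writing.
        if not out_file.is_file() or out_file.stat().st_size == 0:
            pending.append(article_id)
    return pending
```

The main loop could then iterate over `pending_article_ids(df['article_id'].unique(), args.output_dir + 'raw_output/')` instead of over every ID, so rerunning the script after a failure only sends the queries that are still missing.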