expectedparrot / edsl

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
https://docs.expectedparrot.com
MIT License

Update docs to reflect new way to check on models (Issue #222) #285

Closed: johnjosephhorton closed this issue 5 months ago

johnjosephhorton commented 5 months ago

Issue #222 requested a method to check keys. We can now do:

>>> Model.check_models()

This iterates through the listed models, making a hello-world-type call to each one:

Now checking: claude-3-haiku-20240307
Current key is XXXXXXX
{'id': 'msg_01FL96CaknFAPiSn99pEQG1f', 'content': [{'text': "Hello! It's nice to meet you. How can I assist you today?", 'type': 'text'}], 'model': 'claude-3-opus-20240229', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': {'input_tokens': 17, 'output_tokens': 19}}

Now checking: claude-3-opus-20240229
Current key is XXXXX
{'id': 'msg_013CnqQH1dGvihKVHabVSt4Q', 'content': [{'text': "Hello! It's nice to meet you. How can I assist you today?", 'type': 'text'}], 'model': 'claude-3-opus-20240229', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': {'input_tokens': 17, 'output_tokens': 19}}

Now checking: claude-3-sonnet-20240229
Current key is XXXXX
{'id': 'msg_01YBnMnmDwax7289VZJZntvc', 'content': [{'text': "Hello! It's nice to meet you. How can I assist you today?", 'type': 'text'}], 'model': 'claude-3-opus-20240229', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': {'input_tokens': 17, 'output_tokens': 19}}

Now checking: dbrx-instruct
Current key is XXXXX
{'detail': 'inference error'}
...

johnjosephhorton commented 5 months ago

Thank you for your issue. Remaining to-dos:
1) Needs the same Jupyter notebook decorator to get this to work
2) Need to reset the Anthropic key
3) Don't print the key in the model check; mask the front of it
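A minimal sketch of the key-masking idea in item 3, for illustration only. `mask_key` and its `visible` parameter are hypothetical names, not part of edsl, and the choice to show a short prefix followed by `...` is an assumption modeled on the `sk-ant-a...` style that appears in the output later in this thread. The key in the usage line is a made-up placeholder.

```python
def mask_key(key: str, visible: int = 8) -> str:
    """Return a display-safe form of an API key (hypothetical helper).

    Shows at most `visible` leading characters, and never more than half
    of the key, followed by '...'. An empty key is reported as missing.
    """
    if not key:
        return "<missing>"
    shown = min(visible, len(key) // 2)
    return key[:shown] + "..."


# Usage with a fake placeholder key (not a real credential).
print("Current key is", mask_key("sk-ant-abcdef123456"))
```

Capping the visible part at half the key length keeps very short keys from being printed in full.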

rbyh commented 5 months ago

This message is long, and it's not obvious what it is telling me, since the content varies by model. The output will also get longer as models are added. I think it needs some explanation of what it is showing. A user may be expecting a short list of model names with a yes/no for whether the stored key is missing or valid, similar to the succinct output of Model.available(). That could be a separate method.

Checking all available models...

Now checking: claude-3-haiku-20240307
Current key is sk-ant-a...
Error calling 'hello' on claude-3-haiku-20240307: Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': 'invalid x-api-key'}}

Now checking: claude-3-opus-20240229
Current key is sk-ant-a...
Error calling 'hello' on claude-3-opus-20240229: Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': 'invalid x-api-key'}}

Now checking: claude-3-sonnet-20240229
Current key is sk-ant-a...
Error calling 'hello' on claude-3-sonnet-20240229: Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': 'invalid x-api-key'}}

Now checking: dbrx-instruct
Current key is KeIDxMJX...
{'detail': 'inference error'}

Now checking: gemini_pro
Current key is AIzaSyAs...
{'candidates': [{'content': {'parts': [{'text': 'Thank you for the compliment! I am happy to be of assistance. Please let me know if you have any questions or requests.'}], 'role': 'model'}, 'finishReason': 'STOP', 'index': 0, 'safetyRatings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE'}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE'}]}]}

Now checking: gpt-3.5-turbo
Current key is sk-HIpKf...
{'id': 'chatcmpl-9EGcNxOAcVkQ5SjEuuBtru9rcNets', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': 'Hello! How can I assist you today?', 'role': 'assistant', 'function_call': None, 'tool_calls': None}}], 'created': 1713187499, 'model': 'gpt-3.5-turbo-0125', 'object': 'chat.completion', 'system_fingerprint': 'fp_c2295e73ad', 'usage': {'completion_tokens': 9, 'prompt_tokens': 21, 'total_tokens': 30}}

Now checking: gpt-4-1106-preview
Current key is sk-HIpKf...
{'id': 'chatcmpl-9EGcNo2Cf9iCjA9neh0iwRcpuTUZ4', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': 'Hello! How can I assist you today?', 'role': 'assistant', 'function_call': None, 'tool_calls': None}}], 'created': 1713187499, 'model': 'gpt-4-1106-preview', 'object': 'chat.completion', 'system_fingerprint': 'fp_94f711dcf6', 'usage': {'completion_tokens': 9, 'prompt_tokens': 21, 'total_tokens': 30}}

Now checking: llama-2-13b-chat-hf
Current key is KeIDxMJX...
{'detail': {'error': 'Not authenticated'}}

Now checking: llama-2-70b-chat-hf
Current key is KeIDxMJX...
{'inference_status': {'runtime_ms': 2122, 'cost': 8.219999999999999e-05, 'tokens_generated': 61, 'tokens_input': 39}, 'results': [{'generated_text': "\n                I'm happy to assist you in any way I can. Is there something specific you need help with or would you like to chat about something in particular? I'm here to provide information, answer questions, or just provide a positive conversation. What can I do for you today?"}], 'num_tokens': 61, 'num_input_tokens': 39}

Now checking: mixtral-8x7B-instruct-v0.1
Current key is KeIDxMJX...
{'inference_status': {'runtime_ms': 3850, 'cost': 3.564e-05, 'tokens_generated': 91, 'tokens_input': 41}, 'results': [{'generated_text': "\n                I'm here to help you. What can I do for you today?\n\nPlease note that I am an AI language model, so I can't perform tasks that require real-world actions, such as turning on a light switch or browsing the internet. However, I can assist with answering questions, generating text, and providing information on a wide range of topics.\n\nLet me know how I can help!"}], 'num_tokens': 91, 'num_input_tokens': 41}
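The short yes/no listing suggested in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not edsl's implementation: `summarize_checks` and `format_summary` are hypothetical names, and each "check" is assumed to be a zero-argument callable that makes the hello-world call and raises an exception on failure. The stub callables and model names in the usage lines are made up.

```python
from typing import Callable, Dict


def summarize_checks(checks: Dict[str, Callable[[], object]]) -> Dict[str, bool]:
    """Run each check and record True if it returned, False if it raised."""
    results: Dict[str, bool] = {}
    for name, call in checks.items():
        try:
            call()
            results[name] = True
        except Exception:
            results[name] = False
    return results


def format_summary(results: Dict[str, bool]) -> str:
    """Render one line per model: the name, padded, then 'ok' or 'FAILED'."""
    width = max((len(name) for name in results), default=0)
    return "\n".join(
        f"{name:<{width}}  {'ok' if ok else 'FAILED'}"
        for name, ok in results.items()
    )


# Usage with stub checks standing in for real model calls.
def failing_check() -> object:
    raise RuntimeError("Error code: 401")


stub_checks = {
    "model-a": lambda: {"text": "Hello!"},
    "model-b": failing_check,
}
print(format_summary(summarize_checks(stub_checks)))
```

Note that this sketch treats only raised exceptions as failures. A response such as `{'detail': 'inference error'}`, which is returned rather than raised in the output above, would need its own check.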
johnjosephhorton commented 5 months ago

There's a better version now, just FYI.