aws-samples / fm-leaderboarder

FM-Leaderboard-er allows you to create a leaderboard to find the best LLM/prompt for your own business use case, based on your data, tasks, and prompts.
Apache License 2.0
18 stars 5 forks

Pricing - adding pricing to leaderboard report #1

Closed gilinachum closed 6 months ago

gilinachum commented 6 months ago

Add Pricing Information to Leaderboard Report

As a user evaluating different LLMs, pricing is key to judging relevance. Some models are very accurate but too expensive; conversely, I might pick a more expensive model if the uplift in accuracy is large enough. It would therefore be useful to see accuracy metrics and pricing in the same report.

For the first release, I suggest:

  1. Allow users to statically assign prices when defining the models in JSON (models_dict in the example ipynb). These prices will be included in the main leaderboard report table.
  2. For Bedrock models, automatically fetch prices from AWS's APIs (Maybe relevant: def price_information in here)
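A static pricing entry could look like the sketch below. The field names (`input_price_per_1k`, `output_price_per_1k`, in USD) are hypothetical suggestions, not existing `models_dict` keys, and the prices shown are illustrative:

```python
# Hypothetical extension of models_dict from the example notebook:
# each model carries static USD prices per 1K input/output tokens.
models_dict = {
    "anthropic.claude-3-haiku-20240307-v1:0": {
        "input_price_per_1k": 0.00025,
        "output_price_per_1k": 0.00125,
    },
    "anthropic.claude-3-sonnet-20240229-v1:0": {
        "input_price_per_1k": 0.003,
        "output_price_per_1k": 0.015,
    },
}

def leaderboard_row(model_id: str) -> dict:
    """Build a leaderboard row that carries the static prices
    alongside the model id, ready to merge with accuracy metrics."""
    prices = models_dict[model_id]
    return {
        "model": model_id,
        "input_$/1k": prices["input_price_per_1k"],
        "output_$/1k": prices["output_price_per_1k"],
    }
```

The report generator could then join these columns onto the existing accuracy table by model id.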

Notes:

  1. Pricing can be token-based (e.g., Bedrock on-demand, or the OpenAI API) or uptime-based (JumpStart endpoints, or Bedrock provisioned throughput).
  2. Pricing differs for input and output tokens, so there are two values per model.

For future releases:

  1. Add an effective cost that takes into account the price and the size of the test set's input and output in tokens.
  2. Add pricing for JumpStart models or Bedrock provisioned throughput, based on some throughput calculation.
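The effective cost in point 1 could be computed per model from the test set's measured token counts. A minimal sketch, assuming per-1K-token prices and hypothetical parameter names:

```python
def effective_cost(input_tokens: int, output_tokens: int,
                   input_price_per_1k: float,
                   output_price_per_1k: float) -> float:
    """Total USD cost of running the whole test set through one model.
    Input and output tokens are billed at separate rates."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Example: a test set with 200K input and 50K output tokens on a model
# priced at $0.003/1K input and $0.015/1K output.
cost = effective_cost(200_000, 50_000, 0.003, 0.015)  # 0.6 + 0.75 = 1.35
```

This single dollar figure would let the report rank models by accuracy per dollar, not just raw accuracy.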