neokd / NeoGPT

Chat effortlessly, execute commands, and interpret code with Llama3, Phi3, and more: your local AI assistant. Enjoy seamless interaction while ensuring ultimate privacy.
https://neogpt.dev
MIT License

Update cost for togetherAI #187

Closed: neokd closed this issue 7 months ago

neokd commented 7 months ago

Update the JSON in callback_handler.py with the cost for the TogetherAI models, and add a flag called --max-budget in main.py.
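Roughly, the callback handler should look up each model's per-1M-token price in that JSON, keep a running total, and stop once the total crosses the budget. A minimal sketch of the budget check (the helper name and wiring here are illustrative, not existing code in callback_handler.py):

from typing import Optional

def check_budget(total_cost: float, max_budget: Optional[float]) -> None:
    # max_budget comes from the --max-budget flag; when it is not passed,
    # there is no cap and the check is a no-op.
    if max_budget is not None and total_cost > max_budget:
        raise RuntimeError(
            f"Estimated spend ${total_cost:.4f} exceeds --max-budget (${max_budget:.2f})"
        )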

ayushmorbar commented 7 months ago

Assign me under JWoC, but before that, could you provide me with a reference for the model costs?

Looking at https://www.together.ai/pricing and https://docs.together.ai/docs/inference-models , it's hard to understand the notation, i.e. how I should write the JSON keys.

TOGETHERAI_MODEL_COST_PER_1M_TOKENS = {
    "mistralai/Mistral-7B-Instruct-v0.2": 0.20,
    "mistralai/Mixtral-8x7B-Instruct-v0.1": 0.60,
}
neokd commented 7 months ago

If you sign up for Together AI, you will find each model name and its pricing on the models page. You can get it from the table there.

ayushmorbar commented 7 months ago

Here's the updated JSON with the Chat, Language, and Code models of Together.ai. Review it beforehand so I can commit.

TOGETHERAI_MODEL_COST_PER_1M_TOKENS = {
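    # Chat models (prices in USD per 1M tokens)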
    "deepseek-ai/deepseek-coder-33b-instruct":0.80,
    "Qwen/Qwen1.5-72B-Chat":0.90,
    "Qwen/Qwen1.5-14B-Chat":0.30,
    "Qwen/Qwen1.5-7B-Chat":0.20,
    "Qwen/Qwen1.5-4B-Chat":0.10,
    "Qwen/Qwen1.5-1.8B-Chat":0.10,
    "Qwen/Qwen1.5-0.5B-Chat":0.10,
    "codellama/CodeLlama-70b-Instruct-hf": 0.90,
        "mistralai/Mixtral-8x7B-Instruct-v0.1": 0.60,
    "mistralai/Mistral-7B-Instruct-v0.2": 0.20,
    "meta-llama/Llama-2-70b-chat-hf": 0.90,
    "snorkelai/Snorkel-Mistral-PairRM-DPO": 0.20,
    "codellama/CodeLlama-13b-Instruct-hf": 0.22,
    "codellama/CodeLlama-34b-Instruct-hf": 0.776,
    "codellama/CodeLlama-7b-Instruct-hf": 0.20,
    "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO": 0.60,
    "NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT": 0.60,
    "NousResearch/Nous-Hermes-2-Yi-34B": 0.80,
    "openchat/openchat-3.5-1210": 0.20,
    "togethercomputer/StripedHyena-Nous-7B": 0.20,
    "DiscoResearch/DiscoLM-mixtral-8x7b-v2": 0.60,
    "mistralai/Mistral-7B-Instruct-v0.1": 0.20,
    "zero-one-ai/Yi-34B-Chat": 0.80,
    "NousResearch/Nous-Capybara-7B-V1p9": 0.20,
    "teknium/OpenHermes-2p5-Mistral-7B": 0.20,
    "upstage/SOLAR-10.7B-Instruct-v1.0": 0.30,
    "togethercomputer/llama-2-13b-chat": 0.22,
    "togethercomputer/llama-2-7b-chat": 0.20,
    "NousResearch/Nous-Hermes-Llama2-13b": 0.30,
    "NousResearch/Nous-Hermes-llama-2-7b": 0.20,
    "Open-Orca/Mistral-7B-OpenOrca": 0.20,
    "teknium/OpenHermes-2-Mistral-7B": 0.20,
    "WizardLM/WizardLM-13B-V1.2": 0.20,
    "togethercomputer/Llama-2-7B-32K-Instruct": 0.20,
    "lmsys/vicuna-13b-v1.5": 0.30,
    "Austism/chronos-hermes-13b": 0.30,
    "garage-bAInd/Platypus2-70B-instruct": 0.90,
    "Gryphe/MythoMax-L2-13b": 0.30,
    "togethercomputer/Qwen-7B-Chat": 0.20,
    "togethercomputer/RedPajama-INCITE-7B-Chat": 0.20,
    "togethercomputer/RedPajama-INCITE-Chat-3B-v1": 0.10,
    "togethercomputer/alpaca-7b": 0.20,
    "togethercomputer/falcon-40b-instruct": 0.80,
    "togethercomputer/falcon-7b-instruct": 0.20,

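    # Language (base) models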
    "Qwen/Qwen1.5-72B": 0.90,
    "Qwen/Qwen1.5-14B": 0.30,
    "Qwen/Qwen1.5-7B": 0.20,
    "Qwen/Qwen1.5-4B": 0.10,
    "Qwen/Qwen1.5-1.8B": 0.10,
    "Qwen/Qwen1.5-0.5B": 0.10,
    "mistralai/Mixtral-8x7B-v0.1": 0.60,
    "meta-llama/Llama-2-70b-hf": 0.90,
    "togethercomputer/StripedHyena-Hessian-7B": 0.20,
    "mistralai/Mistral-7B-v0.1": 0.20,
    "microsoft/phi-2": 0.10,
    "zero-one-ai/Yi-34B": 0.80,
    "zero-one-ai/Yi-6B": 0.14,
    "Nexusflow/NexusRaven-V2-13B": 0.30,
    "togethercomputer/LLaMA-2-7B-32K": 0.20,
    "togethercomputer/llama-2-13b": 0.22,
    "togethercomputer/llama-2-7b": 0.20,
    "togethercomputer/Qwen-7B": 0.20,
    "togethercomputer/RedPajama-INCITE-7B-Instruct": 0.20,
    "togethercomputer/RedPajama-INCITE-7B-Base": 0.20,
    "togethercomputer/RedPajama-INCITE-Instruct-3B-v1": 0.10,
    "togethercomputer/RedPajama-INCITE-Base-3B-v1": 0.10,
    "togethercomputer/GPT-JT-Moderation-6B": 0.20,
    "togethercomputer/falcon-40b": 0.80,
    "togethercomputer/falcon-7b": 0.20,

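    # Code models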
    "codellama/CodeLlama-70b-Python-hf": 0.90,
    "codellama/CodeLlama-70b-hf": 0.90,
    "codellama/CodeLlama-13b-Python-hf": 0.22,
    "codellama/CodeLlama-34b-Python-hf": 0.776,
    "codellama/CodeLlama-7b-Python-hf": 0.20,
    "WizardLM/WizardCoder-Python-34B-V1.0": 0.80,
    "Phind/Phind-CodeLlama-34B-v2": 0.80,
}
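For context, here is a minimal sketch of how this table could feed the cost tracking in callback_handler.py. The helper name is hypothetical, and it assumes Together bills prompt and completion tokens at the same per-model rate (the table above lists a single price per model):

def get_togetherai_cost(model_name: str, prompt_tokens: int, completion_tokens: int) -> float:
    # Prices are USD per 1M tokens, so scale by the total tokens used.
    price_per_1m = TOGETHERAI_MODEL_COST_PER_1M_TOKENS.get(model_name)
    if price_per_1m is None:
        return 0.0  # unknown model: nothing to track
    return price_per_1m * (prompt_tokens + completion_tokens) / 1_000_000

For example, a Mixtral-8x7B-Instruct-v0.1 call using 2,000 tokens in total would come to 0.60 * 2000 / 1,000,000 = $0.0012.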
neokd commented 7 months ago

How many models are there?

ayushmorbar commented 7 months ago

Around 77 (including the two Mistral AI models that were already in the JSON), excluding the image models because their cost is variable.

ayushmorbar commented 7 months ago

Also, is there a default value for the --max-budget flag in main.py?

neokd commented 7 months ago

No, by default let it be None.
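In main.py that would be roughly the following (a sketch; the argparse setup is assumed, only the --max-budget flag itself is from this issue):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--max-budget",
    type=float,
    default=None,  # None means no spending cap
    help="Maximum spend in USD for TogetherAI models; unlimited when not set.",
)
args = parser.parse_args()
# args.max_budget stays None unless the user passes --max-budget explicitly.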

neokd commented 7 months ago

@ayushmorbar any updates on this?

ayushmorbar commented 7 months ago

Yeah, wait, I am about to commit the code in a while; I was studying for exams.

neokd commented 7 months ago

Okay.

ayushmorbar commented 7 months ago

Done.