RyderSwanson / LLMEval

Compile a list of the major LLMs #1

Closed RyderSwanson closed 1 month ago

RyderSwanson commented 2 months ago

As a researcher
I need to compile a list of the major LLMs currently available
So that we can identify which ones to include in our evaluation program

Details and Assumptions

Acceptance Criteria

Given I have access to information about current LLMs
When I compile the list of major LLMs
Then the list should include at least 10 different models
And each entry should have the model name, provider, and key features
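
The criteria above can be checked mechanically once the list exists. The following is a minimal sketch, not part of the project: `ModelEntry` and `check_acceptance` are hypothetical names, and the threshold of 10 is taken from the criteria.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    """One row of the LLM list (hypothetical structure for illustration)."""
    name: str
    provider: str
    key_features: list = field(default_factory=list)


def check_acceptance(entries, min_models=10):
    """Return a list of problems; an empty list means the criteria are met."""
    problems = []

    # "at least 10 different models": count distinct, non-empty names.
    distinct = {e.name.strip().lower() for e in entries if e.name.strip()}
    if len(distinct) < min_models:
        problems.append(
            f"only {len(distinct)} distinct models, need at least {min_models}"
        )

    # "each entry should have the model name, provider, and key features"
    for e in entries:
        label = e.name.strip() or "<unnamed>"
        if not e.name.strip():
            problems.append(f"{label}: missing model name")
        if not e.provider.strip():
            problems.append(f"{label}: missing provider")
        if not e.key_features:
            problems.append(f"{label}: missing key features")

    return problems
```

Calling `check_acceptance(entries)` on the compiled list and getting back an empty list would indicate that both criteria hold.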

TankEngine1234 commented 1 month ago

Proprietary Models:

  1. GPT-4 (OpenAI)

    • Capabilities: Strong in text generation, summarization, coding, translation, and multi-turn dialogue. Highly versatile across domains and known for its reasoning and instruction-following capabilities.
    • Use Cases: Integrated into various applications like chatbots, coding assistants, and content generation tools.
    • Access: Available via OpenAI's API or platforms like ChatGPT.
  2. Gemini 1.5 (Google DeepMind)

    • Capabilities: Natively multimodal, accepting text, images, audio, and video, with a long context window and strong reasoning and coding abilities. Competitive with GPT-4 in instruction following and question answering.
    • Use Cases: Ideal for developing AI systems that handle both text and visual inputs.
    • Access: Available through the Gemini API (Google AI Studio) and Vertex AI on Google Cloud.
  3. Claude 3 (Anthropic)

    • Capabilities: Safety-focused, designed to reduce harmful or biased outputs. It excels at understanding and following instructions while keeping its answers within safe bounds.
    • Use Cases: Useful in environments needing highly controlled outputs, such as educational tools or customer service.
    • Access: Available through Anthropic's API and the claude.ai app.
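  4. Gemma 2 (Google)

    • Capabilities: Lightweight open-weights models with strong performance in text generation and summarization. Released together with responsible-use guidance and a prohibited use policy intended to limit misuse.
    • Use Cases: AI tools for businesses and developers.
    • Access: Weights are downloadable from Hugging Face and Kaggle, and the models are also served through Vertex AI on Google Cloud.
    • License: Open weights under the Gemma Terms of Use, so this entry fits better with the open models than with the proprietary ones.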

Open-Source Models:

  1. LLaMA 3 (Meta)

    • Capabilities: Excellent for creative tasks, coding, logical reasoning, and multilingual performance. It can handle summarization and follow detailed instructions.
    • Use Cases: Great for developers needing customizable models for research and application building.
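    • Access: Available on Hugging Face and from Meta, in 8B and 70B sizes with base and instruction-tuned variants.
    • License: Meta Llama 3 Community License. Commercial use is permitted, with restrictions (for example, services with more than 700 million monthly active users need a separate license from Meta).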
  2. BLOOM (BigScience/Hugging Face)

    • Capabilities: Strong in multilingual tasks (trained on 46 natural languages and 13 programming languages), text generation, summarization, translation, and code generation. It's known for its open-access nature and responsible AI development.
    • Use Cases: Open research, multilingual content generation, and academic projects.
    • Access: Open-access weights, available for download and fine-tuning on Hugging Face under the BigScience RAIL license, which includes use restrictions.
  3. MPT-7B (MosaicML)

    • Capabilities: A base model built with commercial use in mind, with fine-tuned variants: MPT-7B-Instruct (for instruction following), MPT-7B-Chat (for dialogue), and MPT-7B-StoryWriter (for long-form writing with a very long context).
    • Use Cases: Businesses needing a customizable model for commercial uses.
    • Access: Open-source on Hugging Face. The base model is Apache 2.0 licensed and allows commercial use; licensing differs per variant, and MPT-7B-Chat is released for non-commercial use only.
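
As a quick check against the issue's acceptance criteria, the entries in this comment can be tallied. This is only a sketch: the `LISTED` table is copied by hand from the list above, and `shortfall` is a hypothetical helper, not part of the project.

```python
# (model, provider) pairs copied by hand from the comment above.
LISTED = [
    ("GPT-4", "OpenAI"),
    ("Gemini 1.5", "Google DeepMind"),
    ("Claude 3", "Anthropic"),
    ("Gemma 2", "Google"),
    ("LLaMA 3", "Meta"),
    ("BLOOM", "BigScience/Hugging Face"),
    ("MPT-7B", "MosaicML"),
]

MIN_MODELS = 10  # threshold from the issue's acceptance criteria


def shortfall(listed, minimum=MIN_MODELS):
    """Return how many more distinct models are needed to reach the minimum."""
    distinct_names = {name for name, _provider in listed}
    return max(0, minimum - len(distinct_names))


if __name__ == "__main__":
    print(f"{len(LISTED)} listed, {shortfall(LISTED)} more needed")
    # prints "7 listed, 3 more needed"
```

With the seven models listed so far, three more entries would be needed to reach the minimum of 10.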