Closed VeriHeppi closed 1 month ago
Asking ChatGPT with the following prompt:
"I am creating an LLM-based text classifier. The purpose of this project is to input text-based data and have the program analyse and classify themes and other things from the data. The program will run locally and will likely be created using Python, and will also use Docker or Podman. What LLM should I use for this task? My choices are Mistral, Llama, GPT@Azure, Gemini and CSC Lumi. Consider the upsides and downsides of each LLM and suggest one or more for my use case."
The answer:
For your LLM-based text classifier project, here’s a breakdown of the five LLMs you’re considering, focusing on their key strengths and weaknesses to help guide your decision. Your goal is to classify themes and analyze text data locally, likely using Python with Docker/Podman, so compatibility, ease of integration, performance, and licensing are essential considerations.
For each model, the answer listed downsides and summarized what it is best for:

Mistral: Best if you want an open-source model that is efficient and can be run locally with fewer resources.
LLaMA 2: Best if you need a widely used, open-source, and well-documented model with strong NLP capabilities and flexibility for local deployment.
GPT@Azure: Best if you need cutting-edge performance and are okay with using Azure's cloud infrastructure (but this doesn't align with your local-first approach).
Gemini: Best if you need multilingual support and are open to using Google Cloud (but again, this doesn't align with your local deployment needs).
CSC Lumi: Best for large-scale projects needing HPC resources, but may not be a great fit for your local-first and Docker-based setup.
For your use case of a locally-deployed text classifier, focusing on ease of integration with Python, Docker/Podman, and a preference for local execution, LLaMA 2 and Mistral are the top choices:
LLaMA 2: Offers the best balance between performance, community support, and local deployment. It has strong text classification capabilities, with a wide range of tools and libraries that will make integration easier. LLaMA 2 is open-source but may have some licensing restrictions on commercial use.
Mistral: A great lightweight, open-source option that could be more resource-efficient if your hardware is limited. Mistral is highly suitable if you want a more agile model that can still perform well for NLP tasks like classification, while running locally with minimal overhead.
Between LLaMA 2 and Mistral, LLaMA 2 is likely to give you stronger results for text classification if you have the hardware to support it, while Mistral is better for a resource-efficient, open-source, local solution.
Asking Gemini with the same prompt:
Gemini structured its answer around the stated requirements, compared the same five models (Mistral, Llama, GPT@Azure, Gemini, CSC Lumi), and then gave its recommendations:
Given your requirements for local deployment, Python integration, and containerization, Mistral or Llama would be excellent choices. Both are open-source, well-supported, and can be fine-tuned for your specific text classification task.
It listed some additional considerations and concluded: by carefully considering these factors, you can select the LLM that best suits your text classification project and provides the desired performance and flexibility.
When asking Mistral:
Choosing the right Large Language Model (LLM) for your text classifier project depends on several factors, including performance, ease of use, cost, and specific requirements. Let's evaluate the options you've mentioned:
Mistral's answer then went through upsides and downsides for each of the five models in turn.
Its recommendations fell into three categories: for open source and customization, for ease of use and performance, and for specialized needs.
Given your use case of running the program locally (with cloud solutions possible) and using Python along with Docker or Podman, Mistral or Llama would be excellent choices if you have the technical expertise to set them up and fine-tune them. They offer the flexibility and customization you need without direct costs.
If you prefer a more managed solution with high performance and are willing to pay for it, GPT@Azure is a strong option. It simplifies deployment and scaling, and you can leverage Microsoft's infrastructure and support.
Ultimately, the choice depends on your specific requirements, budget, and technical capabilities.
Preliminary research seems to point to Mistral; next up is more research on how to use it.
ChatGPT on how to set up Mistral using Python & Podman:
To create an LLM (Large Language Model) program using Python, Podman, and the Mistral LLM on Windows, you'll need to follow a series of steps to set up the environment. Here's a detailed guide on what you need and how to set it up:
Ensure that Python is installed on your Windows machine. You can download the latest version of Python from the official website: Python.org. During installation, make sure to select the option to Add Python to PATH.
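For instance, you might verify the interpreter and set up an isolated environment for the project first (a common-practice sketch; the environment name .venv is arbitrary):

# Verify Python is on PATH, then create and activate a virtual environment
python --version
python -m venv .venv
.venv\Scripts\activate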
Podman is a containerization platform similar to Docker but designed to be daemonless. Here's how to install Podman on Windows:
Download Podman: Visit the official Podman Windows installation page and follow the steps to install Podman for Windows.
WSL2 Installation: Podman on Windows requires Windows Subsystem for Linux 2 (WSL2). If you don’t have WSL2 installed, follow these steps:
wsl --install
Set Up WSL2 for Podman:
wsl --set-version <your-distro> 2
Replace <your-distro> with the name of your installed WSL Linux distribution.
Install Podman Desktop: Download and install Podman Desktop from Podman Desktop releases to manage containers graphically and use Podman more effectively.
Run Podman: Open Podman Desktop or use the command line inside the WSL terminal to run Podman commands.
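On Windows, Podman runs containers inside a managed VM, so a quick end-to-end check might look like this (standard Podman commands; the hello image is Podman's own test image on quay.io):

podman machine init     # first run only: create the WSL2-backed Podman VM
podman machine start    # start the VM
podman run --rm quay.io/podman/hello   # pull and run a test container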
Once Podman is set up, you can create and run a container with the Mistral LLM model inside it.
Search for Mistral LLM Image: First, search for the Mistral LLM container image, or create your own Dockerfile if no prebuilt image is available.
podman search mistral
Pull the Mistral Image or Build from Dockerfile:
podman pull mistral/mistral-llm:latest
podman build -t mistral-llm .
Run the Container:
podman run -it --rm mistral/mistral-llm:latest
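Note that the mistral/mistral-llm image name above is illustrative and may not exist on any public registry. One commonly used alternative (my assumption, not part of the quoted guide) is to serve Mistral 7B through the published Ollama container image:

# Serve Mistral 7B via Ollama (image: docker.io/ollama/ollama, default port 11434)
podman run -d --name ollama -p 11434:11434 docker.io/ollama/ollama
# Download the Mistral 7B weights inside the running container
podman exec ollama ollama pull mistral
# Smoke test against Ollama's REST API
curl http://localhost:11434/api/generate -d '{"model": "mistral", "prompt": "Hello"}'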
Now that Mistral LLM is running inside a Podman container, you'll need to set up Python on Windows to interact with the model.
Install requests or another HTTP library (if you will interact with the LLM via an API):
pip install requests
Set Up a Python Script to Connect with Mistral:
If Mistral LLM exposes an API, you can write a Python script to send requests to the Mistral instance running inside the Podman container. Here's an example:
import requests

# Replace with the correct URL for the running Mistral LLM API
url = "http://localhost:8000/api/v1/generate"
headers = {"Content-Type": "application/json"}
data = {
    "prompt": "What is the capital of France?",
    "max_tokens": 50
}

response = requests.post(url, json=data, headers=headers)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
Exposing Ports from the Podman Container: Ensure that the Mistral LLM container exposes the necessary ports so that Python on Windows can communicate with it. For example:
podman run -it -p 8000:8000 mistral/mistral-llm:latest
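You can confirm which host ports a running container actually publishes with Podman itself:

podman ps                      # list running containers and their port mappings
podman port <container-name>   # show host:container port bindings for one container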
Once everything is set up, you can automate it: write a batch or PowerShell script to launch the container and your Python script in sequence.
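A minimal PowerShell sketch of that automation might look like this (the container and script names are assumptions):

# run.ps1: start the container, give the server time to load, then run the client
podman start mistral-llm        # hypothetical container name
Start-Sleep -Seconds 10         # crude wait; polling the API endpoint would be more robust
python .\classifier.py         # hypothetical Python client script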
With this setup, you'll have a containerized LLM (Mistral) running in Podman on your Windows machine, and you'll use Python (with requests) to interact with it via an API or other interface.
I also had the same kind of discussion with ChatGPT as Vertti and came to the same conclusion that Mistral & Llama are the best options.
Since Llama and Gemini are developed by Meta and Google, it is known how those companies make money, so I leaned towards Mistral myself. Here is some of the conversation with ChatGPT on the subject.
"Mistral AI models, being open-weight and designed for use in local or private settings, do not collect user data by default. The responsibility for data privacy lies with the developer or organization using the model. Unlike larger companies like Meta or Google, Mistral AI does not operate a centralized advertising or data-sharing ecosystem, and there’s no indication that it collects or sells user data for marketing purposes."
My only question is: is Mistral free to use? While researching the topic, I came across information that it would no longer be free after August 2024.
While researching LLM integration I came across several approaches and wondered whether, for example, a modular approach or an ensemble learning technique would be a good fit for this project.
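As a rough sketch of the ensemble idea (the endpoint URLs and the response field name are assumptions, modeled on the API example above): each locally served model labels the text independently, and a majority vote picks the final class.

from collections import Counter
import requests

# Hypothetical endpoints for locally served models (ports/paths are assumptions)
MODEL_URLS = [
    "http://localhost:8000/api/v1/generate",  # e.g. Mistral
    "http://localhost:8001/api/v1/generate",  # e.g. Llama
]

def classify_with(url: str, text: str) -> str:
    """Ask one model for a one-word theme label for the text."""
    data = {"prompt": f"Classify the main theme of this text in one word: {text}",
            "max_tokens": 10}
    response = requests.post(url, json=data, timeout=60)
    response.raise_for_status()
    # The response field name depends on the serving stack; "text" is an assumption
    return response.json()["text"].strip().lower()

def ensemble_classify(text: str) -> str:
    """Majority vote across all models; ties resolve to the first label counted."""
    votes = [classify_with(url, text) for url in MODEL_URLS]
    return Counter(votes).most_common(1)[0][0]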
Mistral's API is not free. The price is $2 per 1 million input tokens and $6 per 1 million output tokens for the best model, and $0.15 per 1 million tokens for the new Pixtral 12B version.
Llama 3.1 price varies (https://www.llama.com/)
Running some models locally is possible but requires large amounts of GPU VRAM (https://ollama.com/library).
Results:
Client opinion:
Research into possible LLM models: Mistral, Llama (Meta), GPT@Azure, Gemini, CSC LUMI, etc.
Possible to combine multiple LLM models?