transducens / demint

Repository for the project "DeMINT: Automated Language Debriefing for English Learners via AI Chatbot Analysis of Meeting Transcripts"
Apache License 2.0

Add an LLM that explains the errors in a sentence. #5

Closed levnikolaevich closed 4 months ago

levnikolaevich commented 5 months ago

Input data: /Sentence/

Response: The correct sentence would be as follows: /Sentence/

Errors: 1. 2. 3.

Optionally, return the response as JSON (as LangTools does).

==============

levnikolaevich commented 5 months ago

DSPy in 8 Steps

1. Define your task.

LLM candidates: ["microsoft/Phi-3-mini-4k-instruct", "google/gemma-1.1-2b-it", "google/gemma-1.1-7b-it"]

We aim to use a language model for correcting and explaining errors in sentences written in English. The desired response format is JSON.

Input examples: "She don't know what to do next." / "The team is needing a new coach for the next season."

Response examples:

 {
    "original_sentence": "She don't know what to do next.",
    "llm_corrected_sentence": "She doesn't know what to do next.",
    "llm_error_explanation": "The original sentence used 'don't' with a singular subject, which is incorrect. The correct verb form for the third person singular is 'doesn't'."
  }
  {
    "original_sentence": "The team is needing a new coach for the next season.",
    "llm_corrected_sentence": "The team needs a new coach for the next season.",
    "llm_error_explanation": "The original sentence used 'is needing', which is not standard because 'need' typically does not use the continuous form. 'Needs' is the correct form here."
  }
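Before storing a response in this shape, it can be validated with a small standard-library check (a sketch on my side; the three field names follow the examples above, and `parse_correction` is a hypothetical helper name):

```python
import json

# The three fields every response object must carry (sketch).
REQUIRED_FIELDS = {
    "original_sentence",
    "llm_corrected_sentence",
    "llm_error_explanation",
}

def parse_correction(raw: str) -> dict:
    """Parse the model output and check that the expected fields exist."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data

# Simulate a raw LLM response matching the format above.
raw = json.dumps({
    "original_sentence": "She don't know what to do next.",
    "llm_corrected_sentence": "She doesn't know what to do next.",
    "llm_error_explanation": "The third person singular takes 'doesn't'.",
})
correction = parse_correction(raw)
```

A malformed response (missing a field or not valid JSON) then fails loudly instead of silently producing an incomplete debriefing entry.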

One of the requirements for the task is to use language models with open weights and a small number of parameters (up to 8 billion) so that they can run on less demanding hardware. Since error analysis can run in the background, the response time for each error can be limited to 1-5 minutes.

2. Define your pipeline. At the first stage, we plan to use a simple DSPy program built on the dspy.ChainOfThought module.

================= we are here =================

Then write your (initial) DSPy program. Again: start simple, and let the next few steps guide any complexity you will add.

  1. Explore a few examples.
  2. Define your data.
  3. Define your metric.
  4. Collect preliminary "zero-shot" evaluations.
  5. Compile with a DSPy optimizer.
  6. Iterate.
levnikolaevich commented 5 months ago

I've created a DSPy example. So far, I am unable to correctly distribute the LLM's response across the various fields in accordance with the specified signature.

So I have raised several questions with the developers.

levnikolaevich commented 5 months ago
  1. Explore a few examples.
  2. Define your data.
  3. Define your metric.
  4. Collect preliminary "zero-shot" evaluations.
  5. Compile with a DSPy optimizer.

================= we are here =================

  6. Iterate.
levnikolaevich commented 5 months ago

The developer's answer

Instead of HFModel we should use HFModelTGI or VLLM https://github.com/stanfordnlp/dspy/issues/823

levnikolaevich commented 5 months ago

TGI client

  1. git clone https://github.com/huggingface/text-generation-inference.git && cd text-generation-inference
  2. docker run --gpus all --shm-size 1g -p 8083:80 -v D:/Development/UNIVERSIDAD/BECA/tgi:/data -e HUGGING_FACE_HUB_TOKEN=<token> ghcr.io/huggingface/text-generation-inference:0.9 --model-id meta-llama/Meta-Llama-3-8B-Instruct --num-shard 1

Get a token at https://huggingface.co/settings/tokens
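Once the container is up, the server can be exercised with a plain HTTP request before wiring it into DSPy (a sketch using only the standard library; the `/generate` route and `inputs`/`parameters` payload shape follow TGI's documented API, and port 8083 matches the docker command above):

```python
import json
import urllib.request

# Build a request against TGI's /generate endpoint (sketch).
payload = {
    "inputs": "Correct this sentence: She don't know what to do next.",
    "parameters": {"max_new_tokens": 200},
}
request = urllib.request.Request(
    "http://localhost:8083/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the TGI container is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["generated_text"])
```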