huggingface / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0
11.78k stars 738 forks

[Question] Output always equal to Input in text-generation #206

Closed AndreEneva closed 1 year ago

AndreEneva commented 1 year ago

I tried different types of input and the output always equals the input... What am I missing?

const answerer = await pipeline('text-generation', 'Xenova/LaMini-Cerebras-590M');

let zica = await answerer(`Based on this history:
André de Mattos Ferraz is an engineering manager in Rio de Janeiro, Brazil. He has worked in systems development in the oil sector, working in several areas of the oil/gas life cycle: Exploration, Reservoir, and Production. He also worked on data science projects for predicting failures of water injection pumps, forecasting water filter saturation (SRU), and analyzing vibrations.

What are André tech skills?`);
console.log(zica)

(screenshot: the generated output is identical to the input prompt)

xenova commented 1 year ago

The original model can be found here: https://huggingface.co/MBZUAI/LaMini-Cerebras-590M. According to its README, it has been instruction-finetuned and requires specific prompting. Here's the Python version:

# pip install -q transformers
from transformers import pipeline

checkpoint = "MBZUAI/LaMini-Cerebras-590M"

model = pipeline('text-generation', model=checkpoint)

instruction = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'

input_prompt = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

generated_text = model(input_prompt, max_length=512, do_sample=True)[0]['generated_text']

print("Response", generated_text)

I believe the reason generation ends is that, with your current prompt, the model immediately predicts the <|endoftext|> token. It might also be necessary to specify the max_length parameter. Let me check.

mfandre commented 1 year ago

@xenova but how can I pass those extra parameters (max_length)? I also tried using the ### Instruction format and the result was the same.

xenova commented 1 year ago

You can do so as follows:

await answerer(input_prompt, {
    max_length: 512,
});

^^ I believe this is what you need :)...


e.g.,

const answerer = await pipeline('text-generation', 'Xenova/LaMini-Cerebras-590M');

const instruction = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'

const input_prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n${instruction}\n\n### Response: `
let zica = await answerer(input_prompt, {
    max_length: 512,
});
console.log(zica)

outputs

[
  {
    generated_text: 'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n' +
      '\n' +
      '### Instruction:\n' +
      'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n' +
      '"Barcelona, Spain"\n' +
      '\n' +
      '### Response: \n' +
      '\n' +
      'I think Barcelona deserves to be visited because of its beautiful architecture, rich history, and diverse culture. The city is known for its beautiful beaches, vibrant nightlife, and delicious food. The city also has a rich history and is home to many famous landmarks, such as the Sagrada Familia and the Sagrada Familia. Additionally, Barcelona is known for its art and architecture, which is a 
testament to its artistic and cultural heritage.'
  }
]
mfandre commented 1 year ago

Worked! You are super, @xenova!

xenova commented 1 year ago

Great! If you need it, here is the full list of generation parameters: https://huggingface.co/docs/transformers.js/main/en/api/utils/generation#new-generationconfigkwargs

Some of them are not yet implemented (e.g., max_time), but the popular ones (like max_new_tokens or do_sample) should work! If one you want to use is missing, feel free to open a feature request.
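For example, the instruction template from earlier in this thread can be factored into a small helper and combined with a couple of those options. This is just a sketch: `buildPrompt` is a hypothetical helper name, not part of the transformers.js API.

```javascript
// Hypothetical helper (not part of transformers.js): wraps a raw
// instruction in the template the LaMini models were finetuned on.
function buildPrompt(instruction) {
  return 'Below is an instruction that describes a task. ' +
    'Write a response that appropriately completes the request.\n\n' +
    `### Instruction:\n${instruction}\n\n### Response:`;
}

// The returned string is then passed to the pipeline together with
// generation options, e.g.:
//   const answerer = await pipeline('text-generation', 'Xenova/LaMini-Cerebras-590M');
//   const out = await answerer(buildPrompt('List three uses of Node.js.'), {
//     max_new_tokens: 256, // cap on newly generated tokens (excludes the prompt)
//     do_sample: true,     // sample instead of greedy decoding
//   });
```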