from optimum.amd.ryzenai import RyzenAIModelForCausalLM
from transformers import AutoTokenizer
from tests.ryzenai.testing_utils import DEFAULT_VAIP_CONFIG_TRANSFORMERS
# Example: greedy text generation with a Brevitas-quantized OPT/LLaMA model on RyzenAI.

# Path to an OPT/LLaMA model quantized using Brevitas — fill in before running.
# (The original line assigned nothing and was a syntax error.)
model_path = "path/to/brevitas-quantized-opt-or-llama"
vaip_config = DEFAULT_VAIP_CONFIG_TRANSFORMERS

model = RyzenAIModelForCausalLM.from_pretrained(model_path, vaip_config=vaip_config)
tokenizer = AutoTokenizer.from_pretrained(model_path)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt")

# generate() returns token ids, not text — decode before printing.
# do_sample=False makes the output deterministic (greedy decoding).
generated_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(generated_text)
As per the title: add support for running Brevitas-quantized OPT/LLaMA causal-LM models with RyzenAI.
Example usage (see the snippet above):