huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

(new feature) High level API to extract metadata of supported models and tasks #188

[Open] samir-souza opened this issue 10 months ago

samir-souza commented 10 months ago

I created a workshop/tutorial for Amazon SageMaker + HF Optimum Neuron, and in the notebooks I extract some metadata from Optimum Neuron to plot tables of models that are compatible with training and inference. However, to reach that information you have to dig through a number of private constants and internal moving parts, which makes the process complex and fragile: if the internal API changes, the code breaks. Here you can find the workshop: https://github.com/aws-samples/ml-specialized-hardware

It has three notebooks: data preparation, fine-tuning, and deployment. In the fine-tuning notebook I run the following code (against optimum-neuron 0.0.8) to get the metadata:

import re
import pandas as pd
from IPython.display import Markdown
from optimum.exporters.tasks import TasksManager
from optimum.exporters.neuron.model_configs import *
from optimum.neuron.distributed.parallelizers_manager import ParallelizersManager
from optimum.neuron.utils.training_utils import (
    _SUPPORTED_MODEL_NAMES,
    _SUPPORTED_MODEL_TYPES,
    _generate_supported_model_class_names
)

# retrieve supported models for Tensor Parallelism
tp_support = list(ParallelizersManager._MODEL_TYPE_TO_PARALLEL_MODEL_CLASS.keys())

# build the compatibility table for training
data_training = {'Model': []}
for m in _SUPPORTED_MODEL_TYPES:
    if not isinstance(m, str): m = m[0]  # some entries are (model_type, ...) tuples
    if m == 'gpt-2': m = 'gpt2'  # normalize the model type name
    model_id = len(data_training['Model'])
    model_link = f'<a target="_new" href="https://huggingface.co/models?sort=trending&search={m}">{m}</a>'
    # flag models that also support Tensor Parallelism
    data_training['Model'].append(f"{model_link} <font style='color: red;'><b>[TP]</b></font>" if m in tp_support else model_link)
    # derive task names from the auto-class names, e.g. BertForSequenceClassification -> SequenceClassification
    tasks = [re.sub(r'.+For(.+)', r'\1', t) for t in set(_generate_supported_model_class_names(m)) if not t.endswith('Model')]
    for t in tasks:
        if data_training.get(t) is None: data_training[t] = [''] * len(_SUPPORTED_MODEL_TYPES)
        # note: m.title() only approximates the Transformers class prefix (e.g. 'gpt2' -> 'Gpt2', not 'GPT2')
        data_training[t][model_id] = f'<a target="_new" href="https://huggingface.co/docs/transformers/model_doc/{m}#transformers.{m.title()}For{t}">api</a>'
df_training = pd.DataFrame.from_dict(data_training).set_index('Model')
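The Markdown import above is there to render the table inline in the notebook; a plausible final step (my assumption, not necessarily the notebook's exact code) is:

# render the compatibility table inline (assumed display step; to_markdown needs tabulate)
Markdown(df_training.to_markdown())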

I also run something similar in the deployment/inference notebook.
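For reference, here is a minimal sketch of what that inference-side lookup can look like, assuming the wildcard import of optimum.exporters.neuron.model_configs registers the Neuron export configs with TasksManager under the "neuron" exporter key (my reading of the 0.0.8 internals, not the exact notebook code):

# list the tasks each model type supports for Neuron (inference) export
from optimum.exporters.tasks import TasksManager
from optimum.exporters.neuron.model_configs import *  # registers the "neuron" exporter configs

data_inference = {}
for model_type, exporters in TasksManager._SUPPORTED_MODEL_TYPE.items():
    if "neuron" in exporters:
        data_inference[model_type] = sorted(exporters["neuron"].keys())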

You can see the results here: https://github.com/aws-samples/ml-specialized-hardware/blob/main/purpose-built-accelerators/docs/optimum_neuron_models.md

It would be good if you created a high-level API to do this job, something like the sketch below.
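(The function names, module path, and return format here are only a suggestion, not an existing API.)

# hypothetical high-level API -- names and module path are illustrative only
from optimum.neuron.utils import (
    get_supported_models_for_training,
    get_supported_models_for_inference,
)

# each entry could carry the model type, its supported tasks,
# and a TP flag for Tensor Parallelism support, e.g.:
# {"model_type": "bert", "tasks": ["SequenceClassification", ...], "TP": True}
training_models = get_supported_models_for_training()
inference_models = get_supported_models_for_inference()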

In both methods, the model metadata should include TP=True/False to indicate whether the model supports Tensor Parallelism. What do you think?

HuggingFaceDocBuilderDev commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
