huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

Optimum Neuron inference tests fail with NotImplementedError #571

Open musunita opened 2 months ago

musunita commented 2 months ago

System Info

Hugging Face Neuron Deep Learning AMI (Ubuntu 22.04) [AMI: ami-073e0687022c65b38]

Instance type: Inf2.48xlarge

Pre-installed packages:

(aws_neuron_venv_pytorch) ubuntu@ip-172-31-9-88:~$ pip list | grep "neuron"
aws-neuronx-runtime-discovery 2.9
libneuronxla                  0.5.971
neuronx-cc                    2.13.66.0+6dfecc895
neuronx-distributed           0.7.0
neuronx-hwm                   2.12.0.0+422c9037c
optimum-neuron                0.0.21
tensorboard-plugin-neuronx    2.6.7.0
torch-neuronx                 1.13.1.1.14.0
torch-xla                     1.13.1+torchneurone
transformers-neuronx          0.10.0.21
Using optimum-neuron version 0.0.21.
(aws_neuron_venv_pytorch) ubuntu@ip-172-31-9-88:~/optimum-neuron$ git branch
* (HEAD detached at v0.0.21)

Who can help?

@dacorvo, @JingyaHuang

Full inference test details are enclosed in the attached PDFs: tests in optimum-neuron.pdf and Tests in Optimum Neuron (3) (4).pdf.

While running the inference tests on inf2.48xlarge, I encounter the following errors.

(aws_neuron_venv_pytorch) ubuntu@ip-172-31-9-88:~/optimum-neuron$ pytest -m is_inferentia_test tests
=================================================================================== test session starts ===================================================================================
platform linux -- Python 3.8.10, pytest-8.0.0, pluggy-1.4.0
rootdir: /home/ubuntu/optimum-neuron
configfile: pyproject.toml
plugins: anyio-4.2.0
collected 1134 items / 1 error / 665 deselected / 469 selected                                                                                                                            

ERRORS ==========================================================================================
_________________________________________________________________________ ERROR collecting tests/test_examples.py _________________________________________________________________________
tests/test_examples.py:514: in <module>
    class CausalLMExampleTester(ExampleTesterBase, metaclass=ExampleTestMeta, example_name="run_clm"):
tests/test_examples.py:288: in __new__
    pp_support = ParallelizersManager.parallelizer_for_model(model_type).supports_pipeline_parallelism()
optimum/neuron/distributed/parallelizers_manager.py:110: in parallelizer_for_model
    raise NotImplementedError(
E   NotImplementedError: gpt2 is not supported for parallelization, supported models: bert, roberta, gpt_neo, gpt_neox, llama, mistral, t5
==================================================================================== warnings summary =====================================================================================
../../../opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/env.py:19
  /opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/env.py:19: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    from pkg_resources import get_distribution

../../../opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846
../../../opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846
  /opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846
  /opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../../opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846
  /opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/pkg_resources/__init__.py:2846: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================= short test summary info =================================================================================
ERROR tests/test_examples.py - NotImplementedError: gpt2 is not supported for parallelization, supported models: bert, roberta, gpt_neo, gpt_neox, llama, mistral, t5
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================================================== 665 deselected, 5 warnings, 1 error in 3.41s ======================================================================
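For reference, the collection failure can be isolated outside pytest. A minimal sketch, assuming the v0.0.21 module path shown in the traceback above:

# Reproduce only the collection error, without running the test suite
python -c "from optimum.neuron.distributed.parallelizers_manager import ParallelizersManager; ParallelizersManager.parallelizer_for_model('gpt2')"
# Expected to raise: NotImplementedError: gpt2 is not supported for parallelization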

Information

Tasks

Reproduction (minimal, reproducible, runnable)

Steps: run the inference test suite with pytest -m is_inferentia_test tests, as sketched below.
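A minimal reproduction sketch, assuming the pre-installed venv and checkout locations shown in the system info above:

# Activate the Neuron venv shipped with the Hugging Face Neuron DLAMI
source /opt/aws_neuron_venv_pytorch/bin/activate
# Use the optimum-neuron checkout matching the installed 0.0.21 release
cd ~/optimum-neuron
git checkout v0.0.21
# Run only the Inferentia-marked tests
pytest -m is_inferentia_test tests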

Expected behavior

Expected the inference test suite to pass.

dacorvo commented 2 months ago

cc @michaelbenayoun

musunita commented 2 months ago

Thanks for the fix. I tried it and no longer see the errors above, which is great. However, I now encounter a new error, shown below.

No input shapes provided, using default shapes, {'batch_size': 1, 'sequence_length': 128}
___ test_export_no_parameters[feature-extraction-hf-internal-testing/tiny-random-XLMModel] ___

self = <optimum.neuron.utils.hub_neuronx_cache.CompileCacheHfProxy object at 0x7f0d8009f850>, repo_id = 'optimum-internal-testing/optimum-neuron-cache-for-testing', default_cache = <libneuronxla.neuron_cc_cache.CompileCacheFs object at 0x7f0f393d7c70>, endpoint = None, token = None

def __init__(
    self, repo_id: str, default_cache: CompileCache, endpoint: Optional[str] = None, token: Optional[str] = None
):
    # Initialize the proxy cache as expected by the parent class
    super().__init__(default_cache.cache_url)
    self.cache_path = default_cache.cache_path
    # Initialize specific members
    self.default_cache = default_cache
    self.api = HfApi(endpoint=endpoint, token=token, library_name="optimum-neuron", library_version=__version__)
    # Check if the HF cache id is valid
    try:
        if not self.api.repo_exists(repo_id):
          raise ValueError(f"The {repo_id} repository does not exist or you don't have access to it.")

E ValueError: The optimum-internal-testing/optimum-neuron-cache-for-testing repository does not exist or you don't have access to it.

optimum/neuron/utils/hub_neuronx_cache.py:116: ValueError

michaelbenayoun commented 2 months ago

You need to set the CUSTOM_CACHE_REPO environment variable to a Hub repository to use as the cache repo, and set the HF_TOKEN environment variable to a token that has write access to that repository, as in the example below.
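For example, with placeholder values (the repository name and token below are placeholders; use a cache repo your token can write to):

# Both values are placeholders, not real credentials
export CUSTOM_CACHE_REPO=my-org/optimum-neuron-cache
export HF_TOKEN=hf_xxxxxxxxxxxxxxxx
pytest -m is_inferentia_test tests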

musunita commented 2 months ago

Thanks Michael. There are around 35 outstanding failures; I have attached the logs (outinf2.txt).

Could you please recommend next steps for these failures?