elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0
635 stars 98 forks source link

[ML] Better memory estimation for NLP models #568

Closed valeriy42 closed 10 months ago

valeriy42 commented 1 year ago

This PR adds an ability to estimate per deployment and per allocation memory usage of NLP transformer models. It uses torch.profiler and performs logs the peak memory usage during the inference.

This information is then used in Elasticsearch to provision models with sufficient memory (elastic/elasticsearch#98874).

valeriy42 commented 1 year ago

@qherreros , since I ported your code for memory estimation, would you mind looking at the relevant part in transformers.py?

valeriy42 commented 10 months ago

@davidkyle could you please look at the PR since you have the best overview of the transformers.py. 🙏

valeriy42 commented 10 months ago

Thank you for the review @davidkyle . I addressed your comments, it would be great if you could take another look.

davidkyle commented 10 months ago

@pquentin do you know why the read the docs build has started failing

I notice the build is using Python 3.12.0 which could have something to do with it

Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [33 lines of output]
      Traceback (most recent call last):
        File "/home/docs/checkouts/readthedocs.org/user_builds/eland/envs/568/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/docs/checkouts/readthedocs.org/user_builds/eland/envs/568/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/docs/checkouts/readthedocs.org/user_builds/eland/envs/568/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 112, in get_requires_for_build_wheel
          backend = _build_backend()
                    ^^^^^^^^^^^^^^^^
        File "/home/docs/checkouts/readthedocs.org/user_builds/eland/envs/568/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 77, in _build_backend
          obj = import_module(mod_path)
                ^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/docs/.asdf/installs/python/3.12.0/lib/python3.12/importlib/__init__.py", line 90, in import_module
          return _bootstrap._gcd_import(name[level:], package, level)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "<frozen importlib._bootstrap>", line 1381, in _gcd_import
        File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
        File "<frozen importlib._bootstrap>", line 1304, in _find_and_load_unlocked
        File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
        File "<frozen importlib._bootstrap>", line 1381, in _gcd_import
        File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
        File "<frozen importlib._bootstrap>", line 1325, in _find_and_load_unlocked
        File "<frozen importlib._bootstrap>", line 929, in _load_unlocked
        File "<frozen importlib._bootstrap_external>", line 994, in exec_module
        File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
        File "/tmp/pip-build-env-4u84dhia/overlay/lib/python3.12/site-packages/setuptools/__init__.py", line 16, in <module>
          import setuptools.version
        File "/tmp/pip-build-env-4u84dhia/overlay/lib/python3.12/site-packages/setuptools/version.py", line 1, in <module>
          import pkg_resources
        File "/tmp/pip-build-env-4u84dhia/overlay/lib/python3.12/site-packages/pkg_resources/__init__.py", line 2172, in <module>
          register_finder(pkgutil.ImpImporter, find_on_path)
                          ^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
pquentin commented 10 months ago

@pquentin do you know why the read the docs build has started failing

I notice the build is using Python 3.12.0 which could have something to do with it

Yes, we were using the latest Python version supported by Read the Docs and it recently became Python 3.12. It was failing to build numpy because we pin it to an older version that does not support Python 3.12. https://github.com/elastic/eland/pull/627 fixes this by asking Python 3.10 explicitly.

davidkyle commented 10 months ago

Thanks for fixing the docs build @pquentin