huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Cannot build documentation on Mac OS #32203

Closed jrhe closed 2 days ago

jrhe commented 1 month ago

System Info

Who can help?

@stevhliu - N.B. A fix has been found and a PR will be opened on doc-builder. Raising the issue here to document it in case anyone else runs into it in the meantime.

Information

Tasks

Reproduction

Run the steps to build the documentation, as described in https://github.com/huggingface/transformers/tree/main/docs, on macOS.


Initial build docs for transformers docs/source/en/ /var/folders/g7/h9hst8551g74rd1jsf7txvj40000gn/T/tmpqf9yjhon/transformers/main/en
Building the MDX files:  49%|████▌     | 209/430 [00:10<00:14, 15.64it/s]
/Users/jon/repos/github.com/huggingface/transformers/src/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
Building the MDX files:  58%|█████▍    | 248/430 [00:13<00:09, 18.72it/s]
Traceback (most recent call last):
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 197, in build_mdx_files
    content, new_anchors, source_files, errors = resolve_autodoc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 123, in resolve_autodoc
    doc = autodoc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/autodoc.py", line 490, in autodoc
    methods = find_documented_methods(obj)
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/autodoc.py", line 431, in find_documented_methods
    superclasses = clas.mro()[1:]
  File "/Users/jon/repos/github.com/huggingface/transformers/src/transformers/utils/import_utils.py", line 1526, in __getattribute__
    requires_backends(cls, cls._backends)
  File "/Users/jon/repos/github.com/huggingface/transformers/src/transformers/utils/import_utils.py", line 1514, in requires_backends
    raise ImportError("".join(failed))
ImportError:
TFBertTokenizer requires the tensorflow_text library but it was not found in your environment. You can install it with pip as
explained here: https://www.tensorflow.org/text/guide/tf_text_intro.
Please note that you may need to restart your runtime after installation.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jon/.pyenv/versions/transformers/bin/doc-builder", line 8, in <module>
    sys.exit(main())
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/commands/doc_builder_cli.py", line 47, in main
    args.func(args)
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/commands/preview.py", line 175, in preview_command
    source_files_mapping = build_doc(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 367, in build_doc
    anchors_mapping, source_files_mapping = build_mdx_files(
  File "/Users/jon/.pyenv/versions/3.8.19/envs/transformers/lib/python3.8/site-packages/doc_builder/build_doc.py", line 230, in build_mdx_files
    raise type(e)(f"There was an error when converting {file} to the MDX format.\n" + e.args[0]) from e
ImportError: There was an error when converting docs/source/en/model_doc/bert.md to the MDX format.

TFBertTokenizer requires the tensorflow_text library but it was not found in your environment. You can install it with pip as
explained here: https://www.tensorflow.org/text/guide/tf_text_intro.
Please note that you may need to restart your runtime after installation.

tensorflow_text is unavailable on macOS.

The error occurs because doc-builder's autodoc calls mro() on the dummy TFBertTokenizer from transformers.utils.dummy_tensorflow_text_objects. That call goes through __getattribute__ on DummyObject from transformers.utils.import_utils, which calls requires_backends, which in turn raises the ImportError.
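The mechanism described above can be reproduced in isolation. The following is a minimal sketch, not the actual transformers source: the names DummyObject and requires_backends mirror the ones in the traceback, but their bodies here are simplified assumptions, and TFBertTokenizerStub and the backend name hypothetical_missing_backend are made up for illustration.

```python
import importlib.util


def requires_backends(obj, backends):
    # Simplified stand-in: raise if any named backend cannot be imported.
    missing = [b for b in backends if importlib.util.find_spec(b) is None]
    if missing:
        raise ImportError(
            f"{obj.__name__} requires the {', '.join(missing)} library "
            "but it was not found in your environment."
        )


class DummyObject(type):
    # Metaclass for placeholder classes (simplified assumption of the real one).
    def __getattribute__(cls, key):
        # Names starting with an underscore are looked up normally.
        if key.startswith("_"):
            return super().__getattribute__(key)
        # Any public attribute, including mro, triggers the backend check.
        requires_backends(cls, cls._backends)


class TFBertTokenizerStub(metaclass=DummyObject):
    # Hypothetical backend name, chosen so that it is never installed.
    _backends = ["hypothetical_missing_backend"]


try:
    TFBertTokenizerStub.mro()  # public attribute lookup -> ImportError
except ImportError as err:
    print(f"mro() failed: {err}")

# The underscore-prefixed attribute is not intercepted in this sketch.
print(TFBertTokenizerStub.__mro__[1:])
```

In this sketch, reading `__mro__` succeeds where calling `mro()` fails, because only names without a leading underscore reach the backend check. Whether the real DummyObject behaves identically in every case is an assumption here.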

Expected behavior

Either: 1) Docs can be built, without auto-generated documentation for the platform-specific dependencies. 2) Building documentation is not supported on macOS, and this is documented in https://github.com/huggingface/transformers/blob/main/docs/README.md

Option 1 is probably preferable. Documentation of platform-specific dependencies will still be built by CI on GitHub Actions.
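One way option 1 could work is for the documentation tool to skip classes whose backend is missing instead of aborting the build. The sketch below only illustrates that idea: safe_superclasses, MissingBackendMeta, PlaceholderTokenizer, and RegularClass are all hypothetical names, and this is not doc-builder's code or necessarily the approach taken in any PR.

```python
class MissingBackendMeta(type):
    # Hypothetical stand-in for a placeholder metaclass:
    # any public attribute access fails with ImportError.
    def __getattribute__(cls, key):
        if key.startswith("_"):
            return super().__getattribute__(key)
        raise ImportError(f"{cls.__name__} requires a backend that is not installed.")


class PlaceholderTokenizer(metaclass=MissingBackendMeta):
    pass


class RegularClass(dict):
    pass


def safe_superclasses(clas):
    # Return the superclasses of clas, or an empty list when the class
    # is a placeholder whose backend is unavailable on this platform.
    try:
        return list(clas.mro()[1:])
    except ImportError:
        return []


print(safe_superclasses(RegularClass))
print(safe_superclasses(PlaceholderTokenizer))
```

Catching ImportError at this single call site keeps the rest of the build unchanged. The trade-off is that inherited methods of such placeholder classes would be missing from a local build, which matches the expectation that CI still builds the complete documentation.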

LysandreJik commented 1 month ago

@jrhe thanks for reporting the issue! The easiest way for you and other users to build documentation is likely to open a PR to the repo that changes the docs.

This will trigger a build and link you to the built docs. We unfortunately don't have the bandwidth to test this extensively on many different hardware setups, so we're happy for you to open PRs to test your doc changes.

jrhe commented 1 month ago

@LysandreJik Good to know. I think huggingface/doc-builder#515 should fix it anyway.

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.