NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html
Apache License 2.0

Deflect pip queries when querying .so location #855

Closed akoumpa closed 1 month ago

akoumpa commented 1 month ago

Description

As the title says, instead of querying pip to determine the .so locations, this first looks in directories relative to the __init__.py file and, if the .so files are found there, skips querying pip.

The motivation is to avoid querying pip if possible since it's slow (at least on a few machines I've tried).
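The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name find_shared_libs, its parameters, and the number of parent directories searched are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, List, Optional


def find_shared_libs(
    init_file: str,
    pattern: str = "*.so",
    max_levels_up: int = 2,
    slow_fallback: Optional[Callable[[], List[Path]]] = None,
) -> List[Path]:
    """Look for shared objects near `init_file` before using a slow fallback.

    Hypothetical helper: the name, signature, and search depth are
    illustrative assumptions, not Transformer Engine's actual logic.
    """
    start = Path(init_file).resolve().parent
    # Candidate directories: the package directory itself, then up to
    # `max_levels_up` of its parents, nearest first.
    candidates = [start, *list(start.parents)[:max_levels_up]]
    for directory in candidates:
        matches = sorted(directory.glob(pattern))
        if matches:
            # Found next to the Python sources: skip the slow query entirely.
            return matches
    # Nothing nearby: fall back (e.g. to a pip metadata query) if one was given.
    return slow_fallback() if slow_fallback is not None else []
```

A package would call this from its own __init__.py as find_shared_libs(__file__, slow_fallback=...), so the expensive query only runs when the local search comes up empty.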

akoumpa commented 1 month ago

@denera thanks for the review and the comment. I was not aware of PR #760; as a result, I think we can close this, assuming that #760 will be merged soon.

Before we close this, I have one question: if a user overrides the transformer_engine in use via PYTHONPATH, will there be any issue with locating the correct .so? One case I had in mind when making this PR is that I can change which transformer_engine is used (the Python code, via PYTHONPATH), but pip would still point to the .so from the globally installed TE, which would be incorrect. I know it's a corner case and probably not a common one, but I want to avoid someone spending time debugging something like this in the future.

Thanks again for the review!
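One way to spot the PYTHONPATH mismatch described above is to compare the directory Python would import a package from with the directory that pip's metadata points at. The sketch below uses only the standard library (importlib.util.find_spec and importlib.metadata.distribution). The helper name import_vs_install_location is hypothetical, and the module and distribution names are left as parameters because the exact distribution name for a given install is an assumption.

```python
import importlib.metadata
import importlib.util
from pathlib import Path
from typing import Optional, Tuple


def import_vs_install_location(
    module_name: str, dist_name: str
) -> Tuple[Optional[Path], Optional[Path]]:
    """Return (directory Python imports from, directory pip metadata points at).

    Hypothetical diagnostic helper; either element is None when it cannot
    be determined.
    """
    # Where the import system would load the package from (honours PYTHONPATH).
    spec = importlib.util.find_spec(module_name)
    import_dir = Path(spec.origin).resolve().parent if spec and spec.origin else None

    # Where the installed distribution says the package lives.
    try:
        dist = importlib.metadata.distribution(dist_name)
        install_dir = Path(dist.locate_file(module_name)).resolve()
    except importlib.metadata.PackageNotFoundError:
        install_dir = None
    return import_dir, install_dir
```

If both values are present and differ, the Python sources are coming from one place while a pip-based lookup would resolve shared objects from another, which is the situation the question is about.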

timmoon10 commented 1 month ago

With https://github.com/NVIDIA/TransformerEngine/pull/760, loading the PyTorch C++ extensions (e.g. transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so) will also load the libtransformer_engine.so it was built with. If you set PYTHONPATH to a local TE build with transformer_engine_extensions, it will also load a local libtransformer_engine.so. If you set PYTHONPATH without building (e.g. if you only want to change the Python code without recompiling C++), it will load the globally-installed transformer_engine_extensions, which will load the globally-installed libtransformer_engine.so.
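To confirm which copy of a shared object actually ended up loaded in a given process, one option on Linux is to read /proc/self/maps after importing the package. The following is a small sketch under that assumption (Linux only). The helper name loaded_shared_objects is hypothetical, and the maps_text parameter exists only so the parsing can be exercised without a live process.

```python
from pathlib import Path
from typing import List, Optional


def loaded_shared_objects(name_fragment: str, maps_text: Optional[str] = None) -> List[str]:
    """List mapped file paths in this process whose path contains `name_fragment`.

    Hypothetical helper; reads /proc/self/maps (Linux only) unless
    `maps_text` is supplied.
    """
    if maps_text is None:
        maps_text = Path("/proc/self/maps").read_text()
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, and optionally a pathname.
        fields = line.split(maxsplit=5)
        if len(fields) == 6 and name_fragment in fields[5]:
            paths.add(fields[5])
    return sorted(paths)
```

For example, calling loaded_shared_objects("libtransformer_engine") after importing the package would show whether the local or the globally-installed library was picked up.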

akoumpa commented 1 month ago

Thank you both for your responses. I think the issue I'm experiencing will be resolved by #760, and as a result I'm closing this one.