akoumpa commented 1 month ago

Description

As the title says, instead of querying pip to determine the .so locations, this first tries to look up the directories relative to the `__init__.py` file and, if the .so files are found there, skips querying pip.

The motivation is to avoid querying pip if possible since it's slow (at least on a few machines I've tried).
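A minimal sketch of the lookup order described above, not the code in this PR. The function name `find_shared_lib`, the candidate directories, and the `slow_lookup` callback (standing in for the pip query) are all hypothetical:

```python
import glob
import os
from typing import Callable, Optional


def find_shared_lib(
    package_init: str,
    lib_prefix: str,
    slow_lookup: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the path of a shared library, preferring a local directory scan."""
    package_dir = os.path.dirname(os.path.abspath(package_init))
    # Assumed candidate directories: the package directory and its parent.
    candidates = [package_dir, os.path.dirname(package_dir)]
    for directory in candidates:
        # glob.escape guards against special characters in the directory path.
        pattern = os.path.join(glob.escape(directory), f"{lib_prefix}*.so")
        matches = sorted(glob.glob(pattern))
        if matches:
            return matches[0]
    # Only fall back to the slow path (e.g. a pip query) if nothing was found.
    return slow_lookup()
```

With this ordering the slow callback is never invoked when a matching .so sits next to the package's `__init__.py`.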

Fixes # (issue)

Type of change

[ ] Documentation change (change only to the documentation, either a fix or a new content)
[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

Changes

Please list the changes introduced in this PR:

Change A
Change B

Checklist:

[ ] I have read and followed the contributing guidelines
[ ] The functionality is complete
[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[ ] My changes generate no new warnings
[ ] I have added tests that prove my fix is effective or that my feature works
[ ] New and existing unit tests pass locally with my changes

akoumpa commented 1 month ago

@denera thanks for the review and the comment. I was not aware of PR760; as a result, I think we can close this, assuming that 760 will be merged soon.

Before we close this, I have one question: if a user overrides the transformer_engine in use via PYTHONPATH, will there be any issue with locating the correct .so? One case I had in mind when making this PR is that I can change which transformer_engine is used (the Python code, via PYTHONPATH), but the pip lookup would still return the .so from the globally installed TE, which would be incorrect. I know it's a corner case and perhaps not the most common one, but I want to avoid someone spending time debugging something like this in the future.

Thanks again for the review!

timmoon10 commented 1 month ago

With https://github.com/NVIDIA/TransformerEngine/pull/760, loading the PyTorch C++ extensions (e.g. transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so) will also load the libtransformer_engine.so it was built with. If you set PYTHONPATH to a local TE build with transformer_engine_extensions, it will also load a local libtransformer_engine.so. If you set PYTHONPATH without building (e.g. if you only want to change the Python code without recompiling C++), it will load the globally-installed transformer_engine_extensions, which will load the globally-installed libtransformer_engine.so.
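One way to see which copy would be picked up under a given PYTHONPATH is to ask the import system where a module resolves, without importing it. This is only a sketch: the helper name `resolved_origin` is hypothetical, and it reports where Python would find the extension module, not which libtransformer_engine.so the dynamic loader then pulls in.

```python
import importlib.util
from typing import Optional


def resolved_origin(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Module name taken from the comment above; prints None if not installed.
    print(resolved_origin("transformer_engine_extensions"))
```

If the printed path points into the global site-packages rather than the local checkout, the globally installed extension is the one that would be loaded.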

akoumpa commented 1 month ago

Thank you both for your responses. I think the issue I'm experiencing will be resolved in #760, and as a result I'm closing this one.

NVIDIA / TransformerEngine

Deflect pip queries when querying .so location #855