In NeMo-Run we let users configure an experiment locally but execute it remotely through a wide variety of executors. For the best UX, mcore needs to be installable locally (even though we don't intend to train models locally). Currently a few issues cause a pip install to fail:
The C++ extension: I propose logging a warning when its build fails instead of failing the entire installation.
The jit_fuser can fail on newer Python installs. I propose making it a no-op when it fails. The current error is:
from megatron.core.transformer.utils import (
File "/Users/mromeijn/base/code/.venv/lib/python3.12/site-packages/megatron/core/transformer/utils.py", line 40, in <module>
@jit_fuser
^^^^^^^^^
File "/Users/mromeijn/base/code/.venv/lib/python3.12/site-packages/lightning_fabric/wrappers.py", line 411, in _capture
return compile_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mromeijn/base/code/.venv/lib/python3.12/site-packages/torch/__init__.py", line 1868, in compile
raise RuntimeError("Dynamo is not supported on Python 3.12+")
RuntimeError: Dynamo is not supported on Python 3.12+