NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

[BUG] Must have transformer_engine? #713

Open zhangsheng377 opened 4 months ago

zhangsheng377 commented 4 months ago

When I use Megatron without transformer_engine installed, the import fails with this error:

  File "/home/z00454081/Megatron-LM-main/megatron/__init__.py", line 16, in <module>
    from .initialize  import initialize_megatron
  File "/home/z00454081/Megatron-LM-main/megatron/initialize.py", line 18, in <module>
    from megatron.arguments import parse_args, validate_args
  File "/home/z00454081/Megatron-LM-main/megatron/arguments.py", line 16, in <module>
    from megatron.core.models.retro import RetroConfig
  File "/home/z00454081/Megatron-LM-main/megatron/core/models/retro/__init__.py", line 4, in <module>
    from .decoder_spec import get_retro_decoder_block_spec
  File "/home/z00454081/Megatron-LM-main/megatron/core/models/retro/decoder_spec.py", line 5, in <module>
    from megatron.core.models.gpt.gpt_layer_specs import (
  File "/home/z00454081/Megatron-LM-main/megatron/core/models/gpt/__init__.py", line 1, in <module>
    from .gpt_model import GPTModel
  File "/home/z00454081/Megatron-LM-main/megatron/core/models/gpt/gpt_model.py", line 17, in <module>
    from megatron.core.transformer.transformer_block import TransformerBlock
  File "/home/z00454081/Megatron-LM-main/megatron/core/transformer/transformer_block.py", line 16, in <module>
    from megatron.core.transformer.custom_layers.transformer_engine import (
  File "/home/z00454081/Megatron-LM-main/megatron/core/transformer/custom_layers/transformer_engine.py", line 7, in <module>
    import transformer_engine as te
ModuleNotFoundError: No module named 'transformer_engine'

I think this is a bug: I don't want to use transformer_engine, and when I remove the RetroConfig code (imported in megatron/arguments.py), I can use Megatron normally.
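
For illustration, here is a minimal sketch of how an optional dependency can be guarded so that a missing package does not break unrelated imports. This is not the actual Megatron-LM code or fix; the names `optional_import`, `HAVE_TE`, and `require_te` are hypothetical and only show the general pattern.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of aborting the importing module.
        return None


# Hypothetical guard: try the import once and record the result.
te = optional_import("transformer_engine")
HAVE_TE = te is not None


def require_te():
    """Return transformer_engine, failing only when it is actually needed."""
    if not HAVE_TE:
        raise RuntimeError(
            "transformer_engine is not installed, but this code path requires it"
        )
    return te
```

With a guard like this, the error would surface only in code paths that call `require_te()`, rather than at `import megatron` time.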

wheynelau commented 4 months ago

This also happens when trying to run compute_memory_usage.py.

github-actions[bot] commented 2 months ago

Marking as stale. No activity in 60 days.