microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

numpy not in requirements #211

Status: Closed (152334H closed this issue 1 year ago)

152334H commented 1 year ago

After following the instructions to build from source, I noticed that numpy is needed for the megablocks example:

$ python3 -m tutel.examples.helloworld --megablocks_size=1 --batch_size=1 --num_tokens=32 --top=1 --eval --num_local_experts=128 --capacity_factor=0
/mnt/localdisk/tutel/tutel/impls/jit_compiler.py:19: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  tutel_custom_kernel.update_sdk_home(torch.tensor([ord(x) for x in SDK_HOME] + [0], dtype=torch.int8, device='cpu'))
CRITICAL:root:Registering device global rank 0: data_rank = 0, model_rank = 0
[Statistics] param count for MoE local_experts = 1074266112, param count for MoE gate = 262144.

ExampleModel(
  (_moe_layer): MOELayer(
    Top-K(s) = ['k=1, noise=0.0'], Total-Experts = 128 [managed by 1 device(s)],
    (experts): FusedExpertsNetwork(model_dim=2048, hidden_size=2048, output_dim=2048, local_experts=128)
    (gates): ModuleList(
      (0): LinearTopKGate(
        (wg): Linear(in_features=2048, out_features=128, bias=False)
      )
    )
  )
)
Traceback (most recent call last):
  File "/home/ubuntu/micromamba/envs/tutel/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/micromamba/envs/tutel/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/localdisk/tutel/tutel/examples/helloworld.py", line 108, in <module>
    x = torch.tensor(torch.randn([batch_size, num_tokens, model_dim], dtype=torch.float32, device='cpu').detach().numpy(), dtype=torch.get_default_dtype(), requires_grad=False, device=device)
RuntimeError: Numpy is not available

However, numpy is not installed by setup.py because it is not listed in install_requires.

Recommendation: add numpy there, and maybe also add ninja as an optional dependency for building (I'm not sure how this could be done in practice).
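
One possible shape for that change, as a rough sketch rather than Tutel's actual setup.py: the package name and the "build" extra below are placeholders, and only numpy and ninja come from this report. The keyword arguments are kept in a plain dict so the fragment can be read and checked without running an install.

```python
# Hypothetical setup.py fragment. The package name and the "build" extra
# are illustrative assumptions, not copied from Tutel's real setup.py.
setup_kwargs = {
    "name": "example-package",  # placeholder name
    # Needed at runtime: the example crashed with "Numpy is not available".
    "install_requires": ["numpy"],
    # Optional group, only installed when explicitly requested.
    "extras_require": {
        "build": ["ninja"],
    },
}

# In a real setup.py the dict would be passed on like this:
#     from setuptools import setup
#     setup(**setup_kwargs)
```

With a layout like this, `pip install .` would pull in numpy, and `pip install ".[build]"` would additionally pull in ninja. Note that an extra is installed alongside the package, so it may not help the build step itself; declaring ninja as a build requirement in pyproject.toml might be a better fit, but I have not tried that here.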

152334H commented 1 year ago

Also deepspeed + mpi4py, for the deepspeed-related examples.

ghostplant commented 1 year ago

Thanks, I'll refine the setup.py. Does it work now if you manually install numpy? It is a little weird that you have torch installed but numpy is not there.

152334H commented 1 year ago

Yeah, I thought it would've come with torch too, but it did not (2.0.1+cu118).

It works fine with numpy (and the others) manually installed; I just wanted to make a public note for anyone else using it.
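
For anyone who wants to check their environment before running the examples, here is a minimal sketch using only the standard library. The helper name `missing_modules` is made up for illustration, and the module list is just the packages mentioned in this thread, not an official list of Tutel's requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages mentioned in this thread; deepspeed and mpi4py are only
    # needed for the deepspeed-related examples.
    wanted = ["numpy", "deepspeed", "mpi4py"]
    missing = missing_modules(wanted)
    if missing:
        print("Missing modules, try: pip install " + " ".join(missing))
    else:
        print("All listed modules are importable.")
```

This only checks that the modules can be located, not that they work with the installed torch build.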

ghostplant commented 1 year ago

Sure, you may have the same question about why torch doesn't install numpy as a dependency as well. The main branch now has the numpy dependency added. Thanks for the information!

152334H commented 1 year ago

Thanks!