NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

[BUG] ModuleNotFoundError: No module named 'megatron.training.tokenizer'; 'megatron.training' is not a package #763

Open hellangleZ opened 6 months ago

hellangleZ commented 6 months ago

Describe the bug

Strange issue

(/aml2/ds) root@A100:/aml2/Megatron-LM# from megatron.training.tokenizer import build_tokenizer
from: can't read /var/mail/megatron.training.tokenizer
(/aml2/ds) root@A100:/aml2/Megatron-LM# python tools/preprocess_data.py \
   --input /aml2/traindata/oscar-1GB.jsonl \
   --output-prefix /aml2/traindata \
   --tokenizer-type Llama2Tokenizer \
   --tokenizer-model /aml2/llama2/tokenizer.model \
   --workers 16 \
   --append-eod

[2024-04-02 08:03:42,280] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/aml2/Megatron-LM/tools/preprocess_data.py", line 23, in <module>
    from megatron.training.tokenizer import build_tokenizer
ModuleNotFoundError: No module named 'megatron.training.tokenizer'; 'megatron.training' is not a package
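The "'megatron.training' is not a package" part of the message generally means Python resolved `megatron.training` to a single module file instead of a directory with submodules. One possible cause, and this is an assumption rather than something confirmed in this thread, is that a different `megatron` installation is being picked up ahead of the checkout in /aml2/Megatron-LM. A minimal diagnostic sketch using only the standard library is shown here; `describe_module` is a hypothetical helper name, not part of Megatron-LM. Note that it imports parent packages as a side effect of the lookup.

```python
import importlib.util


def describe_module(name):
    """Return (origin, is_package) for a module name, or None if it cannot be resolved."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent of a dotted name is missing or is not a package.
        return None
    if spec is None:
        return None
    # Packages expose submodule_search_locations; plain modules do not.
    return spec.origin, spec.submodule_search_locations is not None


if __name__ == "__main__":
    # Run inside the same environment and working directory as the failing command.
    for name in ("megatron", "megatron.training", "megatron.training.tokenizer"):
        print(name, "->", describe_module(name))
```

If the origin reported for `megatron` is outside the repository, or `megatron.training` is reported as a non-package, the import is probably not coming from the code you expect.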


philipp-fischer commented 6 months ago

Have you tried again with the most recent version of main? There was a fix regarding this.

github-actions[bot] commented 4 months ago

Marking as stale. No activity in 60 days.

zhaoyang-star commented 3 months ago

I used the latest release version, v0.7.0, and the error still happened.

github-actions[bot] commented 1 month ago

Marking as stale. No activity in 60 days.