NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

Using NeMo LM in other NeMo training #5687

Closed bew-pbwt closed 1 year ago

bew-pbwt commented 1 year ago

I want to use our own trained NeMo LM in another NeMo training job. The only reference I found is https://github.com/NVIDIA/NeMo/blob/4cd9b3449cbfedc671348fbabbe8e3a55fbd659d/examples/nlp/text_classification/conf/ptune_text_classification_config.yaml, which has a nemo_path field in its language model config.
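For context, the relevant part of that config would look something like the sketch below (only the nemo_path field is confirmed above; the value is a placeholder, not taken from the real file):

    # Sketch of the language-model section of the linked ptune config;
    # the path value is a hypothetical placeholder.
    model:
      language_model:
        nemo_path: /path/to/your_trained_lm.nemo  # trained .nemo LM checkpoint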

So I followed that (training text classification), but it doesn't work: the config has to have model.tokenizer.tokenizer_name.

[NeMo I 2022-12-20 16:02:14 exp_manager:563] TensorboardLogger has been set up
Traceback (most recent call last):
  File "/workspace/data/bew/NeMo/examples/nlp/text_classification/text_classification_with_bert.py", line 117, in main
    model = TextClassificationModel(cfg.model, trainer=trainer)
  File "/opt/conda/lib/python3.8/site-packages/nemo/collections/nlp/models/text_classification/text_classification_model.py", line 54, in __init__
    self.setup_tokenizer(cfg.tokenizer)
  File "/opt/conda/lib/python3.8/site-packages/nemo/collections/nlp/models/nlp_model.py", line 112, in setup_tokenizer
    tokenizer_name=cfg.tokenizer_name,
omegaconf.errors.ConfigAttributeError: Key 'tokenizer_name' is not in struct
    full_key: model.tokenizer.tokenizer_name
    reference_type=Any
    object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

If I set model.tokenizer.tokenizer_name and model.language_model.pretrained_model_name, and then put model.language_model.nemo_file (the path to our custom .nemo model), the training looks OK, but it seems to use the model from Hugging Face instead; see the sketch below.
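A sketch of that combined setup (the tokenizer and pretrained model names here are illustrative placeholders, not confirmed values from this run):

    model:
      tokenizer:
        tokenizer_name: bert-base-uncased  # placeholder; satisfies the setup_tokenizer check
      language_model:
        pretrained_model_name: bert-base-uncased  # placeholder Hugging Face model name
        nemo_file: /workspace/data/bew/NeMo/qa/all-thai.nemo  # custom trained LM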

Is there a way to use a custom LM model?

ericharper commented 1 year ago

Are you able to share the config that you're using? It should be logged in .../nemo_log_globalrank-0_localrank-0.txt.

bew-pbwt commented 1 year ago

Hi @ericharper

Sorry, I just saw that the link I attached to the issue was not showing. I have edited it now, and here is the nemo_log_globalrank-0_localrank-0.txt:

[NeMo W 2022-12-20 16:02:14 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/omegaconf/basecontainer.py:225: UserWarning: cfg.pretty() is deprecated and will be removed in a future version. Use OmegaConf.to_yaml(cfg)
      warnings.warn(

[NeMo I 2022-12-20 16:02:14 text_classification_with_bert:110] Config Params:
    trainer:
      gpus: 8
      num_nodes: 1
      max_epochs: 100
      max_steps: null
      accumulate_grad_batches: 1
      gradient_clip_val: 0.0
      amp_level: O0
      precision: 32
      accelerator: ddp
      log_every_n_steps: 1
      val_check_interval: 1.0
      resume_from_checkpoint: null
      num_sanity_val_steps: 0
      checkpoint_callback: false
      logger: false
    model:
      nemo_path: text_classification_model.nemo
      tokenizer:
        library: megatron
        type: GPT2BPETokenizer
        model: null
        vocab_file: null
        merge_file: null
      language_model:
        nemo_file: /workspace/data/bew/NeMo/qa/all-thai.nemo
      classifier_head:
        num_output_layers: 2
        fc_dropout: 0.1
      class_labels:
        class_labels_file: null
      dataset:
        num_classes: 420
        do_lower_case: false
        max_seq_length: 256
        class_balancing: null
        use_cache: false
      train_ds:
        file_path: /workspace/data/bew/NeMo/text_classification/data/th/product-train.tsv
        batch_size: 64
        shuffle: true
        num_samples: -1
        num_workers: 3
        drop_last: false
        pin_memory: false
      validation_ds:
        file_path: /workspace/data/bew/NeMo/text_classification/data/th/product-dev.tsv
        batch_size: 64
        shuffle: false
        num_samples: -1
        num_workers: 3
        drop_last: false
        pin_memory: false
      test_ds:
        file_path: null
        batch_size: 64
        shuffle: false
        num_samples: -1
        num_workers: 3
        drop_last: false
        pin_memory: false
      optim:
        name: adam
        lr: 5.0e-06
        betas:
        - 0.9
        - 0.999
        weight_decay: 0.01
        sched:
          name: WarmupAnnealing
          warmup_steps: null
          warmup_ratio: 0.1
          last_epoch: -1
          monitor: val_loss
          reduce_on_plateau: false
      infer_samples:
      - by the end of no such thing the audience , like beatrice , has a watchful affection for the monster .
      - director rob marshall went out gunning to make a great one .
      - uneasy mishmash of styles and genres .
    exp_manager:
      exp_dir: null
      name: TextClassification
      create_tensorboard_logger: true
      create_checkpoint_callback: true

[NeMo I 2022-12-20 16:02:14 exp_manager:216] Experiments will be logged at /workspace/data/bew/NeMo/text_classification/nemo_experiments/TextClassification/2022-12-20_16-02-14
[NeMo I 2022-12-20 16:02:14 exp_manager:563] TensorboardLogger has been set up

rbriski commented 1 year ago

@Phakkhamat

I'm not sure if I'll be able to help, but I've been dealing with a similar issue. Do you mind posting the YAML file so that the indents are clear? Maybe we can figure it out together.

bew-pbwt commented 1 year ago

@rbriski You cannot open the link I put in the post, right? I don't know why it doesn't open when clicking the link, but if you copy the address and open it in another tab, it works. I use the same config, just with the paths to my data, and 8 GPUs.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 7 days since being marked as stale.