Are you able to share the config that you're using? It should be logged in .../nemo_log_globalrank-0_localrank-0.txt
Hi @ericharper
Sorry, I just saw that the link I attached to the issue wasn't showing. I've edited it now, and here is the nemo_log_globalrank-0_localrank-0.txt:
[NeMo W 2022-12-20 16:02:14 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/omegaconf/basecontainer.py:225: UserWarning: cfg.pretty() is deprecated and will be removed in a future version. Use OmegaConf.to_yaml(cfg)
warnings.warn(
[NeMo I 2022-12-20 16:02:14 text_classification_with_bert:110] Config Params:
trainer:
  gpus: 8
  num_nodes: 1
  max_epochs: 100
  max_steps: null
  accumulate_grad_batches: 1
  gradient_clip_val: 0.0
  amp_level: O0
  precision: 32
  accelerator: ddp
  log_every_n_steps: 1
  val_check_interval: 1.0
  resume_from_checkpoint: null
  num_sanity_val_steps: 0
  checkpoint_callback: false
  logger: false
model:
  nemo_path: text_classification_model.nemo
  tokenizer:
    library: megatron
    type: GPT2BPETokenizer
    model: null
    vocab_file: null
    merge_file: null
  language_model:
    nemo_file: /workspace/data/bew/NeMo/qa/all-thai.nemo
  classifier_head:
    num_output_layers: 2
    fc_dropout: 0.1
  class_labels:
    class_labels_file: null
  dataset:
    num_classes: 420
    do_lower_case: false
    max_seq_length: 256
    class_balancing: null
    use_cache: false
  train_ds:
    file_path: /workspace/data/bew/NeMo/text_classification/data/th/product-train.tsv
    batch_size: 64
    shuffle: true
    num_samples: -1
    num_workers: 3
    drop_last: false
    pin_memory: false
  validation_ds:
    file_path: /workspace/data/bew/NeMo/text_classification/data/th/product-dev.tsv
    batch_size: 64
    shuffle: false
    num_samples: -1
    num_workers: 3
    drop_last: false
    pin_memory: false
  test_ds:
    file_path: null
    batch_size: 64
    shuffle: false
    num_samples: -1
    num_workers: 3
    drop_last: false
    pin_memory: false
  optim:
    name: adam
    lr: 5.0e-06
    betas:
    - 0.9
    - 0.999
    weight_decay: 0.01
    sched:
      name: WarmupAnnealing
      warmup_steps: null
      warmup_ratio: 0.1
      last_epoch: -1
      monitor: val_loss
      reduce_on_plateau: false
  infer_samples:
  - by the end of no such thing the audience , like beatrice , has a watchful affection for the monster .
  - director rob marshall went out gunning to make a great one .
  - uneasy mishmash of styles and genres .
exp_manager:
  exp_dir: null
  name: TextClassification
  create_tensorboard_logger: true
  create_checkpoint_callback: true
[NeMo I 2022-12-20 16:02:14 exp_manager:216] Experiments will be logged at /workspace/data/bew/NeMo/text_classification/nemo_experiments/TextClassification/2022-12-20_16-02-14
[NeMo I 2022-12-20 16:02:14 exp_manager:563] TensorboardLogger has been set up
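As a side note, here is a minimal sketch (my own, not part of the NeMo example script) of how a structured config like the one above can be loaded and overridden with OmegaConf. The file name text_classification_config.yaml is an assumption; the override values are the ones from the log.

```python
# Minimal sketch, assuming a YAML file with the same keys as the config
# dumped above. "text_classification_config.yaml" is a placeholder name.
from omegaconf import OmegaConf

cfg = OmegaConf.load("text_classification_config.yaml")

# The overrides this thread is about, taken from the log above.
cfg.model.language_model.nemo_file = "/workspace/data/bew/NeMo/qa/all-thai.nemo"
cfg.model.tokenizer.library = "megatron"
cfg.model.tokenizer.type = "GPT2BPETokenizer"

# OmegaConf.to_yaml(cfg) is the replacement for the deprecated cfg.pretty()
# that triggers the UserWarning at the top of the log.
print(OmegaConf.to_yaml(cfg))
```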
@Phakkhamat
I'm not sure if I'll be able to help, but I've been dealing with a similar issue. Do you mind posting the YAML file so that the indents are clear? Maybe we can figure it out together.
@rbriski You can't open the link I put in the post, right? I don't know why it won't open when clicked, but if you copy the address and open it in another tab, it works. I use the same config, just with the paths to my data, and 8 GPUs.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.
I want to use our own trained NeMo LM in another NeMo training run. The only reference I found is https://github.com/NVIDIA/NeMo/blob/4cd9b3449cbfedc671348fbabbe8e3a55fbd659d/examples/nlp/text_classification/conf/ptune_text_classification_config.yaml, which has a nemo_path entry in its language model config.
So I followed that (training text classification), but it doesn't work: it requires model.tokenizer.tokenizer_name to be set.
If I set model.tokenizer.tokenizer_name and model.language_model.pretrained_model_name, and then add model.language_model.nemo_file (the path to our custom .nemo model), training looks fine, but it seems to use the model from Hugging Face instead of ours.
Is there a way to use a custom LM model?
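For reference, here is a hedged sketch (my own, not a confirmed NeMo recipe) of how one might check which LM a trained checkpoint actually wraps. restore_from() is the standard NeMo API for .nemo archives; the path is the model.nemo_path value from the config above.

```python
# Hedged sketch: restore the saved .nemo classifier and inspect its
# config to see which language model it actually ended up wrapping.
from nemo.collections.nlp.models import TextClassificationModel

model = TextClassificationModel.restore_from("text_classification_model.nemo")

# If this prints a HuggingFace pretrained_model_name instead of the
# custom nemo_file path, the custom LM was silently ignored.
print(model.cfg.tokenizer)
print(model.cfg.language_model)
```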