[Question] How to do fine-tuning? How to change the train_ds file according to your data?

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html

Apache License 2.0


[Question] How to do fine-tuning? How to change the train_ds file according to your data? #1254

Closed. samabdullah closed this issue 3 years ago.

samabdullah commented 4 years ago

Describe your question

A clear and concise description of your question. Describe what you want to achieve. And/or what NeMo APIs are unclear/confusing.

Environment overview (please complete the following information)

Environment location: [Bare-metal, Docker, Cloud (specify cloud provider - AWS, Azure, GCP, Colab)]
Method of NeMo install: [pip install or from source]. Please specify exact commands you used to install.
If method of install is [Docker], provide docker pull & docker run commands used

Environment details

If NVIDIA docker image is used you don't need to specify these. Otherwise, please provide:

OS version
PyTorch version
Python version

Additional context

Add any other context about the problem here. Example: GPU model

okuchaiev commented 4 years ago

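See https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/asr/01_ASR_with_NeMo.ipynb#scrollTo=2f142kIQc1Z2 for an ASR example. Also, these two notebooks should be helpful:
* https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/00_NeMo_Primer.ipynb
* https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/01_NeMo_Models.ipynb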

samabdullah commented 4 years ago

> See https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/asr/01_ASR_with_NeMo.ipynb#scrollTo=2f142kIQc1Z2 for ASR example. Also, these two notebooks should be helpful:
> * https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/00_NeMo_Primer.ipynb
> * https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/01_NeMo_Models.ipynb

I'm just confused: if I want to fine-tune, where should the train_ds file be placed, as mentioned in the tutorial?

    # Point to the data we'll use for fine-tuning as the training set
    quartznet.setup_training_data(train_data_config=params['model']['train_ds'])

If I give its path in the config file, won't it overwrite the pretrained model's training?

samabdullah commented 4 years ago

Any update?
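
For anyone landing here with the same question, a rough sketch that may help. In the ASR tutorial linked above, `train_ds` is not a separate file but a section of the model config, and its `manifest_filepath` entry points to a manifest: a text file with one JSON object per line holding `audio_filepath`, `duration` and `text`. The manifest can live anywhere you can give a path to. As far as I understand, pointing `train_ds` at your own manifest only changes which data the loader reads; the pretrained weights are the starting point for fine-tuning and the downloaded checkpoint is not overwritten by this setting. The helper names `write_manifest` and `make_train_ds` below are made up for illustration, and the paths, durations and transcripts are placeholders.

```python
import json
from pathlib import Path


def write_manifest(entries, manifest_path):
    """Write one JSON object per line (audio_filepath, duration, text).

    `entries` is an iterable of (audio_filepath, duration_seconds, transcript).
    This helper is hypothetical, not part of NeMo.
    """
    manifest_path = Path(manifest_path)
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    with manifest_path.open("w", encoding="utf-8") as f:
        for audio_filepath, duration, text in entries:
            record = {
                "audio_filepath": str(audio_filepath),
                "duration": float(duration),
                "text": text,
            }
            f.write(json.dumps(record) + "\n")
    return manifest_path


def make_train_ds(base_train_ds, manifest_path, batch_size=None):
    """Return a copy of a train_ds section that points at our own manifest.

    The original dict is left untouched, so the base config is not modified.
    This helper is hypothetical, not part of NeMo.
    """
    train_ds = dict(base_train_ds)
    train_ds["manifest_filepath"] = str(manifest_path)
    if batch_size is not None:
        train_ds["batch_size"] = batch_size
    return train_ds


# Hypothetical usage with the objects from the tutorial (`params`, `quartznet`):
# manifest = write_manifest(
#     [("data/utt1.wav", 2.3, "hello world")],  # placeholder data
#     "data/train_manifest.json",
# )
# params['model']['train_ds'] = make_train_ds(params['model']['train_ds'], manifest)
# quartznet.setup_training_data(train_data_config=params['model']['train_ds'])
```

The same pattern should apply to the validation set: write a second manifest and point the `validation_ds` section at it before calling `setup_validation_data`.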