Retrain previously fine tuned adapter

raghavbj24 commented 9 months ago

Hi, we are trying to perform incremental training, but we are running into the following error.

Complete log file:


Read the training data into a dataframe..
Reading the training data into a dataframe has been completed..
Setting up the HuggingFace API Token..
Huggingface token is added in the environment..
Load the Ludwig configuration YAML file..
Loading the Ludwig configuration YAML file has been completed..
Loading the Base Model..
Setting generation max_new_tokens to 512 to correspond with the max sequence length assigned to the output feature or the global max sequence length. This will ensure that the correct number of tokens are generated at inference time. To override this behavior, set `generation.max_new_tokens` to a different value in your Ludwig config.
Loading the trained Base Model has been completed..
Starting the Fine Tuning..

╒════════════════════════╕
│ EXPERIMENT DESCRIPTION │
╘════════════════════════╛

╒══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════╕
│ Experiment name  │ api_experiment                                                                                    │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Model name       │ run                                                                                               │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Output directory │ /home/ubuntu/results/api_experiment_run_19                                                        │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ludwig_version   │ '0.9.3'                                                                                           │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ command          │ '/home/ubuntu/train_llama-2_7b_Log_Analytics_8bit_merged_v8/codebase/train_llama_using_ludwig.py' │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ random_seed      │ 42                                                                                                │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ data_format      │ "<class 'pandas.core.frame.DataFrame'>"                                                           │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ torch_version    │ '2.1.0+cu121'                                                                                     │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ compute          │ {   'arch_list': [   'sm_50',                                                                     │
│                  │                      'sm_60',                                                                     │
│                  │                      'sm_70',                                                                     │
│                  │                      'sm_75',                                                                     │
│                  │                      'sm_80',                                                                     │
│                  │                      'sm_86',                                                                     │
│                  │                      'sm_90'],                                                                    │
│                  │     'devices': {   0: {   'device_capability': (8, 6),                                            │
│                  │                           'device_properties': "_CudaDeviceProperties(name='NVIDIA "              │
│                  │                                                "A10G', major=8, minor=6, "                        │
│                  │                                                'total_memory=22723MB, '                           │
│                  │                                                'multi_processor_count=80)',                       │
│                  │                           'gpu_type': 'NVIDIA A10G'}},                                            │
│                  │     'gencode_flags': '-gencode compute=compute_50,code=sm_50 -gencode '                           │
│                  │                      'compute=compute_60,code=sm_60 -gencode '                                    │
│                  │                      'compute=compute_70,code=sm_70 -gencode '                                    │
│                  │                      'compute=compute_75,code=sm_75 -gencode '                                    │
│                  │                      'compute=compute_80,code=sm_80 -gencode '                                    │
│                  │                      'compute=compute_86,code=sm_86 -gencode '                                    │
│                  │                      'compute=compute_90,code=sm_90',                                             │
│                  │     'gpus_per_node': 1,                                                                           │
│                  │     'num_nodes': 1}                                                                               │
╘══════════════════╧═══════════════════════════════════════════════════════════════════════════════════════════════════╛

╒═══════════════╕
│ LUDWIG CONFIG │
╘═══════════════╛

User-specified config (with upgrades):

{   'adapter': {   'alpha': 16,
                   'bias_type': 'none',
                   'dropout': 0.05,
                   'postprocessor': {   'merge_adapter_into_base_model': True,
                                        'progressbar': True},
                   'pretrained_adapter_weights': None,
                   'r': 8,
                   'target_modules': None,
                   'type': 'lora'},
    'backend': {'type': 'local'},
    'base_model': '/home/ubuntu/results/api_experiment_run_15/model/model_weights',
    'input_features': [   {   'name': 'prompt',
                              'preprocessing': {'max_sequence_length': 1024},
                              'type': 'text'}],
    'ludwig_version': '0.9.3',
    'model_type': 'llm',
    'output_features': [   {   'name': 'Response',
                               'preprocessing': {'max_sequence_length': 512},
                               'type': 'text'}],
    'preprocessing': {'sample_ratio': 1.0},
    'prompt': {   'template': '### Instruction:\n'
                              '{Instruction}\n'
                              '\n'
                              '### Context:\n'
                              '{Context}\n'
                              '\n'
                              '### Response:\n'},
    'quantization': {'bits': 8},
    'trainer': {   'batch_size': 1,
                   'enable_gradient_checkpointing': True,
                   'epochs': 3,
                   'gradient_accumulation_steps': 1,
                   'learning_rate': 0.0001,
                   'learning_rate_scheduler': {'warmup_fraction': 0.01},
                   'max_batch_size': 1,
                   'type': 'finetune'}}

Full config saved to:
/home/ubuntu/results/api_experiment_run_19/api_experiment/model/model_hyperparameters.json

╒═══════════════╕
│ PREPROCESSING │
╘═══════════════╛

No cached dataset found at /home/ubuntu/eeff4f02cfeb11ee808f12abaaebd043.training.hdf5. Preprocessing the dataset.
Using full dataframe
Building dataset (it may take a while)
Loaded HuggingFace implementation of /home/ubuntu/results/api_experiment_run_15/model/model_weights tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Max length of feature 'None': 143 (without start and stop symbols)
Max sequence length is 143 for feature 'None'
Loaded HuggingFace implementation of /home/ubuntu/results/api_experiment_run_15/model/model_weights tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Max length of feature 'Response': 144 (without start and stop symbols)
Max sequence length is 144 for feature 'Response'
Loaded HuggingFace implementation of /home/ubuntu/results/api_experiment_run_15/model/model_weights tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Loaded HuggingFace implementation of /home/ubuntu/results/api_experiment_run_15/model/model_weights tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Building dataset: DONE
Writing preprocessed training set cache to /home/ubuntu/eeff4f02cfeb11ee808f12abaaebd043.training.hdf5
Writing preprocessed validation set cache to /home/ubuntu/eeff4f02cfeb11ee808f12abaaebd043.validation.hdf5
Writing preprocessed test set cache to /home/ubuntu/eeff4f02cfeb11ee808f12abaaebd043.test.hdf5
Writing train set metadata to /home/ubuntu/eeff4f02cfeb11ee808f12abaaebd043.meta.json

Dataset Statistics
╒════════════╤═══════════════╤════════════════════╕
│ Dataset    │   Size (Rows) │ Size (In Memory)   │
╞════════════╪═══════════════╪════════════════════╡
│ Training   │            31 │ 7.39 Kb            │
├────────────┼───────────────┼────────────────────┤
│ Validation │             4 │ 1.06 Kb            │
├────────────┼───────────────┼────────────────────┤
│ Test       │             9 │ 2.23 Kb            │
╘════════════╧═══════════════╧════════════════════╛

╒═══════╕
│ MODEL │
╘═══════╛

Warnings and other logs:
Loading large language model...
We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [01:15<01:15, 75.45s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [01:42<00:00, 46.69s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [01:42<00:00, 51.01s/it]
Done.
Loaded HuggingFace implementation of /home/ubuntu/results/api_experiment_run_15/model/model_weights tokenizer
Already found a `peft_config` attribute in the model. This will lead to having multiple adapters in the model. Make sure to know what you are doing!
==================================================
Trainable Parameter Summary For Fine-Tuning
Fine-tuning with adapter: lora
trainable params: 4,194,304 || all params: 6,742,609,920 || trainable%: 0.06220594176090199
==================================================
Gradient checkpointing enabled for training.

╒══════════╕
│ TRAINING │
╘══════════╛

Creating fresh model training run.
Training for 93 step(s), approximately 3 epoch(s).
Early stopping policy: 5 round(s) of evaluation, or 155 step(s), approximately 5 epoch(s).

Starting with step 0, epoch: 0

Training:   0%|          | 0/93 [00:00<?, ?it/s]/opt/conda/envs/ludwig_train_env/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py:322: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
Unable to complete the finetuning due to error Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

Training:   0%|          | 0/93 [00:00<?, ?it/s]

Can you help us solve this? Thanks in advance.
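
For anyone debugging the same "Expected all tensors to be on the same device" error, a first step is to find out which tensors ended up on which device. The log above warns that a `peft_config` was already present, so one possible reading (not confirmed anywhere in this thread) is that some of the adapter weights stayed on the CPU while the quantized base model sits on `cuda:0`. The helper below is a minimal sketch of that check. The name `group_by_device` is hypothetical, and the stand-in objects only mimic the `.device` attribute of real tensors so the snippet runs without a GPU.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_device(named_tensors):
    """Map each device string to the names of the tensors stored on it.

    `named_tensors` is any iterable of (name, tensor) pairs where each
    tensor exposes a `.device` attribute, for example the pairs yielded by
    `model.named_parameters()` on a PyTorch module.
    """
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


# Stand-in objects (hypothetical names) that imitate tensors on two devices.
fake_parameters = [
    ("base_layer.weight", SimpleNamespace(device="cuda:0")),
    ("lora_A.weight", SimpleNamespace(device="cpu")),
]

placement = group_by_device(fake_parameters)
for device, names in placement.items():
    print(device, names)
```

With a real model, passing `model.named_parameters()` (and `model.named_buffers()`) instead of `fake_parameters` would list what lives where. More than one key in the result means the model is split across devices.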

raghavbj24 commented 9 months ago

Hi, would it be possible for someone to kindly assist me with this error?

simaotwx commented 9 months ago

I'm trying the same thing but I get an error a lot earlier:

PyTorch version 2.2.0 available.
███████████████████████
█ █ █ █  ▜█ █ █ █ █   █
█ █ █ █ █ █ █ █ █ █ ███
█ █   █ █ █ █ █ █ █ ▌ █
█ █████ █ █ █ █ █ █ █ █
█     █  ▟█     █ █   █
███████████████████████
ludwig v0.9.3 - Train

Traceback (most recent call last):
  File "/home/azureuser/ludwig/venv/bin/ludwig", line 8, in <module>
    sys.exit(main())
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/cli.py", line 197, in main
    CLI()
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/cli.py", line 72, in __init__
    getattr(self, args.command)()
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/cli.py", line 77, in train
    train.cli(sys.argv[2:])
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/train.py", line 395, in cli
    train_cli(**vars(args))
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/train.py", line 176, in train_cli
    model = LudwigModel(
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/api.py", line 317, in __init__
    self.config_obj = ModelConfig.from_dict(self._user_config)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/schema/model_types/base.py", line 141, in from_dict
    config_obj: ModelConfig = schema.load(config)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/marshmallow_dataclass/__init__.py", line 730, in load
    return clazz(**all_loaded)
  File "<string>", line 18, in __init__
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/schema/model_types/base.py", line 73, in __post_init__
    set_llm_parameters(self)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/schema/model_types/utils.py", line 314, in set_llm_parameters
    _set_generation_max_new_tokens(config)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/schema/model_types/utils.py", line 401, in _set_generation_max_new_tokens
    max_possible_sequence_length = _get_maximum_possible_sequence_length(config, _DEFAULT_MAX_SEQUENCE_LENGTH)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/ludwig/schema/model_types/utils.py", line 377, in _get_maximum_possible_sequence_length
    model_config = AutoConfig.from_pretrained(config.base_model)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1100, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 634, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/home/azureuser/ludwig/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 356, in cached_file
    raise EnvironmentError(
OSError: /home/azureuser/ludwig/results/experiment_run_50/model/model_weights does not appear to have a file named config.json. Checkout 'https://huggingface.co//home/azureuser/ludwig/results/experiment_run_50/model/model_weights/None' for available files.

How did you get the model to load in the first place?
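
The traceback above ends with `AutoConfig.from_pretrained` failing to find `config.json` in the `model_weights` directory. One plausible cause, stated here as an assumption rather than a diagnosis, is that the directory holds only adapter weights: a PEFT adapter saved on its own comes with an `adapter_config.json`, while a full Hugging Face model directory has a `config.json`. The sketch below checks which of the two files is present. The function name `describe_model_dir` and the temporary directory are illustrative only.

```python
import tempfile
from pathlib import Path


def describe_model_dir(path):
    """Classify a directory by which Hugging Face config file it contains."""
    directory = Path(path)
    if (directory / "config.json").is_file():
        # A full model config is present, so AutoConfig can read it.
        return "full model"
    if (directory / "adapter_config.json").is_file():
        # Only the adapter config is present; the base model lives elsewhere.
        return "adapter only"
    return "unknown"


# Demonstration on a throwaway directory that imitates an adapter-only save.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = describe_model_dir(tmp)
    (Path(tmp) / "adapter_config.json").write_text("{}")
    adapter_result = describe_model_dir(tmp)
    (Path(tmp) / "config.json").write_text("{}")
    full_result = describe_model_dir(tmp)

print(empty_result, adapter_result, full_result)
```

Running `describe_model_dir` on the directory used as `base_model` shows whether it can be loaded as a base model at all.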

simaotwx commented 9 months ago

Looks like there is an issue with the generated model itself on my end.

gaurav-kantrod commented 9 months ago

Hi @all, I am also facing the same issue. Did anyone resolve it?

Thanks

ludwig-ai / ludwig

Retrain previously fine tuned adapter #3932