(base) ➜ jupyter git:(master) ✗ ./train.sh
PyTorch version 2.0.0+cpu available.
generated new fontManager
/usr/local/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
ludwig v0.10.2 - Experiment
Setting generation max_new_tokens to 1024 to correspond with the max sequence length assigned to the output feature or the global max sequence length. This will ensure that the correct number of tokens are generated at inference time. To override this behavior, set `generation.max_new_tokens` to a different value in your Ludwig config.
╒════════════════════════╕
│ EXPERIMENT DESCRIPTION │
╘════════════════════════╛
╒══════════════════╤══════════════════════════════════════════════════════════════════════════╕
│ Experiment name │ experiment │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ Model name │ run │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ Output directory │ /src/results/experiment_run_2 │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ ludwig_version │ '0.10.2' │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ command │ ('/usr/local/bin/ludwig experiment --config /src/config.yaml --dataset ' │
│ │ '/data/train-00000-of-00001.parquet --output_directory /src/results') │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ random_seed │ 42 │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ dataset │ '/data/train-00000-of-00001.parquet' │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ data_format │ 'parquet' │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ torch_version │ '2.0.0+cpu' │
├──────────────────┼──────────────────────────────────────────────────────────────────────────┤
│ compute │ {'num_nodes': 1} │
╘══════════════════╧══════════════════════════════════════════════════════════════════════════╛
╒═══════════════╕
│ LUDWIG CONFIG │
╘═══════════════╛
User-specified config (with upgrades):
{ 'adapter': {'type': 'lora'},
'base_model': 'facebook/opt-1.3b',
'input_features': [{'name': 'prompt', 'type': 'text'}],
'ludwig_version': '0.10.2',
'model_type': 'llm',
'output_features': [{'name': 'response', 'type': 'text'}],
'preprocessing': {'sample_ratio': 0.1},
'prompt': { 'template': '### Instruction:\n'
'{instruction}\n'
'\n'
'### Response:\n'},
'trainer': { 'batch_size': 'auto',
'compile': True,
'epochs': 3,
'gradient_accumulation_steps': 16,
'learning_rate': 'auto',
'learning_rate_scaling': 'sqrt',
'learning_rate_scheduler': {'warmup_fraction': 0.01},
'optimizer': {'type': 'adamw'},
'type': 'finetune',
'use_mixed_precision': True}}
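The printed config corresponds to the /src/config.yaml passed on the command line. As a rough sketch (values copied from the printed config, paths taken from this log, and assuming Ludwig's standard Python API rather than the CLI), the same experiment could be launched programmatically:

from ludwig.api import LudwigModel

# Config reconstructed from the "User-specified config" block above.
config = {
    "model_type": "llm",
    "base_model": "facebook/opt-1.3b",
    "adapter": {"type": "lora"},
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "preprocessing": {"sample_ratio": 0.1},
    "prompt": {"template": "### Instruction:\n{instruction}\n\n### Response:\n"},
    "trainer": {
        "type": "finetune",
        "epochs": 3,
        "batch_size": "auto",
        "learning_rate": "auto",
        "learning_rate_scaling": "sqrt",
        "learning_rate_scheduler": {"warmup_fraction": 0.01},
        "optimizer": {"type": "adamw"},
        "gradient_accumulation_steps": 16,
        "compile": True,  # enables the torchdynamo path that fails later in this log
        "use_mixed_precision": True,
    },
}

model = LudwigModel(config)
model.experiment(
    dataset="/data/train-00000-of-00001.parquet",
    output_directory="/src/results",
)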
Full config saved to:
/src/results/experiment_run_2/experiment/model/model_hyperparameters.json
╒═══════════════╕
│ PREPROCESSING │
╘═══════════════╛
Found cached dataset and meta.json with the same filename of the dataset, but checksums don't match, if saving of processed input is not skipped they will be overridden
Using full raw dataset, no hdf5 and json file with the same name have been found
Building dataset (it may take a while)
Loaded HuggingFace implementation of facebook/opt-1.3b tokenizer
/usr/local/lib/python3.8/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Max length of feature 'None': 2713 (without start and stop symbols)
Max sequence length is 2713 for feature 'None'
Loaded HuggingFace implementation of facebook/opt-1.3b tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Max length of feature 'response': 3109 (without start and stop symbols)
Max sequence length is 3109 for feature 'response'
Loaded HuggingFace implementation of facebook/opt-1.3b tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Loaded HuggingFace implementation of facebook/opt-1.3b tokenizer
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Building dataset: DONE
Writing preprocessed training set cache to /data/train-00000-of-00001.training.hdf5
Writing preprocessed validation set cache to /data/train-00000-of-00001.validation.hdf5
Writing preprocessed test set cache to /data/train-00000-of-00001.test.hdf5
Writing train set metadata to /data/train-00000-of-00001.meta.json
Dataset Statistics
╒════════════╤═══════════════╤════════════════════╕
│ Dataset │ Size (Rows) │ Size (In Memory) │
╞════════════╪═══════════════╪════════════════════╡
│ Training │ 4900 │ 1.12 Mb │
├────────────┼───────────────┼────────────────────┤
│ Validation │ 700 │ 164.19 Kb │
├────────────┼───────────────┼────────────────────┤
│ Test │ 1400 │ 328.25 Kb │
╘════════════╧═══════════════╧════════════════════╛
╒═══════╕
│ MODEL │
╘═══════╛
Warnings and other logs:
Loading large language model...
Done.
Loaded HuggingFace implementation of facebook/opt-1.3b tokenizer
==================================================
Trainable Parameter Summary For Fine-Tuning
Fine-tuning with adapter: lora
trainable params: 1,572,864 || all params: 1,317,330,944 || trainable%: 0.11939778740975206
==================================================
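The trainable% figure above is simply the ratio of LoRA adapter parameters to all model parameters:

# trainable% = 100 * trainable params / all params
print(100 * 1_572_864 / 1_317_330_944)  # ~= 0.1194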
/usr/local/lib/python3.8/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
Training with torchdynamo compiled model
`trainer.use_mixed_precision=True`, but no GPU device found. Setting to `False`
Tuning batch size...
Exploring batch_size=1
[2024-11-24 03:19:41,470] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward
[2024-11-24 03:19:41,543] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function debug_wrapper
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
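This tokenizers warning is emitted repeatedly while batch-size tuning forks worker processes. As the message itself suggests, it can be silenced by setting TOKENIZERS_PARALLELISM before any tokenizer is used, e.g. at the top of the training entrypoint (a sketch, not part of the original run):

import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"  # must be set before tokenizers are first used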
[2024-11-24 03:19:45,054] torch._dynamo.output_graph: [INFO] Step 2: done compiler function debug_wrapper
[2024-11-24 03:19:45,845] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing generate_merged_ids
[2024-11-24 03:19:45,878] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing remove_left_padding
[2024-11-24 03:19:45,909] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing add_left_padding
[2024-11-24 03:19:45,933] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing add_left_padding (RETURN_VALUE)
[2024-11-24 03:19:45,935] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function debug_wrapper
[2024-11-24 03:19:45,963] torch._inductor.compile_fx: [INFO] Step 3: torchinductor compiling FORWARDS graph 1
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
[2024-11-24 03:19:47,941] torch._inductor.compile_fx: [INFO] Step 3: torchinductor done compiling FORWARDS graph 1
[2024-11-24 03:19:47,943] torch._dynamo.output_graph: [INFO] Step 2: done compiler function debug_wrapper
[2024-11-24 03:19:47,950] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing create_attention_mask
[2024-11-24 03:19:47,967] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing <graph break in forward>
[2024-11-24 03:19:47,971] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing <graph break in forward>
Successfully loaded model weights from /tmp/tmpvrba4sm0/latest.ckpt.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 324, in _compile
out_code = transform_code_object(code, transform)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object
transformations(instructions, code_options)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 311, in transform
tracer.run()
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run
super().run()
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
and self.step()
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
getattr(self, inst.opname)(inst)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 342, in wrapper
return inner_fn(self, inst)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 1014, in CALL_FUNCTION_KW
self.call_function(fn, args, kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 474, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/variables/nn_module.py", line 244, in call_function
return tx.inline_user_function_return(
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 510, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 1806, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 1862, in inline_call_
tracer.run()
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
and self.step()
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
getattr(self, inst.opname)(inst)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/symbolic_convert.py", line 1030, in LOAD_ATTR
result = BuiltinVariable(getattr).call_function(
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/variables/builtin.py", line 566, in call_function
result = handler(tx, *args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/variables/builtin.py", line 930, in call_getattr
return obj.var_getattr(tx, name).add_options(options)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/variables/nn_module.py", line 124, in var_getattr
subobj = inspect.getattr_static(base, name)
File "/usr/local/lib/python3.8/inspect.py", line 1622, in getattr_static
raise AttributeError(attr)
AttributeError: config
from user code:
File "/usr/local/lib/python3.8/site-packages/ludwig/models/llm.py", line 276, in <graph break in forward>
model_outputs = self.model(input_ids=self.model_inputs, attention_mask=self.attention_masks).get(LOGITS)
File "/usr/local/lib/python3.8/site-packages/peft/peft_model.py", line 1111, in forward
if self.base_model.config.model_type == "mpt":
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/ludwig", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/ludwig/cli.py", line 197, in main
CLI()
File "/usr/local/lib/python3.8/site-packages/ludwig/cli.py", line 72, in __init__
getattr(self, args.command)()
File "/usr/local/lib/python3.8/site-packages/ludwig/cli.py", line 97, in experiment
experiment.cli(sys.argv[2:])
File "/usr/local/lib/python3.8/site-packages/ludwig/experiment.py", line 528, in cli
experiment_cli(**vars(args))
File "/usr/local/lib/python3.8/site-packages/ludwig/experiment.py", line 217, in experiment_cli
(eval_stats, train_stats, preprocessed_data, output_directory) = model.experiment(
File "/usr/local/lib/python3.8/site-packages/ludwig/api.py", line 1539, in experiment
(train_stats, preprocessed_data, output_directory) = self.train(
File "/usr/local/lib/python3.8/site-packages/ludwig/api.py", line 655, in train
self._tune_batch_size(trainer, training_set, random_seed=random_seed)
File "/usr/local/lib/python3.8/site-packages/ludwig/api.py", line 883, in _tune_batch_size
tuned_batch_size = trainer.tune_batch_size(
File "/usr/local/lib/python3.8/site-packages/ludwig/trainers/trainer_llm.py", line 493, in tune_batch_size
return super().tune_batch_size(
File "/usr/local/lib/python3.8/site-packages/ludwig/trainers/trainer.py", line 597, in tune_batch_size
best_batch_size = evaluator.select_best_batch_size(
File "/usr/local/lib/python3.8/site-packages/ludwig/utils/batch_size_tuner.py", line 57, in select_best_batch_size
samples_per_sec = self.evaluate(
File "/usr/local/lib/python3.8/site-packages/ludwig/utils/batch_size_tuner.py", line 108, in evaluate
self.step(batch_size, global_max_sequence_length=global_max_sequence_length)
File "/usr/local/lib/python3.8/site-packages/ludwig/utils/batch_size_tuner.py", line 170, in step
self.perform_step(inputs, targets)
File "/usr/local/lib/python3.8/site-packages/ludwig/utils/batch_size_tuner.py", line 180, in perform_step
self.trainer.train_step(inputs, targets)
File "/usr/local/lib/python3.8/site-packages/ludwig/trainers/trainer.py", line 339, in train_step
model_outputs = self.dist_model((inputs, targets))
File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/eval_frame.py", line 82, in forward
return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/eval_frame.py", line 209, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/ludwig/models/llm.py", line 265, in forward
self.model_inputs, self.attention_masks = generate_merged_ids(
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors
return callback(frame, cache_size, hooks)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 104, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert
return _compile(
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/torch/_dynamo/convert_frame.py", line 394, in _compile
raise InternalTorchDynamoError() from e
torch._dynamo.exc.InternalTorchDynamoError
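The crash happens while torch.compile (enabled by `compile: True` in the trainer config) traces into peft's forward (peft/peft_model.py line 1111): inspect.getattr_static cannot resolve `self.base_model.config` on the traced module, so dynamo raises InternalTorchDynamoError. Two possible workarounds, as sketches only (neither verified against this exact setup):

# Option 1: follow the log's own suggestion and fall back to eager execution
# whenever dynamo fails to compile a frame.
import torch._dynamo
torch._dynamo.config.suppress_errors = True

# Option 2: avoid the torchdynamo path entirely by disabling compilation in the
# Ludwig config (set `compile: false` under `trainer` in /src/config.yaml) and
# re-running train.sh.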