[BUG] efficient_finetuning_basic.ipynb

Bug Report Checklist

from autogluon.multimodal import MultiModalPredictor
import uuid

# train_en_df is the training DataFrame loaded earlier in the linked tutorial.
train_en_df_downsample = train_en_df.sample(200, random_state=123)

new_model_path = f"./tmp/{uuid.uuid4().hex}-multilingual_ia3_gradient_checkpoint"
predictor = MultiModalPredictor(label="label",
                                path=new_model_path)
predictor.fit(train_en_df_downsample,
              presets="multilingual",
              hyperparameters={
                  "model.hf_text.checkpoint_name": "google/flan-t5-xl",
                  "model.hf_text.gradient_checkpointing": True,
                  "model.hf_text.low_cpu_mem_usage": True,
                  "optimization.efficient_finetune": "ia3_bias",
                  "optimization.lr_decay": 0.9,
                  "optimization.learning_rate": 3e-03,
                  "optimization.end_lr": 3e-03,
                  "optimization.max_epochs": 1,
                  "optimization.warmup_steps": 0,
                  "env.batch_size": 1,
                  "env.eval_batch_size_ratio": 1
              })

Describe the bug

ImportError: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK3c107SymBool10guard_boolEPKcl

Expected behavior

Model training completes successfully.

To Reproduce

Install autogluon 0.7.1, 0.7.0, 0.6.2, or 0.6.0 and follow the tutorial at: https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/efficient_finetuning_basic.html#training-flan-t5-xl-on-single-gpu

Screenshots / Logs

/usr/local/lib/python3.8/dist-packages/apex/normalization/fused_layer_norm.py:364 in __init__

  361         super().__init__()
  362
  363         global fused_layer_norm_cuda
❱ 364         fused_layer_norm_cuda = importlib.import_module("fused_layer_norm_cuda")
  365
  366         if isinstance(normalized_shape, numbers.Integral):
  367             normalized_shape = (normalized_shape,)

/usr/lib/python3.8/importlib/__init__.py:127 in import_module

  124             if character != '.':
  125                 break
  126             level += 1
❱ 127     return _bootstrap._gcd_import(name[level:], package, level)
  128
  129
  130 _RELOADING = {}

in _gcd_import:1014
in _find_and_load:991
in _find_and_load_unlocked:975
in _load_unlocked:657
in module_from_spec:556
in create_module:1166
in _call_with_frames_removed:219

ImportError: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK3c107SymBool10guard_boolEPKcl
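To make the undefined symbol in the log above easier to read, here is a minimal sketch that pulls the length-prefixed name components out of an Itanium-ABI mangled name. The helper `nested_name_parts` is hypothetical (not part of autogluon, apex, or PyTorch) and only handles simple nested names like this one; `c++filt` does the full job where it is available.

```python
import re
from typing import List


def nested_name_parts(mangled: str) -> List[str]:
    """Return the name components of a simple Itanium-mangled nested name.

    Hypothetical helper for illustration only: it reads the `<length><name>`
    pairs that follow `_ZN` (plus optional r/V/K qualifiers) and stops at the
    first character that is not a length prefix. It does not decode argument
    types, templates, or substitutions.
    """
    match = re.match(r"_ZN[rVK]*", mangled)
    if match is None:
        raise ValueError("not a nested Itanium-mangled name")
    pos = match.end()
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return parts


symbol = "_ZNK3c107SymBool10guard_boolEPKcl"
print("::".join(nested_name_parts(symbol)))  # prints c10::SymBool::guard_bool
```

The missing symbol lives in the `c10` namespace, which is PyTorch's core library, so the compiled apex extension is looking for a function that the installed PyTorch does not export. A plausible reading, offered as an inference rather than a confirmed diagnosis, is that apex was built against a different PyTorch version than the `torch 1.12.1+cu102` listed under Installed Versions.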

Installed Versions

NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1

and

NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.8

and

```
# Reproduce version
autogluon 0.7.1
autogluon 0.7.0
autogluon 0.6.2
autogluon 0.6.0
```

```python
# The output of the following is listed below:
from autogluon.core.utils import show_versions
show_versions()
```

INSTALLED VERSIONS
------------------
date : 2023-05-19
time : 08:08:05.232368
python : 3.8.10.final.0
OS : Linux
OS-release : 5.15.0-58-generic
Version : #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023
machine : x86_64
processor : x86_64
num_cores : 40
cpu_ram_mb : 155422
cuda version : None
num_gpus : 0
gpu_ram_mb : []
avail_disk_size_mb : 10737404153
accelerate : 0.13.2
albumentations : 1.1.0
autogluon.common : 0.6.0b20230519
autogluon.core : 0.6.0b20230519
autogluon.features : 0.6.0b20230519
autogluon.multimodal : 0.6.0b20230519
autogluon.tabular : None
autogluon.text : 0.6.0b20230519
autogluon.timeseries : None
autogluon.vision : None
boto3 : 1.26.136
dask : 2021.11.2
defusedxml : 0.7.1
distributed : 2021.11.2
evaluate : 0.2.2
fairscale : 0.4.6
hyperopt : 0.2.7
jsonschema : 4.8.0
matplotlib : 3.6.3
nlpaug : 1.1.10
nltk : 3.8.1
nptyping : 1.4.4
numpy : 1.22.2
omegaconf : 2.1.2
openmim : None
pandas : 1.5.2
PIL : 9.0.1
psutil : 5.8.0
pycocotools : None
pytorch-metric-learning: None
pytorch_lightning : 1.7.7
ray : 2.0.1
requests : 2.28.2
scipy : 1.8.1
sentencepiece : 0.1.99
seqeval : None
skimage : 0.19.3
sklearn : 1.1.1
smart_open : 5.2.1
text-unidecode : None
timm : 0.6.13
torch : 1.12.1+cu102
torchmetrics : 0.8.2
torchtext : None
torchvision : 0.13.1+cu102
tqdm : 4.65.0
transformers : 4.23.1

and

# ls -la /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so
-rwxr-xr-x 1 root staff 6145584 Apr 15 00:34 /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so
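To narrow the failure down without running the whole tutorial, a small probe along these lines may help. It is only a sketch: `probe_import` is a hypothetical helper, and the module names are taken from the traceback and the version list in this report. It tries each import in isolation and prints either the reported version or the error, so the `fused_layer_norm_cuda` failure can be reproduced next to the `torch` version actually loaded.

```python
import importlib
from typing import Tuple


def probe_import(module_name: str) -> Tuple[bool, str]:
    """Try to import a module; return (succeeded, version or error text).

    Hypothetical helper for illustration. OSError is caught as well because
    a broken shared library can surface that way instead of ImportError.
    """
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "unknown version"))


# torch is probed first so its shared libraries are loaded before the
# compiled apex extension is imported.
for name in ("torch", "apex", "fused_layer_norm_cuda"):
    ok, detail = probe_import(name)
    print(f"{name:24s} {'OK' if ok else 'FAILED'}  {detail}")
```

If `torch` imports cleanly and only `fused_layer_norm_cuda` fails with the same undefined-symbol message, that would point at the apex build rather than at autogluon itself.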

autogluon / autogluon

[BUG] efficient_finetuning_basic.ipynb #3223