Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (an open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

AttributeError: module transformers has no attribute TFOtterForConditionalGeneration #185

Open wuwu-C opened 1 year ago

wuwu-C commented 1 year ago

transformers = 4.28.0

For some reason, I couldn't access Hugging Face, so I loaded the model in offline mode: `model = OtterForConditionalGeneration.from_pretrained("/my/file/path/config.json", device_map="auto", from_tf=True)`
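As an aside, `from_pretrained` expects the path of the checkpoint *directory* (the one containing `config.json` and the weight shards), not the path of `config.json` itself. A minimal sketch of normalizing such a path; the helper name and example paths here are hypothetical, not part of the Otter or transformers API:

```python
import os.path

def normalize_pretrained_path(path: str) -> str:
    """Hypothetical helper: if the caller passed .../config.json,
    return its containing directory, which is what from_pretrained wants."""
    if os.path.basename(path) == "config.json":
        return os.path.dirname(path)
    return path

print(normalize_pretrained_path("/my/file/path/config.json"))  # /my/file/path
print(normalize_pretrained_path("/my/file/path"))              # /my/file/path
```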

But I encountered the following error:

```
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Traceback (most recent call last):
  File "/home/user4/cww/Otter-main/pipeline/demo/otter_image.py", line 102, in <module>
    model = OtterForConditionalGeneration.from_pretrained("/data/user4/CWW_OTTER/config.json", device_map="auto",
  File "/data/anaconda3/envs/otter/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2761, in from_pretrained
    model, loading_info = load_tf2_checkpoint_in_pytorch_model(
  File "/data/anaconda3/envs/otter/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 407, in load_tf2_checkpoint_in_pytorch_model
    tf_model_class = getattr(transformers, tf_model_class_name)
  File "/data/anaconda3/envs/otter/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1139, in __getattr__
    raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module transformers has no attribute TFOtterForConditionalGeneration
```
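For context, a minimal sketch of why `from_tf=True` produces this particular AttributeError: when asked to load a TensorFlow checkpoint, transformers derives the TF class name by prefixing the PyTorch class name with `TF`, then looks it up on the `transformers` module. No TensorFlow port of Otter exists, so the lookup fails. The classes below are illustrative stand-ins, not the real Otter or transformers code:

```python
# Stand-in for the real PyTorch model class.
class OtterForConditionalGeneration:
    pass

class FakeTransformersModule:
    """Mimics transformers' lazy module: unknown attributes raise AttributeError."""
    __name__ = "transformers"

# Mirrors load_tf2_checkpoint_in_pytorch_model: prefix "TF" to the class name.
tf_model_class_name = "TF" + OtterForConditionalGeneration.__name__
print(tf_model_class_name)  # TFOtterForConditionalGeneration

try:
    getattr(FakeTransformersModule(), tf_model_class_name)
    lookup_failed = False
except AttributeError:
    # Same failure mode as in the issue: there is no TFOtterForConditionalGeneration.
    lookup_failed = True
print("lookup failed:", lookup_failed)
```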

Luodian commented 1 year ago

May I know if you are using tensorflow to load Otter?

wuwu-C commented 1 year ago

I had not loaded the model correctly before, so I set from_tf=True based on the error message. The model loads correctly now (I have removed from_tf=True), but after the following output the program neither terminates nor prints anything new:

```
2023-07-03 15:25:56.933708: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-07-03 15:25:56.977043: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-03 15:25:57.580896: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Successfully imported xformers version 0.0.20
Using pad_token, but it is not set yet.
None
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|████████████████████| 4/4 [02:28<00:00, 37.07s/it]
Some weights of the model checkpoint at /data/user4/CWW_OTTER/ were not used when initializing OtterForConditionalGeneration: ['perceiver.frame_embs']
```

Luodian commented 1 year ago

Sorry, the error seems strange to me as well.

This is expected output (but note you will hit the perceiver-not-initialized issue, because you are loading the model from a default image-version config). The correct way to load the model from pretrained is `model = OtterForConditionalGeneration.from_pretrained("luodian/OTTER-9B-LA-InContext", device_map="auto")`

```
Successfully imported xformers version 0.0.20
Using pad_token, but it is not set yet.
None
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|████████████████████| 4/4 [02:28<00:00, 37.07s/it]
Some weights of the model checkpoint at /data/user4/CWW_OTTER/ were not used when initializing OtterForConditionalGeneration: ['perceiver.frame_embs']
```

But the other TensorFlow and TensorRT related messages look strange to me as well.
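If TensorFlow is merely installed in the environment but not actually needed by the script, its startup noise can usually be suppressed with environment variables set before TensorFlow is imported (directly or transitively). A sketch using two standard TensorFlow knobs, one of which (`TF_ENABLE_ONEDNN_OPTS=0`) is the exact setting named in the oneDNN message above:

```python
import os

# Must be set before tensorflow is imported anywhere in the process.
# "2" hides TensorFlow INFO and WARNING messages; "0" disables oneDNN custom ops.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"

print(os.environ["TF_CPP_MIN_LOG_LEVEL"])   # 2
print(os.environ["TF_ENABLE_ONEDNN_OPTS"])  # 0
```

This only quiets the log output; it does not change how Otter itself loads, which is pure PyTorch.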