h2oai / h2ogpt

Private chat with local GPT with documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

Got error 'The model 'OptimizedModule' is not supported for' when asking a question #531

Closed · pktCoder closed this issue 1 year ago

pktCoder commented 1 year ago

H2oGPT looks very interesting, especially to a beginner like me. I hope to use it for telecommunications work, where it can digest documents and we can quickly find answers (with references back into the source document).

Here is my attempt to run it:

$ python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 --hf_embedding_model=sentence-transformers/all-MiniLM-L6-v2 --score_model=None --load_4bit=True --langchain_mode='UserData'
Using Model h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
Prep: persist_directory=db_dir_UserData does not exist, regenerating
Did not generate db since no sources
Starting get_model: h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 
/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/configuration_utils.py:483: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1714: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
device_map: {'': 0}
/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/modeling_utils.py:2193: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
bin /home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Loading checkpoint shards: 100%|████████████████████████████████| 2/2 [00:11<00:00,  5.62s/it]
Model {'base_model': 'h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3', 'tokenizer_base_model': '', 'lora_weights': '', 'inference_server': '', 'prompt_type': 'prompt_answer', 'prompt_dict': {'promptA': '', 'promptB': '', 'PreInstruct': '<|prompt|>', 'PreInput': None, 'PreResponse': '<|answer|>', 'terminate_response': ['<|prompt|>', '<|answer|>', '<|endoftext|>'], 'chat_sep': '<|endoftext|>', 'chat_turn_sep': '<|endoftext|>', 'humanstr': '<|prompt|>', 'botstr': '<|answer|>', 'generates_leading_space': False}}
Running on local URL:  http://0.0.0.0:7860

On the web interface, I entered a general question, "where to download LLM models", as a test, but got this error:

To create a public link, set `share=True` in `launch()`.
The model 'OptimizedModule' is not supported for . Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MusicgenForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM', 'RWForCausalLM'].
Did not generate db since no sources

Any idea what may be wrong?

Thanks in advance!

pseudotensor commented 1 year ago

It's not an error, just a warning; it's mentioned in the FAQ. The `Did not generate db since no sources` message is the only relevant thing here. Did you upload a PDF?
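For reference, a minimal sketch of one way to give UserData mode some sources at launch time. This assumes the `--user_path` option described in the h2ogpt README; the folder name `user_path` and the source directory are just examples:

$ mkdir -p user_path
$ cp your_docs/*.pdf user_path/    # any PDFs you want indexed (hypothetical path)
$ python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 --score_model=None --load_4bit=True --langchain_mode='UserData' --user_path=user_path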

pktCoder commented 1 year ago

Thanks @pseudotensor for looking into it. Now I tried it again and got the following error. I guess it means my RTX 3070 doesn't support 4-bit operation?

$ conda activate h2ogpt
$ cd pkgs/h2ogpt/
$ python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 --hf_embedding_model=sentence-transformers/all-MiniLM-L6-v2 --score_model=None --load_4bit=True --langchain_mode='UserData'
Using Model h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
Prep: persist_directory=db_dir_UserData does not exist, regenerating
Did not generate db since no sources
Starting get_model: h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
device_map: {'': 0}
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "/home/ji03/pkgs/h2ogpt/generate.py", line 16, in <module>
    entrypoint_main()
  File "/home/ji03/pkgs/h2ogpt/generate.py", line 12, in entrypoint_main
    fire.Fire(main)
  File "/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/ji03/pkgs/h2ogpt/src/gen.py", line 730, in main
    model0, tokenizer0, device = get_model(reward_type=False,
  File "/home/ji03/pkgs/h2ogpt/src/gen.py", line 1074, in get_model
    return get_hf_model(load_8bit=load_8bit,
  File "/home/ji03/pkgs/h2ogpt/src/gen.py", line 1197, in get_hf_model
    model = get_non_lora_model(base_model, model_loader, load_half, load_gptq, use_safetensors,
  File "/home/ji03/pkgs/h2ogpt/src/gen.py", line 894, in get_non_lora_model
    model = model_loader(
  File "/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/home/ji03/anaconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: RWForCausalLM.__init__() got an unexpected keyword argument 'load_in_4bit'

pseudotensor commented 1 year ago

I think it means your transformers package is too old. Check with `pip freeze | grep transformers`. Let me know.
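A quick way to check and, if needed, upgrade (a sketch; the 4.31.0 pin below is taken from the version that ended up working):

$ pip freeze | grep transformers    # e.g. transformers==4.28.1, which is too old
$ pip install --upgrade "transformers>=4.31.0"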

pktCoder commented 1 year ago

Thanks @pseudotensor for pointing it out. My transformers was at version 4.28.1; after upgrading it to 4.31.0, it started to work.
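For anyone who lands here with the same traceback: support for the `load_in_4bit` argument to `from_pretrained` arrived in newer transformers (around 4.30, alongside the bitsandbytes integration), which is why 4.28.1 rejected the kwarg. As a sketch, a quick sanity check after upgrading, assuming bitsandbytes and accelerate are installed as in the h2ogpt environment (this will download the model if it isn't cached; `trust_remote_code` matches the custom RWForCausalLM code path in the traceback above):

$ python -c "from transformers import AutoModelForCausalLM; AutoModelForCausalLM.from_pretrained('h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3', load_in_4bit=True, device_map='auto', trust_remote_code=True)"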