amazon-science / RefChecker

RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.
Apache License 2.0

A few questions about RefChecker and the RefChecker Demo website #1

Open · Romanzhang2024 opened this issue 6 months ago

Romanzhang2024 commented 6 months ago

1. Using the command-line interface to run RefChecker in a console:

refchecker-cli extract \
    --input_path example/example_in.json \
    --output_path example/example_out_triplets.json \
    --extractor_name claude2 \
    --extractor_max_new_tokens 1000 \
    --anthropic_key "example/anthropic_key"

(I have already applied for an ANTHROPIC_API_KEY on the site https://app.nightfall.ai/developer-platform/api-keys.)

This prints the following error log, and I do not know how to solve the problem:

anthropic.PermissionDeniedError: Error code: 403 - {'error': {'type': 'forbidden', 'message': 'Request not allowed'}}

2. RefChecker Demo: I used the RefChecker Demo website and got the following error:

Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "/root/RefChecker-main/demo/main.py", line 48, in <module>
    assert os.environ.get('SERPER_API_KEY')
AssertionError

In China, I cannot access Google and I do not have a SERPER_API_KEY. I would like to use a domestic search engine, or no search engine at all (similar to the refchecker-cli command-line usage, where the input JSON file already contains the reference content).

Do you have any suggestions (e.g., how to modify the Python code) so I can quickly use the RefChecker Demo website to test some examples?

HuXiangkun commented 6 months ago

Hi,

  1. The value for --anthropic_key is the path to a file that stores the API key (see the sketch below);
  2. Please remove --enable_search from the command; then it will not ask you to provide the key.
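A minimal sketch of preparing that key file (assuming it contains nothing but the raw key string):

# Sketch: write the raw Anthropic API key into the file that --anthropic_key
# points to. Replace the placeholder below with your real key.
from pathlib import Path

key_file = Path("example/anthropic_key")
key_file.parent.mkdir(parents=True, exist_ok=True)
key_file.write_text("sk-ant-xxxxxxxx")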
Romanzhang2024 commented 6 months ago

1. Command-line parameter: --anthropic_key "example/anthropic_key" is indeed the path to the file that stores the API key. "./example/anthropic_key" is a file containing the API key that I already applied for on the site https://app.nightfall.ai/developer-platform/api-keys. Please take another look at the corresponding error log above, thanks.
2. RefChecker Demo: I have removed --enable_search from the command. The new command is:

streamlit run demo/main.py --server.port=8588 -- --extractor=claude2 --checker=claude2

I got the following error log and do not know how to solve it (in China, I cannot access Hugging Face):

File "/root/RefChecker-main/demo/main.py", line 77, in init self.model = AutoModelForSequenceClassification.from_pretrained( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/anaconda3/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 526, in from_pretrained config, kwargs = AutoConfig.from_pretrained( ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/anaconda3/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/anaconda3/lib/python3.11/site-packages/transformers/configuration_utils.py", line 644, in get_config_dict config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/anaconda3/lib/python3.11/site-packages/transformers/configuration_utils.py", line 699, in _get_config_dict resolved_config_file = cached_file( ^^^^^^^^^^^^ File "/root/anaconda3/lib/python3.11/site-packages/transformers/utils/hub.py", line 429, in cached_file raise EnvironmentError( OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like princeton-nlp/sup-simcse-roberta-large is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

HuXiangkun commented 6 months ago

I see. For the first issue, I don't know the exact reason why Anthropic banned your API key. Does it work in your other applications?
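For a quick sanity check outside RefChecker, something like the following sketch (using the anthropic Python SDK; the model name here is only an example) should either reproduce the 403 or confirm the key itself is fine:

# Sketch: call the Anthropic API directly with the key stored in the file,
# independent of RefChecker.
import anthropic

with open("example/anthropic_key") as f:
    api_key = f.read().strip()

client = anthropic.Anthropic(api_key=api_key)
resp = client.messages.create(
    model="claude-2.1",  # example model name; adjust as needed
    max_tokens=32,
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.content[0].text)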

For the second issue, I think this is due to the limited access to Hugging Face in China. The demo uses a SimCSE model for source attribution. Maybe you can first try a VPN as a quick fix. We will add ModelScope support for users in China.
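Roughly, SimCSE-based source attribution works as in the sketch below (an illustration, not the demo's exact code): embed the claim and each reference sentence, then attribute the claim to the most similar sentence. The model name is the one the demo tries to load.

# Sketch: SimCSE-style source attribution via sentence embeddings and
# cosine similarity (illustrative only).
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "princeton-nlp/sup-simcse-roberta-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        # SimCSE commonly uses the pooled [CLS] representation as the sentence embedding
        return model(**batch).pooler_output

claim = "The Eiffel Tower is in Paris."
references = ["The Eiffel Tower is located in Paris, France.", "The Louvre is a museum."]
scores = torch.nn.functional.cosine_similarity(embed([claim]), embed(references))
print(references[int(scores.argmax())])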

Romanzhang2024 commented 6 months ago

First, thanks for your reply!

1. The following description, "How to Use an Anthropic Claude API Key", is from www.nightfall.ai: to use an Anthropic Claude API key, you must include it in the HTTP header of all requests to the Anthropic Claude API. The header should be named "Authorization" and the value should be "Bearer " followed by your API key.

I haven't tried the anthropic_key in other applications yet.

2. Regarding "The demo uses a SimCSE model for source attribution. Maybe you can first try a VPN as a quick fix.": I would like to know how to use the SimCSE model for source attribution, and how to use a VPN to access Hugging Face in a Linux environment. Could you provide further guidance? Thank you!

When do you plan to add ModelScope support for users in China?

Romanzhang2024 commented 4 months ago

The following errors and exceptions occur when running RefChecker from the command line (refchecker-cli). Please help me find the cause, thank you. Errors and exceptions:

(base) root@Ubuntu18:~/RefChecker-main# refchecker-cli extract --input_path example/example_in.json --output_path example/example_out_triplets.json --extractor_name mixtral
request.method: HEAD
url: /mistralai/Mixtral-8x7B-Instruct-v0.1/resolve/main/config.json
request.body: None
self.max_retries: Retry(total=0, connect=None, read=False, redirect=None, status=None)
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py", line 468, in _make_request
    self._validate_conn(conn)
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1097, in _validate_conn
    conn.connect()
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connection.py", line 642, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connection.py", line 783, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 471, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 515, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/root/anaconda3/lib/python3.11/ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
  File "/root/anaconda3/lib/python3.11/ssl.py", line 1108, in _create
    self.do_handshake()
  File "/root/anaconda3/lib/python3.11/ssl.py", line 1379, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLEOFError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py", line 791, in urlopen
    response = self._make_request(
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py", line 492, in _make_request
    raise new_e
urllib3.exceptions.SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.11/site-packages/requests/adapters.py", line 490, in send
    resp = conn.urlopen(
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/connectionpool.py", line 845, in urlopen
    retries = retries.increment(
  File "/root/anaconda3/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: SOCKSHTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /mistralai/Mixtral-8x7B-Instruct-v0.1/resolve/main/config.json (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/bin/refchecker-cli", line 8, in <module>
    sys.exit(main())
  File "/root/RefChecker-main/refchecker/cli.py", line 117, in main
    extract(args)
  File "/root/RefChecker-main/refchecker/cli.py", line 138, in extract
    extractor = MixtralExtractor()
  File "/root/RefChecker-main/refchecker/extractor/mixtral_extractor.py", line 90, in __init__
    self.llm = LLM(
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 105, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 304, in from_engine_args
    engine_configs = engine_args.create_engine_configs()
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 218, in create_engine_configs
    model_config = ModelConfig(self.model, self.tokenizer,
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/config.py", line 101, in __init__
    self.hf_config = get_config(self.model, trust_remote_code, revision)
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/transformers_utils/config.py", line 23, in get_config
    config = AutoConfig.from_pretrained(
  File "/root/anaconda3/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1111, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/transformers/configuration_utils.py", line 633, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/transformers/configuration_utils.py", line 688, in _get_config_dict
    resolved_config_file = cached_file(
  File "/root/anaconda3/lib/python3.11/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1238, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
    r = _request_wrapper(
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
    response = _request_wrapper(
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 408, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/root/anaconda3/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 67, in send
    return super().send(request, *args, **kwargs)
  File "/root/anaconda3/lib/python3.11/site-packages/requests/adapters.py", line 521, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: (MaxRetryError("SOCKSHTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /mistralai/Mixtral-8x7B-Instruct-v0.1/resolve/main/config.json (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))"), '(Request ID: 79a6ccc2-339a-467e-b730-fcdc18be8724)')

Romanzhang2024 commented 4 months ago

Please help solve the problems above. Thank you.

HuXiangkun commented 4 months ago

Hi @Romanzhang2024 , this error is due to restricted access to Hugging Face. You can try the HF mirror site, https://hf-mirror.com/ . Specifically, if you are using Linux, execute the following command before running your code:

export HF_ENDPOINT=https://hf-mirror.com

You can find more details at https://hf-mirror.com/ .
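If it is more convenient to set this from Python, a rough equivalent (reusing the model name from your log, and fetching just config.json to verify connectivity) is:

# Sketch: point huggingface_hub at the mirror, then fetch the same config.json
# that failed in the log above. HF_ENDPOINT must be set before huggingface_hub
# is imported so that it is picked up.
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import hf_hub_download

path = hf_hub_download("mistralai/Mixtral-8x7B-Instruct-v0.1", "config.json")
print("downloaded to", path)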

Romanzhang2024 commented 4 months ago

First, thanks for your reply. After setting the mirror-address environment variable (export HF_ENDPOINT=https://hf-mirror.com), the following new problem occurred while running the program. Please take a look, thanks.

New problem:

(base) root@Ubuntu18:~/RefChecker-main/etc# refchecker-cli extract --input_path example/example_in.json --output_path example/example_out_triplets.json --extractor_name mistral
request.method: HEAD
url: /dongyru/Mistral-7B-Claim-Extractor/resolve/main/config.json
request.body: None
self.max_retries: Retry(total=0, connect=None, read=False, redirect=None, status=None)
Traceback (most recent call last):
  File "/root/anaconda3/bin/refchecker-cli", line 8, in <module>
    sys.exit(main())
  File "/root/RefChecker-main/refchecker/cli.py", line 117, in main
    extract(args)
  File "/root/RefChecker-main/refchecker/cli.py", line 140, in extract
    extractor = MistralExtractor()
  File "/root/RefChecker-main/refchecker/extractor/mistral_extractor.py", line 30, in __init__
    self.llm = LLM(
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 105, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 307, in from_engine_args
    placement_group = initialize_cluster(parallel_config)
  File "/root/anaconda3/lib/python3.11/site-packages/vllm/engine/ray_utils.py", line 87, in initialize_cluster
    assert parallel_config.world_size == 1, ("Ray is required if parallel_config.world_size > 1.")
AssertionError: Ray is required if parallel_config.world_size > 1.

HuXiangkun commented 4 months ago

Hi @Romanzhang2024 , I cannot reproduce this error. Could you provide more information about the server you are using (e.g., the number of GPUs)?
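For example, the output of a quick check like this sketch would help:

# Sketch: report what vLLM will see on this machine.
import torch

print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())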

Romanzhang2024 commented 4 months ago

At present, my test server does not have any GPUs. Is this problem caused by the server not having any GPUs? I will switch to a test server with GPUs soon.

Romanzhang2024 commented 4 months ago

At present, my new test server has 2 GPUs (NVIDIA A800). A new issue follows (checker_name alignscore):

root@test-SYS-420GP-TNR:~/RefChecker# refchecker-cli check --input_path example/example_out_triplets.json --output_path example/example_out.json --checker_name alignscore --aggregator_name soft

/root/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_fields.py:151: UserWarning: Field "model_server_url" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
  warnings.warn(
/root/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_config.py:322: UserWarning: Valid config keys have changed in V2: