I tested the OpenAssistant/reward-model-deberta-v3-large-v2 model. Although the model is of the TextClassification type, its related datasets do not have the structure of a 'classification' dataset. As a result, the feature mapping stage (the _get_feature_mapping method) fails with the following errors, depending on the dataset:
/Users/mykytaalekseiev/Work/GiskardPipVersion/venv/bin/python /Users/mykytaalekseiev/Work/cicd/cli.py --loader huggingface --model OpenAssistant/reward-model-deberta-v3-large-v2 --dataset openai/summarize_from_feedback --dataset_split train --dataset_config comparisons --output ${model_name}__default_scan_with__${dataset_name}.html
Traceback (most recent call last):
  File "/Users/mykytaalekseiev/Work/cicd/cli.py", line 43, in <module>
    report = runner.run(**runner_kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/pipeline/runner.py", line 35, in run
    gsk_model, gsk_dataset = loader.load_giskard_model_dataset(**kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 53, in load_giskard_model_dataset
    feature_mapping = self._get_feature_mapping(hf_model, hf_dataset)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 128, in _get_feature_mapping
    raise RuntimeError(msg)
RuntimeError: Could not find a suitable mapping for feature for `label`.
openai/webgpt_comparisons
Traceback (most recent call last):
  File "/Users/mykytaalekseiev/Work/cicd/cli.py", line 43, in <module>
    report = runner.run(**runner_kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/pipeline/runner.py", line 35, in run
    gsk_model, gsk_dataset = loader.load_giskard_model_dataset(**kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 53, in load_giskard_model_dataset
    feature_mapping = self._get_feature_mapping(hf_model, hf_dataset)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 123, in _get_feature_mapping
    candidates = [f for f in available_features if dataset_features[f].dtype == expected_type]
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 123, in <listcomp>
    candidates = [f for f in available_features if dataset_features[f].dtype == expected_type]
AttributeError: 'dict' object has no attribute 'dtype'
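The AttributeError above suggests that at least one feature of this dataset is a nested structure (a plain dict of sub-features) rather than a flat value that exposes `dtype`. The sketch below is not the loader's actual code: `Value` is a hypothetical stand-in for a flat feature type, and `find_candidates` is an illustrative helper that mirrors the list comprehension on line 123 but skips features without a `dtype` instead of raising.

```python
from dataclasses import dataclass


@dataclass
class Value:
    """Hypothetical stand-in for a flat dataset feature that exposes `dtype`."""
    dtype: str


def find_candidates(dataset_features, available_features, expected_type):
    """Return the names of flat features whose dtype equals expected_type.

    Nested features (plain dicts or lists) have no `dtype` attribute, so
    getattr falls back to None and they are skipped instead of raising.
    """
    return [
        f
        for f in available_features
        if getattr(dataset_features[f], "dtype", None) == expected_type
    ]


# Illustrative feature layout (assumed, not copied from the real dataset).
features = {
    "question": {"id": Value("string"), "full_text": Value("string")},  # nested
    "answer_0": Value("string"),
    "score_0": Value("float32"),
}

candidates = find_candidates(features, list(features), "string")
print(candidates)
```

With this guard the nested `question` feature is ignored rather than crashing the scan. Whether nested features should be skipped or flattened is a design decision for the loader.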
Anthropic/hh-rlhf
Traceback (most recent call last):
  File "/Users/mykytaalekseiev/Work/cicd/cli.py", line 43, in <module>
    report = runner.run(**runner_kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/pipeline/runner.py", line 35, in run
    gsk_model, gsk_dataset = loader.load_giskard_model_dataset(**kwargs)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 53, in load_giskard_model_dataset
    feature_mapping = self._get_feature_mapping(hf_model, hf_dataset)
  File "/Users/mykytaalekseiev/Work/cicd/giskard_cicd/loaders/huggingface_loader.py", line 128, in _get_feature_mapping
    raise RuntimeError(msg)
RuntimeError: Could not find a suitable mapping for feature for `text`.
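For context, Anthropic/hh-rlhf stores each example as a pair of conversations in `chosen` and `rejected` columns, so there is no single `text` column for the mapping to pick up. The sketch below shows one possible way to reshape such pairs into classification-style rows. The function name and the labelling convention (1 for chosen, 0 for rejected) are assumptions made for illustration, not something the loader does today, and whether this labelling suits a reward model is a separate question.

```python
def pairs_to_classification_rows(rows):
    """Flatten preference pairs into rows with `text` and `label` keys.

    Assumed convention: the chosen response gets label 1 and the rejected
    response gets label 0.
    """
    flat = []
    for row in rows:
        flat.append({"text": row["chosen"], "label": 1})
        flat.append({"text": row["rejected"], "label": 0})
    return flat


# A made-up example pair in the chosen/rejected layout.
sample = [
    {
        "chosen": "Human: Hi\n\nAssistant: Hello! How can I help?",
        "rejected": "Human: Hi\n\nAssistant: Go away.",
    }
]

flat_rows = pairs_to_classification_rows(sample)
for r in flat_rows:
    print(r["label"], repr(r["text"]))
```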
Datasets tested: openai/summarize_from_feedback, Dahoas/instruct-synthetic-prompt-responses, openai/webgpt_comparisons, Anthropic/hh-rlhf