adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning
https://docs.adapterhub.ml

fx tracing for heads is not supported #390

Open WeiHao97 opened 2 years ago

WeiHao97 commented 2 years ago

Environment info

Information

Model I am using (Bert, XLNet ...): Roberta

Language I am using the model on (English, Chinese ...): N/A

Adapter setup I am using (if any): RobertaAdapterModel

The problem arises when using:

The tasks I am working on are:

To reproduce

Steps to reproduce the behavior:

  1. First, run this script; you will get the following error:

```python
from transformers.utils import fx
from transformers import AutoAdapterModel
import inspect
import torch  # needed for torch.fx.GraphModule below

model = AutoAdapterModel.from_pretrained("roberta-base")
adapter_name = model.load_adapter("qa/squad2@ukp", config="houlsby")
model.set_active_adapters(adapter_name)

input_names = ["input_ids", "attention_mask"]
sig = inspect.signature(model.forward)
concrete_args = {p.name: None for p in sig.parameters.values() if p.name not in input_names}
tracer = fx.HFTracer()
traced_graph = tracer.trace(model, concrete_args=concrete_args)
traced = torch.fx.GraphModule(model, traced_graph)
```


```
---------------------------------------------------------------------------
TraceError                                Traceback (most recent call last)
Input In [3], in <cell line: 13>()
     11 concrete_args = {p.name: None for p in sig.parameters.values() if p.name not in input_names}
     12 tracer = fx.HFTracer()
---> 13 traced_graph = tracer.trace(model, concrete_args=concrete_args)
     14 traced = torch.fx.GraphModule(model, traced_graph)

File ~/anaconda3/lib/python3.9/site-packages/transformers/utils/fx.py:475, in HFTracer.trace(self, root, concrete_args, method_names)
    471 self._autowrap_function_ids.update(set([id(f) for f in autowrap_functions]))
    473 self._patch_leaf_functions_for_root(root)
--> 475 self.graph = super().trace(root, concrete_args=concrete_args)
    477 self._patch_leaf_functions_for_root(root, restore=True)
    479 _reset_tensor_methods(self.original_methods)

File ~/anaconda3/lib/python3.9/site-packages/torch/fx/_symbolic_trace.py:615, in Tracer.trace(self, root, concrete_args)
    613 for module in self._autowrap_search:
    614     _autowrap_check(patcher, module.__dict__, self._autowrap_function_ids)
--> 615 self.create_node('output', 'output', (self.create_arg(fn(*args)),), {},
    616                  type_expr=fn.__annotations__.get('return', None))
    618 self.submodule_paths = None
    620 return self.graph

File ~/anaconda3/lib/python3.9/site-packages/transformers/adapters/models/roberta.py:84, in RobertaAdapterModel.forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict, head, **kwargs)
     81 pooled_output = outputs[1]
     83 if head or AdapterSetup.get_context_head_setup() or self.active_head:
---> 84     head_outputs = self.forward_head(
     85         head_inputs,
     86         head_name=head,
     87         attention_mask=attention_mask,
     88         return_dict=return_dict,
     89         pooled_output=pooled_output,
     90         **kwargs,
     91     )
     92     return head_outputs
     93 else:
     94     # in case no head is used just return the output of the base model (including pooler output)

File ~/anaconda3/lib/python3.9/site-packages/torch/fx/proxy.py:248, in Proxy.__iter__(self)
    245 if inst.opname == 'UNPACK_SEQUENCE':
    246     return (self[i] for i in range(inst.argval))  # type: ignore[index]
--> 248 return self.tracer.iter(self)

File ~/anaconda3/lib/python3.9/site-packages/torch/fx/proxy.py:161, in TracerBase.iter(self, obj)
    154 @compatibility(is_backward_compatible=True)
    155 def iter(self, obj: 'Proxy') -> Iterator:
    156     """Called when a proxy object is being iterated over, such as
    157     when used in control flow. Normally we don't know what to do because
    158     we don't know the value of the proxy, but a custom tracer can attach more
    159     information to the graph node using create_node and can choose to return an iterator.
    160     """
--> 161 raise TraceError('Proxy object cannot be iterated. This can be '
    162                  'attempted when the Proxy is used in a loop or'
    163                  ' as a *args or **kwargs function argument. '
    164                  'See the torch.fx docs on pytorch.org for a '
    165                  'more detailed explanation of what types of '
    166                  'control flow can be traced, and check out the'
    167                  ' Proxy docstring for help troubleshooting '
    168                  'Proxy iteration errors')

TraceError: Proxy object cannot be iterated. This can be attempted when the Proxy is used in a loop or as a *args or **kwargs function argument. See the torch.fx docs on pytorch.org for a more detailed explanation of what types of control flow can be traced, and check out the Proxy docstring for help troubleshooting Proxy iteration errors
```

2. The above error comes from the line `if head or AdapterSetup.get_context_head_setup() or self.active_head:`, so I added `head` to `input_names` so that it is not fixed to `None`, and got another error:
```python
from transformers.utils import fx
from transformers import AutoAdapterModel
import inspect
import torch  # needed for torch.fx.GraphModule below

model = AutoAdapterModel.from_pretrained("roberta-base")
adapter_name = model.load_adapter("qa/squad2@ukp", config="houlsby")
model.set_active_adapters(adapter_name)

input_names = ["input_ids", "attention_mask", "head"]
sig = inspect.signature(model.forward)
concrete_args = {p.name: None for p in sig.parameters.values() if p.name not in input_names}
tracer = fx.HFTracer()
traced_graph = tracer.trace(model, concrete_args=concrete_args)
traced = torch.fx.GraphModule(model, traced_graph)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [4], in <cell line: 13>()
     11 concrete_args = {p.name: None for p in sig.parameters.values() if p.name not in input_names}
     12 tracer = fx.HFTracer()
---> 13 traced_graph = tracer.trace(model, concrete_args=concrete_args)
     14 traced = torch.fx.GraphModule(model, traced_graph)

File ~/anaconda3/lib/python3.9/site-packages/transformers/utils/fx.py:467, in HFTracer.trace(self, root, concrete_args, method_names)
    464 sig = inspect.signature(root.forward)
    465 input_names = sig.parameters.keys() - concrete_args.keys()
--> 467 self.record(root, input_names, method_names=method_names)
    469 # TODO: adapt the way leaf function are wrapped with the "autowrap function" feature from Tracer.
    470 autowrap_functions = [patched for (_, _, patched) in self._leaf_functions_register.values()]

File ~/anaconda3/lib/python3.9/site-packages/transformers/utils/fx.py:423, in HFTracer.record(self, model, input_names, method_names)
    420 cache_names, original_methods = self._monkey_patch_tensor_methods_for_model_recording(model, method_names)
    421 self.original_methods = original_methods
--> 423 model(**inputs)
    425 _reset_tensor_methods(original_methods)
    427 self.recorded_methods = {
    428     method_name: cache_name for method_name, cache_name in cache_names.items() if hasattr(model, cache_name)
    429 }

File ~/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File ~/anaconda3/lib/python3.9/site-packages/transformers/adapters/models/roberta.py:83, in RobertaAdapterModel.forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict, head, **kwargs)
     80     head_inputs = outputs
     81 pooled_output = outputs[1]
---> 83 if head or AdapterSetup.get_context_head_setup() or self.active_head:
     84     head_outputs = self.forward_head(
     85         head_inputs,
     86         head_name=head,
   (...)
     90         **kwargs,
     91     )
     92     return head_outputs

File ~/anaconda3/lib/python3.9/site-packages/transformers/utils/fx.py:323, in HFTracer._wrap_method_for_model_recording.<locals>.wrapped(*args, **kwargs)
    321     setattr(model, cache_name, [])
    322 cache = getattr(model, cache_name)
--> 323 res = method(*args, **kwargs)
    324 cache.append(res)
    325 return res

RuntimeError: Boolean value of Tensor with more than one value is ambiguous
```

Expected behavior

This happens because a tensor (rather than a Python bool) is being truth-tested in the `if` statement. So it seems to me that this repo does not support fx tracing for models with heads, while the transformers repo does, because its model classes do not contain this `if` statement.
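
For reference, here is a minimal sketch (not part of the original report) of why the check fails, plus a rough workaround under the assumption that tracing the plain, head-free `RobertaModel` from transformers is acceptable; the exact `symbolic_trace` keyword arguments may vary between transformers versions:

```python
import torch
from transformers import RobertaModel
from transformers.utils.fx import symbolic_trace

# 1. The condition in RobertaAdapterModel.forward ends up truth-testing a tensor.
#    PyTorch rejects this for multi-element tensors even in eager mode:
try:
    if torch.ones(2, 3):
        pass
except RuntimeError as err:
    print(err)  # "Boolean value of Tensor with more than one value is ambiguous"

# 2. Possible workaround (an assumption, not a confirmed fix): trace the head-free
#    RobertaModel from transformers, whose forward has no such `if` on a tensor.
model = RobertaModel.from_pretrained("roberta-base")
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(type(traced))  # torch.fx.GraphModule
```

This obviously drops the adapter heads; the point is only to illustrate that the blocker is the data-dependent `if` on a tensor/Proxy, not fx tracing itself.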

adapter-hub-bert commented 2 years ago

This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.

adapter-hub-bert commented 1 year ago

This issue has been automatically marked as stale because it has been without activity for 90 days. This issue will be closed in 14 days unless you comment or remove the stale label.