Why does `last_layer_features = xlmr.extract_features(zh_tokens)` jump straight to `fairseq.models.roberta.hub_interface.RobertaHubInterface.extract_features`?

What is the difference between `fairseq.models.roberta.model.RobertaEncoder.extract_features` and `fairseq.models.roberta.hub_interface.RobertaHubInterface.extract_features`?
> Yep, that's what the classification heads do

_Originally posted by @lematt1991 in https://github.com/pytorch/fairseq/issues/2302#issuecomment-654237044_
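The dispatch itself is straightforward: `torch.hub.load` returns a `RobertaHubInterface`, a user-facing wrapper, so calling `xlmr.extract_features(...)` naturally lands in the wrapper's method, which then delegates to the underlying model whose encoder does the actual transformer forward pass. Below is a minimal sketch of that wrapper/encoder relationship; the class and method names mirror fairseq's, but the bodies are simplified stand-ins, not the real implementation:

```python
class RobertaEncoder:
    """Stand-in for the low-level encoder: in fairseq,
    RobertaEncoder.extract_features runs the transformer stack
    over token embeddings and returns hidden states."""

    def extract_features(self, tokens):
        # Real code: embed tokens, run them through the transformer
        # layers, and return the final hidden states.
        return [f"hidden({t})" for t in tokens]


class RobertaHubInterface:
    """Stand-in for the user-facing wrapper returned by torch.hub.load:
    its extract_features handles conveniences (device placement,
    adding a batch dimension, optionally returning all inner layer
    states) and then delegates to the wrapped model's encoder."""

    def __init__(self, encoder):
        self.encoder = encoder

    def extract_features(self, tokens):
        # Real code also validates input length and moves tensors to
        # the model's device before delegating.
        return self.encoder.extract_features(tokens)


# torch.hub.load gives you the wrapper, so this is what xlmr is:
xlmr = RobertaHubInterface(RobertaEncoder())
features = xlmr.extract_features(["<s>", "你", "好", "</s>"])
```

So the hub-interface method is the public entry point and the encoder method is the internal workhorse; the former always ends up calling the latter.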