Status: Closed (closed by DominikVogel 2 years ago)
That is because BlearnerForSequenceClassification is part of the high-level API, which is designed for classification tasks (single-label or multilabel), not regression.
However, we can dip down into the mid-level API for a regression problem like this:
model_cls = AutoModelForSequenceClassification
pretrained_model_name = "distilroberta-base"  # alternatives: "distilbert-base-uncased", "bert-base-uncased"
hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(pretrained_model_name, model_cls=model_cls)

blocks = (HF_TextBlock(hf_arch, hf_config, hf_tokenizer, hf_model), RegressionBlock)
dblock = DataBlock(
    blocks=blocks,
    get_x=ItemGetter("review_body"),
    get_y=ItemGetter("stars"),
    splitter=RandomSplitter(seed=42),
)
dls = dblock.dataloaders(raw_ds, bs=4)

model = HF_BaseModelWrapper(hf_model)
learn = Learner(
    dls,
    model,
    opt_func=partial(OptimWrapper, opt=torch.optim.Adam),
    loss_func=MSELossFlat(),
    metrics=[rmse],
    cbs=[HF_BaseModelCallback],
    splitter=hf_splitter,
)
learn = learn.to_fp16()
learn.fit_one_cycle(1, lr_max=3e-5)
Give it a try and lmk how it goes. If you end up turning this into a blog post, it would be great to share with other folks who might be struggling with setting up blurr to work with regression. :)
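For readers new to regression setups, here is a rough sketch of what an MSE loss and an RMSE metric measure on star ratings. It is plain Python, not fastai's implementation; the function names `plain_mse` and `plain_rmse` and the sample numbers are made up for illustration.

```python
import math


def plain_mse(preds, targets):
    """Mean squared error over paired predictions and targets."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def plain_rmse(preds, targets):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(plain_mse(preds, targets))


# Hypothetical model outputs for four reviews and their true star ratings.
preds = [4.5, 1.5, 3.0, 3.0]
stars = [5.0, 1.0, 3.0, 4.0]

print(plain_mse(preds, stars))   # average squared error
print(plain_rmse(preds, stars))  # same error, expressed in units of stars
```

RMSE is often the easier number to read because it is on the same scale as the ratings themselves.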
Thanks a lot for the help! I did not realize I needed to use the mid-level API.
Your code almost works. However, it fails when I try to train the model (last line). I get the following error message:
RuntimeError: The size of tensor a (8) must match the size of tensor b (4) at non-singleton dimension 0
I tried to change the batch size to 8 but this results in a similar error:
RuntimeError: The size of tensor a (16) must match the size of tensor b (8) at non-singleton dimension 0
After studying your doc for some time I found the solution. The number of labels was not specified. The final code looks like this:
n_lbls = 1
model_cls = AutoModelForSequenceClassification
pretrained_model_name = "distilroberta-base"  # alternatives: "distilbert-base-uncased", "bert-base-uncased"
hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(
    pretrained_model_name,
    model_cls=model_cls,
    config_kwargs={'num_labels': n_lbls},
)

blocks = (HF_TextBlock(hf_arch, hf_config, hf_tokenizer, hf_model), RegressionBlock)
dblock = DataBlock(
    blocks=blocks,
    get_x=ItemGetter("review_body"),
    get_y=ItemGetter("stars"),
    splitter=RandomSplitter(seed=42),
)
dls = dblock.dataloaders(raw_ds, bs=4)

model = HF_BaseModelWrapper(hf_model)
learn = Learner(
    dls,
    model,
    opt_func=partial(OptimWrapper, opt=torch.optim.Adam),
    loss_func=MSELossFlat(),
    metrics=[rmse],
    cbs=[HF_BaseModelCallback],
    splitter=hf_splitter,
)
learn = learn.to_fp16()
learn.fit_one_cycle(1, lr_max=3e-5)
Now I will try to understand what I did with the code and adapt it to my data. I will definitely write a blog post. That's the least I owe the community :-). Thanks a lot!
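For anyone wondering why setting num_labels to 1 makes the size mismatch go away, here is one likely explanation. A sequence classification head typically defaults to two output values per example, and a flattening MSE loss then sees twice as many predictions as targets (8 vs. 4 for a batch of 4, matching the error reported above). The sketch below imitates that behavior in plain Python; `flatten` and `flat_mse` are hypothetical helpers, not blurr or fastai functions, and the numbers are invented.

```python
def flatten(batch):
    """Flatten a batch of per-example output rows into one flat list."""
    return [value for row in batch for value in row]


def flat_mse(outputs, targets):
    """MSE over flattened outputs; fails if the element counts differ."""
    flat_out = flatten(outputs)
    if len(flat_out) != len(targets):
        raise RuntimeError(
            f"size mismatch: {len(flat_out)} outputs vs {len(targets)} targets"
        )
    return sum((o - t) ** 2 for o, t in zip(flat_out, targets)) / len(targets)


targets = [5.0, 1.0, 3.0, 4.0]  # one star rating per example, batch size 4

# Two outputs per example (assumed default head): 4 x 2 = 8 values.
two_label_outputs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

# One output per example (num_labels = 1): 4 x 1 = 4 values.
one_label_outputs = [[4.0], [2.0], [3.0], [4.0]]

try:
    flat_mse(two_label_outputs, targets)
except RuntimeError as err:
    print(err)  # reports 8 outputs against 4 targets

print(flat_mse(one_label_outputs, targets))  # works: shapes line up
```

The same reasoning would explain the second error: with a batch size of 8, two outputs per example give 16 values against 8 targets.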
I'm going to close this out if all is good. lmk.
Hi there, I am a beginner and am trying to teach myself to use blurr for a regression task. My goal is to assign a score (1-5) to a sequence instead of assigning it to a category (e.g., positive or negative). As a starting point, I used the docs' code on the GLUE benchmarks (https://ohmeow.github.io/blurr/examples-high-level-api.html) and tried to adapt it. I made the following changes to the code:
raw_datasets = load_dataset("amazon_reviews_multi", "en")
learn_kwargs = { 'metrics': [rmse], 'loss_func': MSELossFlat() }
n_labels=1
)
Unfortunately, the code does not work. When I try to build the learner, I get the following error:
I would be very thankful if somebody could help me set up blurr for a regression task!
Here is the Colab notebook: https://colab.research.google.com/drive/1qCwv-nE7JxYXsWOR9gsXZfM8k4Gx32tt?usp=sharing
This is the full code: