numenta / nupic.research

Experimental algorithms. Unsupported.
https://nupicresearch.readthedocs.io
GNU Affero General Public License v3.0

RES-2190: Fix labels set to -100 in finetuning tasks #517

Closed · benja-matic closed this 3 years ago

benja-matic commented 3 years ago

Regarding RES-2190: it looks like the HuggingFace Trainer uses args.dataloader_drop_last for both the train and eval loaders. The workaround is to flip that to False during evaluation in compute_metrics_task, then flip it back to True at the end (if it was originally True). A few minor, ignorable formatting edits will merge in as well. Finally, there's a finetuning experiment for tiny_bert50k.
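For concreteness, the toggle might look roughly like this (a minimal sketch, not the exact code in this PR; the real compute_metrics_task in this repo likely takes different arguments):

```python
# Hypothetical sketch of the workaround described above.
def compute_metrics_task(trainer, eval_dataset):
    # Trainer applies args.dataloader_drop_last to both the train and eval
    # loaders, so a True setting can silently drop the last partial eval batch.
    original_drop_last = trainer.args.dataloader_drop_last
    trainer.args.dataloader_drop_last = False  # keep every eval example
    try:
        results = trainer.evaluate(eval_dataset=eval_dataset)
    finally:
        # Restore the original setting so training behavior is unaffected.
        trainer.args.dataloader_drop_last = original_drop_last
    return results
```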

mvacaporale commented 3 years ago

Good find. The solution is simple and straightforward. Note, there could be other ways; for one, we could override get_eval_dataloader, but I think that would be a bit cumbersome. I think yours will work quite well for our needs.
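The override alternative mentioned here might look roughly like this (a hypothetical sketch against the transformers Trainer API; the subclass name is illustrative and not the approach taken in this PR):

```python
from torch.utils.data import DataLoader
from transformers import Trainer

class NoDropLastEvalTrainer(Trainer):
    """Illustrative alternative: override get_eval_dataloader so drop_last
    never applies to evaluation, regardless of the training setting."""

    def get_eval_dataloader(self, eval_dataset=None):
        dataset = eval_dataset if eval_dataset is not None else self.eval_dataset
        return DataLoader(
            dataset,
            batch_size=self.args.eval_batch_size,
            collate_fn=self.data_collator,
            drop_last=False,  # always keep the final partial eval batch
        )
```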

mvacaporale commented 3 years ago

@benja-matic Do you know how this change affects the fine-tuning results? We should probably rerun those from the README (including bert_1mi, bert_100k, sparse_80%_kd_onecycle_lr_rigl, and sparse_80%_kd_onecycle_lr).

@lucasosouza What do you think? Is this necessary? My concern is that it may be harder to contextualize new results if we don't rerun the previous fine-tuning experiments. Or, at the very least, we should rerun one or two just to make sure the results are negligibly affected.

benja-matic commented 3 years ago

> @benja-matic Do you know how this change affects the fine-tuning results? We should probably rerun those from the README (including bert_1mi, bert_100k, sparse_80%_kd_onecycle_lr_rigl, and sparse_80%_kd_onecycle_lr).
>
> @lucasosouza What do you think? Is this necessary? My concern is that it may be harder to contextualize new results if we don't rerun the previous fine-tuning experiments. Or, at the very least, we should rerun one or two just to make sure the results are negligibly affected.

@mvacaporale Don't know just yet. I'm rerunning fine-tuning on the baseline models this afternoon.