Closed jackcook closed 2 weeks ago
LGTM to me from reading the code and the logic seems sound.
I'd be interested in a review from someone involved in https://github.com/AnswerDotAI/bert24/pull/55 -- @staghado @warner-benjamin @NohTow ?
@jackcook Looks like the original codebase had a simplified version of BertForMultipleChoice which is missing the input reshaping from the Transformers implementation. Port that over and it should work.
Nice catch, just pushed it and all the tests are passing now
This PR fixes the tests in
tests/test_glue.py
, following discussion in #55. Following up on some comments in #34, I've realized whyshuffle
was originally set toFalse
in all of our tasks: it can't be set toTrue
whensampler
is also in use (link).I also added smoketests for SuperGLUE, however the tests in this file are currently failing due to an issue involving our unpadding implementation. The issue has to do with the inputs for multiple choice tasks: e.g. when using
BertForMultipleChoice
, rather than passing in inputs of shape(B S)
(batch size, seq length), we need to be able to pass in inputs of shape(B C S)
(batch size, multiple choice options, seq length). This results in a CUDA indexing error in thetorch.gather
call on line 35 ofsrc/bert_padding.py
. I unfortunately don't have enough time to debug and fix it right now, and would really appreciate some help with it -- I figured it would be worth making the PR anyway to fix the GLUE tests in the meantime.