nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License
21 stars 9 forks source link

[Feature request] Make it easier to load a model other than from pytorch-huggingface #933

Open jeswan opened 4 years ago

jeswan commented 4 years ago

Issue by pruksmhc Monday Oct 14, 2019 at 22:59 GMT Originally opened as https://github.com/nyu-mll/jiant/issues/933


If there is another BERT-based model (such as clinicalBERT) that is currently not part of huggingface, it is very hard to add it. We should make the code more generalizable (for example, the current way we build vocabulary is quite specific to huggingface).

jeswan commented 4 years ago

Comment by sleepinyourhat Monday Oct 14, 2019 at 23:04 GMT


Quickly: In that case, why not just make a PR to Huggingface?

On Mon, Oct 14, 2019 at 3:59 PM Yada Pruksachatkun notifications@github.com wrote:

If there is another BERT-based model (such as clinicalBERT) that is currently not part of huggingface, it is very hard to add it. We should make the code more generalizable (for example, the current way we build vocabulary is quite specific to huggingface).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_nyu-2Dmll_jiant_issues_933-3Femail-5Fsource-3Dnotifications-26email-5Ftoken-3DAAJZSWNCBJBA3PLWXTQWOWLQOT2VRA5CNFSM4JAU6LMKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HRW4NPA&d=DwMCaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=sCzLyHdE8zgQwk2-sKwA1w&m=NbN67J_fedHXk-O0zqLaRfd1KlsJZFkBynpG8-M1A74&s=SNVpv9scxgVlazlF8mfciJr5qutIuADFXYLNyb69TpM&e=, or unsubscribe https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AAJZSWNFKKNCWFNVO5EN3ETQOT2VRANCNFSM4JAU6LMA&d=DwMCaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=sCzLyHdE8zgQwk2-sKwA1w&m=NbN67J_fedHXk-O0zqLaRfd1KlsJZFkBynpG8-M1A74&s=qwuN03MerhhJ1UF5kLZbUwwNoHLc8_ruPrjCZVlnGLY&e= .

jeswan commented 4 years ago

Comment by pruksmhc Thursday Oct 17, 2019 at 12:24 GMT


Because that would require understanding a second (and much larger) repo :S. I don't think this change will be too big, the tricky parts would be in handling vocab and making sure we allow input_module to be a path to a model as well as the current options.

jeswan commented 4 years ago

Comment by sleepinyourhat Monday Oct 28, 2019 at 23:20 GMT


I'm still nervous. The Huggingface repo/team has thought pretty hard about how to do this well, and I can imagine a lot of things we could get wrong if we don't prioritize this. I'd vote fairly strongly against doing this for anything that's within the reasonable scope of the Huggingface repo.

(For smaller/simpler models without checkpoints/special tokenizers/special optimizers, this shouldn't really be an issue.)