nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License

support languages other than English #1098

Open jeswan opened 4 years ago

jeswan commented 4 years ago

Issue by ssabzzz Thursday Jun 25, 2020 at 14:13 GMT Originally opened as https://github.com/nyu-mll/jiant/issues/1098


I want to use a multilingual BERT model, which apparently is not an option in jiant (my test dataset is not in English).

Is there a way to achieve this right now?

jeswan commented 4 years ago

Comment by sleepinyourhat Thursday Jun 25, 2020 at 14:23 GMT


I believe there aren't any firm obstacles to setting up XLM-R support or non-English tasks. This isn't trivial, but if you can identify all of the RoBERTa-specific code, you should be able to set up a parallel code path for XLM-R.

Alternatively, we're finishing up the initial draft/beta release of jiant 2, which will be public in about two weeks. That will have a somewhat different interface, but it will almost certainly support XLM-R from the start.
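
For readers wondering what a "parallel code path" for XLM-R might look like, here is a minimal, self-contained sketch. It is not jiant's actual code: `ModelFamily`, `FAMILIES_BY_PREFIX`, `resolve_family`, and `format_pair` are hypothetical names made up for illustration. The only outside facts it relies on are the Hugging Face model identifiers (`roberta-base`, `xlm-roberta-base`, `bert-base-multilingual-cased`) and the special-token layout that RoBERTa and XLM-R share (`<s> A </s></s> B </s>`). The idea is that model-specific branching goes through one lookup, so adding XLM-R means adding one entry rather than editing every `if` that mentions RoBERTa.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelFamily:
    """Hypothetical record of the per-family details a toolkit branches on."""
    name: str
    cls_token: str
    sep_token: str
    double_sep_between_pairs: bool  # RoBERTa-style "</s></s>" between segments


# Assumption: these three families are enough for the illustration.
BERT = ModelFamily("bert", "[CLS]", "[SEP]", False)
ROBERTA = ModelFamily("roberta", "<s>", "</s>", True)
XLM_R = ModelFamily("xlm-roberta", "<s>", "</s>", True)  # the new parallel path

# Prefix matching on the model identifier. A substring test such as
# `"roberta" in name` would silently send XLM-R down the RoBERTa path.
FAMILIES_BY_PREFIX = [
    ("xlm-roberta-", XLM_R),
    ("roberta-", ROBERTA),
    ("bert-", BERT),
]


def resolve_family(model_name: str) -> ModelFamily:
    """Map a model identifier to its family, or fail loudly if unsupported."""
    for prefix, family in FAMILIES_BY_PREFIX:
        if model_name.startswith(prefix):
            return family
    raise ValueError(f"Unsupported model: {model_name!r}")


def format_pair(family: ModelFamily, a: List[str], b: List[str]) -> List[str]:
    """Wrap two already-tokenized segments in the family's special tokens."""
    n_seps = 2 if family.double_sep_between_pairs else 1
    return [family.cls_token, *a, *([family.sep_token] * n_seps), *b, family.sep_token]
```

With this layout, `resolve_family("bert-base-multilingual-cased")` lands on the BERT path, which is one way a multilingual BERT checkpoint could reuse existing BERT handling. Tokenizer vocabularies and model loading would still need their own handling, which this sketch leaves out.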