I'm looking for an mPLUG checkpoint suitable for processing multilingual (mainly Chinese) videos. Does such a checkpoint already exist? If not, are there other model checkpoints suitable for Chinese video captioning? Alternatively, would it be possible to develop such a checkpoint for mPLUG in the future?