PabloMessina opened this issue 1 year ago
@fepegar could you route that question please?
@corcra @Shruthi42 @ozan-oktay @qianchu
Could you please share your thoughts?
Hello, you can run the following:

```python
from transformers import BertLMHeadModel

model = BertLMHeadModel.from_pretrained("<cxr-bert model path>", is_decoder=True)
```

to initialise a decoder model, and then you can fine-tune this model for generation.
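For conditional generation (e.g. conditioning on image features or a vector of labels), here is a minimal sketch of how this could be wired up with cross-attention. The checkpoint path placeholder, the 14-dimensional binary label vector, and the linear projection are illustrative assumptions, not part of this repository:

```python
import torch
from transformers import BertLMHeadModel, BertTokenizer

# Load CXR-BERT as a decoder with cross-attention so it can attend to an
# arbitrary conditioning signal. The cross-attention weights are not in the
# checkpoint, so transformers will initialise them randomly (with a warning)
# and they are learned during fine-tuning.
model = BertLMHeadModel.from_pretrained(
    "<cxr-bert model path>",  # placeholder path, as above
    is_decoder=True,
    add_cross_attention=True,
)
tokenizer = BertTokenizer.from_pretrained("<cxr-bert model path>")  # or the matching tokenizer for your checkpoint

# Hypothetical conditioning input: a vector of 14 binary finding labels,
# projected to the decoder's hidden size and treated as a one-token "encoder" output.
label_vec = torch.randint(0, 2, (1, 14)).float()
proj = torch.nn.Linear(14, model.config.hidden_size)
encoder_hidden_states = proj(label_vec).unsqueeze(1)  # (batch, 1, hidden)

# One fine-tuning step with a causal LM loss on the target report text.
inputs = tokenizer("Heart size is normal. Lungs are clear.", return_tensors="pt")
outputs = model(
    **inputs,
    encoder_hidden_states=encoder_hidden_states,
    labels=inputs["input_ids"],
)
outputs.loss.backward()
```

At inference time you can decode autoregressively, feeding the same `encoder_hidden_states` at each step.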
I have many experiments in mind where I need to condition a Transformer Decoder on some input (e.g. image features, discrete binary labels, a one-hot vector representing some concept, a question, etc.) in order to generate an output (e.g. a report, an answer). I have already implemented many of these ideas with my own custom Transformer Decoder built on PyTorch's standard implementation. However, I would now like to leverage existing pre-trained language models instead of my custom implementation, which always starts from scratch.

Thus, I was wondering whether there is an easy way to adapt CXR-BERT (or any other model you would recommend) for text generation conditioned on some input. For example, suppose I have a binary vector encoding certain information and I want to fine-tune CXR-BERT to generate a paragraph verbalizing that information. The paragraph could be, for instance, a radiology report, so fine-tuning a pre-trained model like CXR-BERT for report generation should outperform a custom PyTorch Transformer Decoder trained from scratch.
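For concreteness, here is a simplified sketch of what my custom decoder currently does; the dimensions, names and random inputs are purely illustrative:

```python
import torch
import torch.nn as nn

# Simplified version of my current from-scratch setup: the conditioning input
# (e.g. image features or a binary label vector) is projected and used as the
# `memory` that a standard nn.TransformerDecoder cross-attends to.
d_model, vocab_size, cond_dim = 512, 30522, 14

embed = nn.Embedding(vocab_size, d_model)
cond_proj = nn.Linear(cond_dim, d_model)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True),
    num_layers=6,
)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, 20))          # target report tokens
condition = torch.randint(0, 2, (1, cond_dim)).float()  # e.g. binary labels
memory = cond_proj(condition).unsqueeze(1)              # (batch, 1, d_model)

# Causal mask so each position only attends to earlier tokens.
seq_len = tokens.size(1)
causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

hidden = decoder(embed(tokens), memory, tgt_mask=causal_mask)
logits = lm_head(hidden)                                # (batch, seq_len, vocab_size)
```

Essentially, I would like to replace the randomly initialised embedding, decoder and LM head above with a pre-trained model such as CXR-BERT.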
Questions:

1. Is there an easy way to adapt CXR-BERT for text generation conditioned on an arbitrary input?
2. If CXR-BERT is not well suited to this, which pre-trained model would you recommend instead?
Thank you very much in advance.