Training Stages
Stage 1: Train a language model on abstracts alone, for it to learn the vocabulary of our scientific domain.
Stage 2: Fine-tune on abstract-title pairs
Stage 3: Fine-tune further using reinforcement learning ?? Maybe with our metrics as reward function?
Training Stages Stage 1: Train a language model on abstracts alone, for it to learn the vocabulary of our scientific domain. Stage 2: Fine-tune on abstract-title pairs Stage 3: Fine-tune further using reinforcement learning ?? Maybe with our metrics as reward function?