Open AlbusChen opened 1 month ago
Thank you for your interest in our work! Currently, our code supports only T5-based models, so causal models may not function correctly. Supporting causal models requires different data preprocessing and generation code, which is not yet included in our repository. We will update the repository as soon as possible to include these features. Thank you for your understanding and patience!
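The preprocessing and decoding differences mentioned above can be sketched roughly as follows. These are hypothetical helpers, not code from this repo, and they omit model specifics such as TinyLlama's chat template; the key points are that a decoder-only model trains on prompt and answer concatenated into one sequence (with the prompt masked out of the loss), and that its `generate()` output echoes the prompt, which must be sliced off before decoding:

```python
IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss in most trainers

def seq2seq_features(prompt_ids, answer_ids):
    """T5-style encoder-decoder: prompt and answer are separate sequences."""
    return {"input_ids": prompt_ids, "labels": answer_ids}

def causal_features(prompt_ids, answer_ids, eos_id):
    """Decoder-only (e.g. TinyLlama, Pythia): prompt and answer form ONE
    sequence; prompt positions are masked so only the answer is learned."""
    input_ids = prompt_ids + answer_ids + [eos_id]
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids + [eos_id]
    return {"input_ids": input_ids, "labels": labels}

def strip_prompt(generated_ids, prompt_len):
    """A causal LM's generate() output starts with the prompt tokens;
    slice them off before decoding, unlike a T5 decoder output."""
    return generated_ids[prompt_len:]
```

Feeding a causal model T5-style features (or decoding its full output without stripping the prompt) can plausibly produce the kind of scrambled text reported below, though other causes (tokenizer mismatch, padding side) are possible too.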
Hi,
I am trying to use this framework with causal models such as LLaMA-based models and other LLMs. In my case, I replaced the T5 model in the original pipeline with TinyLlama and Pythia (TinyLlama-1.1B-Chat-v1.0 and EleutherAI/pythia-1.4b).
However, after replacing the model and running through all the steps provided in the code, i.e. using the reasoning from GPT to fine-tune a smaller model (in this case TinyLlama and Pythia) and also using the external knowledge from the KB, the responses of the fine-tuned model are unreadable and perform badly. For example, on the MedQA dataset, with the Wikipedia KB and the provided reranker, the distilled model responds with text like
"A correct: that5). ( answer5C is the to of answer is the A is root ( A:). C, - A
also: for:. C, :) is of: isC.: 1 C a ThereforeC =:),,, C"
which is not a readable sentence and definitely fails at the task. I want to know why this happens, and I hope you can give me some possible explanations.
Other details: