Open XueJiang16 opened 3 months ago
Hi, thanks for your attention.
In response to question 1: Since we are comparing against the well-known LLaVA 158K set, our controlled variable is LLaVA 158K. In fact, we could combine MMR and DataOptim, or even stack in other data of proven validity, and model performance would improve further; but for the sake of a fair comparison, we did not add anything to the training set.
In response to question 2: We have run experiments on this before. In general, scores on other benchmarks do increase, but by no more than 1%, so we do not consider the improvement on general benchmarks significant.
Thanks for your reply.
So I wonder how much improvement you would get by combining MMR and DataOptim?
Also, why can less data (MMR vs. DataOptim) achieve similar performance on common benchmarks? I think this is a very interesting point.
Moreover, I would like to ask how to handle tuning an existing model with new data. Should the old instruction data and the new instruction data be trained together, or should only the new data be used?
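One common approach to the question above is to mix a "replay" subset of the old instruction data with the new data, rather than training on the new data alone, to reduce forgetting. Below is a minimal illustrative sketch; the function name, the `replay_ratio` default, and the assumption that each dataset is a flat list of examples are all hypothetical, not something from Bunny or MMR.

```python
import random

def mix_instruction_data(old_data, new_data, replay_ratio=0.5, seed=0):
    """Mix new instruction examples with a sampled subset of old ones.

    replay_ratio=0.5 means we keep roughly half as many old examples as
    new ones -- an illustrative default, not a tuned recommendation.
    """
    rng = random.Random(seed)
    # Cap the replay sample at the size of the old dataset.
    k = min(len(old_data), int(len(new_data) * replay_ratio))
    replayed = rng.sample(old_data, k)
    mixed = new_data + replayed
    rng.shuffle(mixed)  # interleave old and new examples
    return mixed
```

The alternative, training only on new data, is simpler but tends to degrade performance on tasks covered by the old data; mixing trades extra compute for retention.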
To be clear, which part of the data specifically are you referring to when you say DataOptim? There is a lot in there. As for how MMR data performs once it is added, and how it should be combined: if you are interested, could you send me an email so we can set up a meeting to discuss it?
Thanks for your help! Maybe I have some misunderstanding. My email is csxjiang@comp.hkbu.edu.hk.
Okay
Hi, I've sent the email.
This is great work! I notice that Bunny-MMR uses MMR data for instruction tuning, while the original Bunny uses DataOptim.
I have two questions:
Why not use both MMR and DataOptim for instruction tuning? By doing so, could Bunny perform better?
Bunny-MMR performs better than the original Bunny on the MMR benchmark, but how do they compare on common benchmarks?