Is your feature request related to a problem? Please describe.
Hi, is there a reason or motivation behind the probabilities in the data files?
I am curious about making a chat-instruct bot and am considering training with a new set of probabilities.
It seems to me that you are doing weighted sampling. Is this to motivate randomness among the tasks?
Is there some way to find the proper probabilities even if we are only interested in a subset of those tasks?
Or is this just a heuristic that worked?
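To make the question concrete: the naive option I can think of is to keep the relative weights of the tasks I care about and renormalize them so they sum to 1. Below is a minimal sketch of that idea. The task names and probabilities are made up for illustration and are not taken from the data files, and `renormalize` and `sample_tasks` are hypothetical helpers, not functions from this repository.

```python
import random

def renormalize(probs, keep):
    """Restrict task probabilities to `keep` and rescale them to sum to 1."""
    subset = {task: p for task, p in probs.items() if task in keep}
    total = sum(subset.values())
    if total <= 0:
        raise ValueError("selected tasks have zero total probability")
    return {task: p / total for task, p in subset.items()}

def sample_tasks(probs, n, seed=0):
    """Draw n task names with replacement, weighted by their probabilities."""
    rng = random.Random(seed)
    tasks = list(probs)
    weights = [probs[t] for t in tasks]
    return rng.choices(tasks, weights=weights, k=n)

# Hypothetical task names and probabilities, for illustration only.
all_probs = {"chat": 0.4, "summarization": 0.3, "qa": 0.2, "translation": 0.1}

# Keep only the tasks of interest; their relative ratio (2:1) is preserved.
sub_probs = renormalize(all_probs, keep={"chat", "qa"})

# Each training example would then be drawn from a task sampled this way.
batch_tasks = sample_tasks(sub_probs, n=8)
```

This only preserves the original ratios among the selected tasks. Whether those ratios are still the right ones for a subset is exactly what I am unsure about.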
Describe the solution you'd like
How can I make a general instruct bot oriented towards a subset of the tasks (not all the tasks mentioned)?
In other words, a more refined fine-tuning, if that makes sense.
Describe alternatives you've considered
Not sure about an alternative. The papers are not very clear about this.
Additional context
None