Closed by canyilmaz90 5 years ago
I do not think that this is possible. With fine-tuning you can extend the character set or change the probability distribution, but I doubt that you can exclude characters that are present in the original model you continue from.
Thank you for replying. I couldn't find any whitelisting option either. Fortunately, if you have a large amount of training data, the fine-tuned model drops the unnecessary characters by itself.
Thanks again!
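For anyone who wants to check the observation above on their own output, here is a minimal sketch that counts characters falling outside a wanted set and, as a fallback, strips them after recognition. It is plain Python and does not call Tesseract. The `ALLOWED` set and the function names `unwanted_chars` and `strip_unwanted` are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical allowed character set; replace it with your own.
ALLOWED = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ")


def unwanted_chars(text, allowed=ALLOWED):
    """Count characters in `text` that are not in `allowed` (newlines are ignored)."""
    return Counter(ch for ch in text if ch not in allowed and ch != "\n")


def strip_unwanted(text, allowed=ALLOWED):
    """Return `text` with every character outside `allowed` removed (newlines are kept)."""
    return "".join(ch for ch in text if ch in allowed or ch == "\n")


if __name__ == "__main__":
    # Stand-in for OCR output produced by the fine-tuned model.
    sample = "Hello, w0rld!\nplain text"
    print(unwanted_chars(sample))
    print(strip_unwanted(sample))
```

If the counter comes back empty on a representative set of pages, the model has effectively stopped producing the unwanted characters. Otherwise the second function can serve as a post-processing filter, at the cost of silently deleting misrecognized characters rather than correcting them.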
Is it possible to fine-tune only for the characters that exist in your own character set? To be clearer, I have a dataset consisting of Latin characters that are a subset of the original English language data, and I don't want the model to output the other characters or symbols. Can I still do that by fine-tuning with START_MODEL? @kba @wrznr @shreeshrii