Sometimes the dataset samples an effectively empty string (whitespace only), such as \n\n, which is then passed through every step of the validator's CoT and produces odd conversation flows.
This can be fixed by adding an "is string empty" check, one that also treats whitespace-only text as empty, before returning the text sampled from the dataset.
Some examples of conversation flows with empty strings:
'\n\nSummarize the preceding context in 5 sentences.\n\n',
'\n\nSummarize the preceding context in 4 sentences.\n\n',
'\n\nSummarize the preceding context in 6 sentences.\n\n',
'\n\nSummarize the preceding context in 7 sentences.\n\n'
Location to implement change: https://github.com/opentensor/validators/blob/e422d2a5e402e814e9dd325c4c5b5675cf976380/openvalidators/dataset.py#L30C18-L30C18
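
A minimal sketch of the proposed check, not the actual code in dataset.py. The names `is_blank`, `non_blank_samples`, and the `"text"` key are hypothetical stand-ins for illustration, and the real dataset class may be structured differently. The idea is to keep drawing samples until one has real content:

```python
from typing import Iterable, Iterator


def is_blank(text: str) -> bool:
    """Return True when the text is empty or contains only whitespace."""
    return not text.strip()


def non_blank_samples(samples: Iterable[dict], key: str = "text") -> Iterator[str]:
    """Yield only the sampled texts that have real content.

    `samples` and `key` are assumptions standing in for the dataset
    iterator and its text field; adapt them to the actual dataset class.
    """
    for sample in samples:
        text = sample.get(key) or ""  # treat a missing or None field as empty
        if is_blank(text):
            continue  # skip "", "\n\n", "   ", and similar
        yield text


# Usage with made-up rows
rows = [{"text": "\n\n"}, {"text": "Some real context."}, {"text": ""}]
print(list(non_blank_samples(rows)))  # ['Some real context.']
```

Using `strip()` rather than comparing against `""` matters here, because the problematic samples are not strictly empty: they contain only newlines.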