opentensor / validators

Repository for bittensor validators
https://www.bittensor.com/
MIT License

Dataset returning empty strings #102

Closed p-ferreira closed 1 year ago

p-ferreira commented 1 year ago

Sometimes the dataset samples a string that is effectively empty, such as \n\n. That string is then passed through every step of the validator's CoT, which produces some odd conversation flows.

This can be easily fixed by adding an "is the string empty" check before returning the text sampled from the dataset.

Some examples of conversation flows with empty strings:

'\n\nSummarize the preceding context in 5 sentences.\n\n',
'\n\nSummarize the preceding context in 4 sentences.\n\n',
'\n\nSummarize the preceding context in 6 sentences.\n\n',
'\n\nSummarize the preceding context in 7 sentences.\n\n'

Location to implement change: https://github.com/opentensor/validators/blob/e422d2a5e402e814e9dd325c4c5b5675cf976380/openvalidators/dataset.py#L30C18-L30C18
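
A minimal sketch of the check proposed above, not the actual code in dataset.py. It assumes the dataset yields dicts with a "text" key; the names is_blank, next_non_empty and max_tries are hypothetical and only used for illustration. The idea is to treat whitespace-only strings such as \n\n as empty and keep sampling until a usable text comes back.

```python
from typing import Dict, Iterator


def is_blank(text: str) -> bool:
    """Return True when text is empty or contains only whitespace."""
    # str.strip() removes spaces, tabs and newlines, so "\n\n" becomes "".
    return not text.strip()


def next_non_empty(
    samples: Iterator[Dict[str, str]],
    max_tries: int = 100,  # hypothetical cap so a bad source cannot loop forever
) -> Dict[str, str]:
    """Return the next sample whose "text" field is not blank.

    Assumes each sample is a dict with a "text" key (an assumption made
    for this sketch, not something confirmed by the issue).
    """
    for _ in range(max_tries):
        sample = next(samples)  # raises StopIteration if the source runs out
        if not is_blank(sample["text"]):
            return sample
    raise RuntimeError(f"no non-empty sample found in {max_tries} tries")


if __name__ == "__main__":
    stream = iter([{"text": "\n\n"}, {"text": "   "}, {"text": "A real paragraph."}])
    print(next_non_empty(stream))
```

Using strip() rather than comparing against "" matters here because the problematic samples are not zero-length; they consist only of newlines.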