CarperAI / cheese

Used for adaptive human in the loop evaluation of language and embedding models.
MIT License
300 stars 24 forks

Saving progress for datasets #30

Closed shahbuland closed 1 year ago

shahbuland commented 1 year ago

Saving progress for datasets, namely IterablePipelines, is currently a bit clunky. The output dataset is agnostic of progress/location in the source. For the source iterator, all that is really being saved is an index into the dataset being read from. On resume, we currently just call next on the iterator repeatedly until it is back at whatever index was saved. Leaving a note here to revisit this later, as it might have unforeseen consequences at scale.
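For reference, the naive resume described above looks roughly like the sketch below. This is not the cheese code: `resume_iterator` and `saved_index` are hypothetical names made up for illustration, and the only assumption is that the source is a plain Python iterable.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def resume_iterator(source: Iterable[T], saved_index: int) -> Iterator[T]:
    """Return an iterator over `source` positioned at `saved_index`.

    Hypothetical helper for illustration only, not part of cheese.
    """
    it = iter(source)
    # Consume and discard the first `saved_index` items.
    # This is the "naive next" approach: it costs O(saved_index) on every resume.
    for _ in islice(it, saved_index):
        pass
    return it


# Usage: pretend 4 items were already processed before the last save.
rows = (f"row-{i}" for i in range(10))
resumed = resume_iterator(rows, saved_index=4)
first_after_resume = next(resumed)  # "row-4"
```

The skip is linear in the saved index, so every restart re-reads everything that was already consumed. That is the part that could get expensive at scale.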

shahbuland commented 1 year ago

#31 partially addresses this, but it needs more debugging to ensure it is consistent and fault tolerant across all pipelines.

shahbuland commented 1 year ago

Solved with #37