Suppose we are keeping our datasets and models in VersionedDataset format, but we want to allow users to interface with our package without having to use our format. As long as the user can provide us with the training feature and label tensors, we want to be able to publish a VersionedDataset. Create a subclass of VersionedDatasetBuilder that uses a dummy dataset path (e.g., dne) and a dummy data processor (one that simply does nothing when any of its methods are called), and that instead takes the feature and label dictionaries as its __init__ arguments. We can publish VersionedDatasets from this representation, as long as the dataset copy strategy is always "link": override the publishing method so that it raises an error if the arguments are invalid, then calls super. Perhaps this could be called a PathlessVersionedDatasetBuilder.
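One possible shape for this exercise is sketched below. It is a minimal sketch, not the package's implementation: the VersionedDatasetBuilder shown here is a stand-in written only so the example runs on its own, and its constructor arguments, its publish signature, the features and labels attributes, and the NullDataProcessor name are all assumptions made for illustration. The real base class may differ, in which case only the subclass pattern carries over.

```python
from typing import Any, Dict

STRATEGY_LINK = "link"
STRATEGY_COPY = "copy"


class VersionedDatasetBuilder:
    """Stand-in base class (hypothetical); the real signatures may differ."""

    def __init__(self, dataset_path: str, data_processor: Any) -> None:
        self.dataset_path = dataset_path
        self.data_processor = data_processor
        # A real builder would derive these through the data processor.
        self.features: Dict[str, Any] = {}
        self.labels: Dict[str, Any] = {}

    def publish(self, path: str, version: str,
                dataset_copy_strategy: str = STRATEGY_COPY) -> Dict[str, Any]:
        # Stand-in behavior: return a record of what would be published.
        return {
            "path": path,
            "version": version,
            "dataset_copy_strategy": dataset_copy_strategy,
            "features": self.features,
            "labels": self.labels,
        }


class NullDataProcessor:
    """Data processor whose every method does nothing and returns None."""

    def __getattr__(self, name: str) -> Any:
        def _noop(*args: Any, **kwargs: Any) -> None:
            return None
        return _noop


class PathlessVersionedDatasetBuilder(VersionedDatasetBuilder):
    """Builds a dataset from in-memory feature and label dictionaries."""

    def __init__(self, features: Dict[str, Any],
                 labels: Dict[str, Any]) -> None:
        # "dne" is a dummy path; nothing is ever read from it.
        super().__init__("dne", NullDataProcessor())
        self.features = features
        self.labels = labels

    def publish(self, path: str, version: str,
                dataset_copy_strategy: str = STRATEGY_LINK) -> Dict[str, Any]:
        # There are no raw files to copy, so only "link" makes sense.
        if dataset_copy_strategy != STRATEGY_LINK:
            raise ValueError(
                "PathlessVersionedDatasetBuilder only supports the "
                f"'{STRATEGY_LINK}' copy strategy, "
                f"got '{dataset_copy_strategy}'."
            )
        return super().publish(path, version, dataset_copy_strategy)


builder = PathlessVersionedDatasetBuilder(
    features={"X_train": [[1.0, 2.0]]},
    labels={"y_train": [0]},
)
record = builder.publish("out", "v1")
```

Validating the arguments before delegating to super keeps the publishing logic in one place; the subclass only narrows which calls are allowed.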
Note that there is no analogous subclass for the VersionedModelBuilder. The VersionedModelBuilder takes a VersionedDataset, a model, and an optional TrainingConfig. It is not possible, based on our current view of VersionedModels, to have a VersionedModel without a corresponding VersionedDataset, so this argument is not optional. A VersionedModel necessarily needs a model, so that is not optional either. The TrainingConfig is already optional, so there is nothing left to make pathless.