Currently, SwiftSyft has an abstract-like DataLoader class (along with the concrete implementations TensorDataLoader and MultiTensorDataLoader) that is analogous to PyTorch's torch.utils.data.DataLoader. These classes take in a collection of tensors, shuffle them, and batch them together. The SwiftSyft DataLoaders are equivalent to a PyTorch DataLoader that takes a map-style dataset, since that dataset type can be shuffled.
We also need a set of DataLoaders that can take in a lazily evaluated sequence of elements. Unlike the current DataLoader, the new LazyDataLoader will take in a Swift Sequence of TorchTensors, which cannot be shuffled (because they are lazily evaluated), and batch them using logic similar to the current SwiftSyft DataLoader implementation.
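To make the intended behavior concrete, here is a minimal sketch of lazy batching over a Swift Sequence. It is not the SwiftSyft implementation: `LazyBatches` is a hypothetical name, and `Double` stands in for TorchTensor so that the snippet is self-contained.

```swift
// Hypothetical sketch: batches any Sequence lazily, in order, without shuffling.
// Elements are pulled from the underlying iterator only when a batch is requested.
struct LazyBatches<Base: Sequence>: Sequence, IteratorProtocol {
    private var iterator: Base.Iterator
    private let batchSize: Int

    init(_ base: Base, batchSize: Int) {
        precondition(batchSize > 0, "batchSize must be positive")
        self.iterator = base.makeIterator()
        self.batchSize = batchSize
    }

    // Returns the next batch, or nil once the underlying sequence is exhausted.
    // The final batch may hold fewer than batchSize elements.
    mutating func next() -> [Base.Element]? {
        var batch: [Base.Element] = []
        batch.reserveCapacity(batchSize)
        // The count check comes first so no extra element is consumed.
        while batch.count < batchSize, let element = iterator.next() {
            batch.append(element)
        }
        return batch.isEmpty ? nil : batch
    }
}

// Usage: Double is a stand-in for TorchTensor (assumption for illustration).
let stream = (1...7).lazy.map { Double($0) }
let batches = Array(LazyBatches(stream, batchSize: 3))
```

Because the wrapper only holds an iterator, it never needs the full collection in memory, which is also why a shuffle option does not fit here.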
API
[ ] Abstract class LazyDataLoader (similar to current DataLoader but takes in a Sequence)
[ ] Concrete subclass LazyTensorDataLoader (takes a Sequence of tensors and batches them. There will be no shuffle parameter)
[ ] Concrete subclass LazyMultiTensorDataLoader (takes a Sequence of arrays of tensors and batches them. There will be no shuffle parameter)
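For the multi-tensor case, one possible reading is that each element of the Sequence is an array such as [input, label], and a batch groups the tensors by position. The sketch below illustrates only that grouping step. `collate` is a hypothetical helper, `Double` stands in for TorchTensor, and the grouping-by-position behavior is an assumption rather than something confirmed by the current MultiTensorDataLoader.

```swift
// Hypothetical helper: regroups a batch of samples by position.
// Input:  [[x1, y1], [x2, y2], ...]  (one inner array per sample)
// Output: [[x1, x2, ...], [y1, y2, ...]]  (one inner array per position)
func collate<T>(_ samples: [[T]]) -> [[T]] {
    guard let width = samples.first?.count else { return [] }
    // Assumption: every sample carries the same number of tensors.
    precondition(samples.allSatisfy { $0.count == width },
                 "all samples must have the same number of tensors")
    return (0..<width).map { position in
        samples.map { $0[position] }
    }
}

// Usage: Double is a stand-in for TorchTensor (assumption for illustration).
let samples: [[Double]] = [[1, 10], [2, 20], [3, 30]]
let grouped = collate(samples)
```

In a real loader, each per-position group would presumably be stacked into a single tensor, which this sketch leaves out.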
What alternatives have you considered?
We considered implementing all the logic in one class, but that proved too difficult using generics alone.