Open hhoeflin opened 1 year ago
Just wanted to ping about this issue. Would be great to hear the development team's perspective. Even after looking into it more, it still appears to me that most of the functionality provided could be exposed as individual functions.
Would be great to know if I am missing or misunderstanding something about the functionality of torchdata.
Thanks
🚀 The feature
For `IterDataPipe`, `.map` maps a function over the items of an iterable, where the function has the form `f: Any -> Any`. Other basic building blocks could be `.pipe`, `.iter_map` and `.consume`, where

- `.pipe` would take `f: Iterable -> Iterable`
- `.iter_map` would take `f: Any -> Iterable`
- `.consume` would take `f: Iterable -> Any`
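
To make the four signatures concrete, here is a minimal sketch in plain Python. `FunctionalPipe` is a hypothetical stand-in for `IterDataPipe` and is not part of torchdata. The reading of `.iter_map` as a flat-map (each item yields an iterable, and the results are chained) is an assumption, since the issue only gives its signature.

```python
from typing import Any, Callable, Iterable, Iterator


class FunctionalPipe:
    """Hypothetical stand-in for IterDataPipe (not a torchdata class)."""

    def __init__(self, make_iter: Callable[[], Iterator[Any]]) -> None:
        # Store a factory so the pipe can be iterated more than once.
        self._make_iter = make_iter

    @classmethod
    def from_iterable(cls, source: Iterable[Any]) -> "FunctionalPipe":
        return cls(lambda: iter(source))

    def __iter__(self) -> Iterator[Any]:
        return self._make_iter()

    def map(self, f: Callable[[Any], Any]) -> "FunctionalPipe":
        # f: Any -> Any, applied lazily to each item.
        return FunctionalPipe(lambda: (f(x) for x in self))

    def pipe(self, f: Callable[[Iterable[Any]], Iterable[Any]]) -> "FunctionalPipe":
        # f: Iterable -> Iterable, applied to the stream as a whole.
        return FunctionalPipe(lambda: iter(f(self)))

    def iter_map(self, f: Callable[[Any], Iterable[Any]]) -> "FunctionalPipe":
        # f: Any -> Iterable; assumed here to flatten the per-item results.
        return FunctionalPipe(lambda: (y for x in self for y in f(x)))

    def consume(self, f: Callable[[Iterable[Any]], Any]) -> Any:
        # f: Iterable -> Any, a terminal step that leaves the pipe.
        return f(self)
```

With this sketch, `FunctionalPipe.from_iterable([1, 2, 3]).pipe(enumerate)` yields index/item pairs, and `.consume(sum)` reduces the stream to a single value.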
Motivation, pitch
Such an approach would allow for more flexible functional programming and would reduce most currently provided `IterDataPipe` classes to a simple function call. For example, the `Enumerator` class would become a single functional call. This would immediately make all itertools functions usable in this context.
The `TarArchiveLoader` could be rewritten in the same way. I believe that with this approach almost all provided classes could be written with less boilerplate using generator functions (essentially just writing the code inside `__iter__` as a standalone generator function, possibly curried for convenience if other parameters are being used).

Would be great to hear if this was considered. Thanks!
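
As an illustration of the generator-function idea, the sketch below writes an enumerating stage as a standalone generator, curries its extra parameter with `functools.partial`, and chains it with an `itertools` function. The names `enumerator` and `pipe` are hypothetical helpers for this example and do not reproduce torchdata's actual `Enumerator` implementation.

```python
import itertools
from functools import partial
from typing import Any, Callable, Iterable, Iterator, Tuple


def enumerator(items: Iterable[Any], start: int = 0) -> Iterator[Tuple[int, Any]]:
    # What would otherwise live inside a class's __iter__ method.
    index = start
    for item in items:
        yield index, item
        index += 1


def pipe(source: Iterable[Any], *stages: Callable[[Iterable[Any]], Iterable[Any]]) -> Iterable[Any]:
    # Hypothetical helper: thread the source through Iterable -> Iterable stages.
    for stage in stages:
        source = stage(source)
    return source


# Curry the extra parameter, then reuse an itertools function as a stage.
result = pipe(
    ["a", "b", "c", "d"],
    partial(enumerator, start=1),
    lambda it: itertools.islice(it, 3),
)
```

Nothing is computed until `result` is iterated, so the stages stay lazy, just as they would inside a datapipe.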
Alternatives
An equivalent of `.pipe` can already be written with the existing API, but I believe this would be a lot less nice than the above.
Additional context
No response