Closed austinmw closed 1 year ago
Please try to avoid lambda functions in the pipeline. Lambdas can't be pickled, so a pipeline that contains them can't be used with multiprocessing.
You can replace your lambda functions with named functions:
def map_fn1(t):
    return (int(t[0]), " ".join(t[1:]))

def map_fn2(batch):
    ...
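To illustrate why the named functions matter, here is a minimal sketch using only the standard library `pickle` module, with no TorchData involved. `is_picklable` and `lambda_fn` are hypothetical names made up for this illustration; `map_fn1` is the function suggested above. The assumption is that worker processes need to pickle the functions in the pipeline, which is the failure mode described in this thread.

```python
import pickle


def map_fn1(t):
    # Module-level function: pickle stores a reference to it by module and name.
    return (int(t[0]), " ".join(t[1:]))


def is_picklable(obj):
    """Hypothetical helper: return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Same logic as map_fn1, but written as a lambda. Its name is "<lambda>",
# which pickle cannot look up again, so serialization fails.
lambda_fn = lambda t: (int(t[0]), " ".join(t[1:]))

named_ok = is_picklable(map_fn1)
lambda_ok = is_picklable(lambda_fn)

if __name__ == "__main__":
    print("named function picklable:", named_ok)
    print("lambda picklable:", lambda_ok)
```

When run as a script, the named function should report as picklable and the lambda should not, even though both compute the same result.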
@ejguan But I am literally copying the "sanity check" example directly from TorchData's GitHub homepage..
I guess that is not an up-to-date/recommended way to use this library?
Fair point, we should improve the sanity check section. cc: @NivekT since you are working on the README right now, we might remove the sanity check section and ask users to refer to the examples and online docs instead.
For reference, we have a folder of examples at https://github.com/pytorch/data/tree/main/examples. Our online docs have a number of examples as well: https://pytorch.org/data/main/
Thanks, I will refer to those examples!
Added the fix to #954
Closing, as the sanity check has been removed from the README.
🐛 Describe the bug
When I run:
I get the following:
Versions
Python 3.8.0
torch 2.0.0.dev20230119+cu116
torchdata 0.6.0.dev20230119