PaddlePaddle / Paddle

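PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 飞桨 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)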
http://www.paddlepaddle.org/
Apache License 2.0
22.24k stars 5.59k forks

DataLoader raise an error when set num_workers > 0 #44145

Open Fordacre opened 2 years ago

Fordacre commented 2 years ago

Please ask your question

Using DataLoader with a custom IterableDataset and num_workers > 0 raises the error shown in the screenshot. [image: error screenshot] It works fine when num_workers = 0, or when loading a custom paddle.io.Dataset instead. Paddle version: 2.2.2

paddle-bot-old[bot] commented 2 years ago

Hi! We've received your issue; please be patient while you wait for a response. We will arrange for technicians to answer your question as soon as possible. Please check again that you have provided a clear problem description, reproduction code, environment and version details, and the error message. You may also look for an answer in the official API documentation, the FAQ, past GitHub Issues, and the AI community. Have a nice day!

rainyfly commented 2 years ago

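Hello, this problem is usually caused by an issue with the data that the Dataset returns. Please check each item your dataset returns and verify that every one of them can be converted to a tensor with paddle.to_tensor.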
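A minimal sketch of the per-field check suggested above. The helper `find_unconvertible_fields`, the stand-in converter `numeric_only`, and the sample dictionary are hypothetical names made up for illustration; the stand-in only exists so the sketch runs without Paddle installed, and in a real check you would pass `paddle.to_tensor` as the converter.

```python
from typing import Any, Callable, Dict, List


def find_unconvertible_fields(
    sample: Dict[str, Any], convert: Callable[[Any], Any]
) -> List[str]:
    """Return the names of fields for which convert(value) raises."""
    bad = []
    for name, value in sample.items():
        try:
            convert(value)
        except Exception:
            bad.append(name)
    return bad


def numeric_only(value: Any) -> Any:
    """Stand-in converter (assumption): accepts numbers and nested lists of numbers."""
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, (list, tuple)):
        return [numeric_only(v) for v in value]
    raise TypeError(f"cannot convert {type(value).__name__}")


# Hypothetical sample, shaped like one item a dataset might return.
sample = {"image": [[0.1, 0.2], [0.3, 0.4]], "label": 1, "path": "img_0001.png"}
bad_fields = find_unconvertible_fields(sample, numeric_only)
print(bad_fields)
```

Running the check on a handful of samples pulled directly from the dataset, outside any DataLoader, keeps worker processes out of the picture while narrowing down which field is at fault.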

Fordacre commented 2 years ago

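@rainyfly In my tests, converting the data to tensors works without any problem. I'm also puzzled why there is no error when I use paddle.io.Dataset instead of paddle.io.IterableDataset. What is the difference between the two?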

rainyfly commented 2 years ago
[image: not displayed]

It is probably because IterableDataset does not fetch its data through __getitem__.
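To illustrate the difference in access pattern, here is a framework-free sketch. `MapStyle` and `IterableStyle` are hypothetical stand-ins for subclasses of paddle.io.Dataset and paddle.io.IterableDataset, not the Paddle classes themselves: the first is read by index, the second only by iterating from the start.

```python
class MapStyle:
    """Map-style dataset: random access through __getitem__ and __len__."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return len(self.data)


class IterableStyle:
    """Iterable-style dataset: sequential access through __iter__ only."""

    def __init__(self, data):
        self.data = list(data)

    def __iter__(self):
        return iter(self.data)


map_ds = MapStyle(range(6))
iter_ds = IterableStyle(range(6))

# A loader can request arbitrary indices from a map-style dataset,
# so splitting the indices across workers is straightforward.
picked = [map_ds[i] for i in (4, 0, 2)]

# An iterable-style dataset has no index to split; it can only be streamed.
streamed = list(iter_ds)
has_getitem = hasattr(iter_ds, "__getitem__")
print(picked, streamed, has_getitem)
```

Because there are no indices to hand out, each worker process iterates its own copy of an iterable-style dataset, which is why the multi-worker code path differs from the map-style one.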

Fordacre commented 2 years ago

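I can't see the image; the upload seems to have failed. @rainyfly So is there a way to use IterableDataset with num_workers > 0? According to the documentation, it should be usable this way.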
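On the documentation point: the usual pattern for an iterable-style dataset with several workers is for each worker to read only its own slice, since otherwise every worker replays the whole stream. Below is a minimal, Paddle-free sketch of that slicing. `ShardedRange` and `shard` are hypothetical names; in Paddle, the worker id and worker count would be expected to come from `paddle.io.get_worker_info()` inside `__iter__`. It illustrates the multi-worker pattern only and does not diagnose the error from the missing screenshot.

```python
import math


class ShardedRange:
    """Iterable-style dataset over the integers in [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def shard(self, worker_id, num_workers):
        """Return the contiguous slice that one worker should read."""
        per_worker = math.ceil((self.end - self.start) / num_workers)
        lo = min(self.start + worker_id * per_worker, self.end)
        hi = min(lo + per_worker, self.end)
        return range(lo, hi)

    def __iter__(self):
        # Single-process case: read everything. In a Paddle worker process,
        # the id and num_workers would be taken from the worker info instead
        # (assumption: this sketch does not call Paddle).
        return iter(self.shard(0, 1))


dataset = ShardedRange(0, 10)
num_workers = 3
shards = [list(dataset.shard(w, num_workers)) for w in range(num_workers)]
print(shards)
```

The shards are disjoint and together cover the full range, so no sample is duplicated or dropped when the workers' outputs are combined.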