Open Elena-Qiu opened 3 years ago
The code is here: https://github.com/intel-analytics/analytics-zoo/pull/3918 @qiuxin2012 Do we support iterator as data?
Why not dataloader? Iterator is not supported.
I have changed the iterator to dataloader and it runs successfully with two workers using torch_distributed backend. But the bigdl backend still gets the same error as above.
@qiuxin2012 Take a look.
Use something like train_loader_creator instead of train_iter. train_iter is too big to be pickled.
I tried with train_loader_creator and test_loader_creator but still got the same error. From the error information, it seems that the error occurs when running "self.model = TorchModel.from_pytorch(model)" and calling "o41.createTorchModel". Maybe it has something to do with model creation? @qiuxin2012
It looks like your model is too big to be pickled. Please use a model creator function instead of a model instance.
https://github.com/intel-analytics/analytics-zoo/blob/ee24ffcc17458490da1a42bc6ad6e5f881d41106/pyzoo/zoo/orca/learn/pytorch/estimator.py#L275 The estimator shouldn't create the model instance here. Instead, we should pickle a model creator function to bytes and pass it to the executor (pickle only supports 155MB of data). @Le-Zheng
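To illustrate why pickling a creator instead of an instance keeps the payload small, here is a minimal standard-library sketch. It is not the estimator's actual code: the "model" is just a block of bytes standing in for the weights, `N_BYTES` is a made-up size, and the creator is a `functools.partial` over a builtin rather than a real `model_creator(config)` function.

```python
import functools
import pickle

# Hypothetical stand-in for a large model: 8 MB of zeroed "weights".
N_BYTES = 8 * 1024 * 1024

# Option 1: build the model on the driver and pickle the instance itself.
# The payload carries every byte of the weights.
model_instance = bytes(N_BYTES)
instance_payload = pickle.dumps(model_instance)

# Option 2: pickle only a recipe for building the model.
# A partial over a builtin is pickled by reference plus its arguments,
# so the payload stays tiny no matter how large the model is.
model_creator = functools.partial(bytes, N_BYTES)
creator_payload = pickle.dumps(model_creator)

# On the worker side: unpickle the recipe and build the model locally.
rebuilt_model = pickle.loads(creator_payload)()

print("instance payload bytes:", len(instance_payload))
print("creator payload bytes:", len(creator_payload))
print("rebuilt model matches:", rebuilt_model == model_instance)
```

The point of the sketch is only the size difference: the instance payload grows with the model, while the creator payload does not, so the model must not be instantiated before serialization for this to help.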
Next step:
I tried with the latest estimator.py but I still got the same error as before. The latest estimator.py contains:
    if isinstance(model, types.FunctionType):
        def model_creator(self):
            return model(self.config)
        model = model_creator(self)
Take a look @qiuxin2012 @Le-Zheng
When running the orca learn sentiment example using the bigdl backend, I get the following error:
The bigdl backend is implemented as follows: