airctic / icevision

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
https://airctic.github.io/icevision/
Apache License 2.0
848 stars 150 forks source link

Fastai learner.export() fails as icevision dataset/dataloader does not have new_empty() method #704

Closed adamfarquhar closed 2 years ago

adamfarquhar commented 3 years ago

🐛 Bug

Describe the bug The fastai leaner's export routine requires a new_empty() method on the dataset (doc: Create a new empty version of the self, keeping only the transforms). The fastai implementation accesses the fastai dataset's tls attribute (transformed lists), which has a new_empty() method.

Not sure how best to fix this in icevision. The icevision dataloader has a new method, but this retains the records.

To Reproduce Create a fastai learner with any icevision model and dataset. Try to learn.export().

Expected behavior learn.export() should save the model without data.

Additional context

learn = faster_rcnn.fastai.learner(
    dls=[train_dl, valid_dl],
    model=model)
learn.export()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-fa5b61306ef3> in <module>
----> 1 learn.export()

~/anaconda3/envs/dlm/lib/python3.8/site-packages/fastai/learner.py in export(self, fname, pickle_protocol)
    536     self._end_cleanup()
    537     old_dbunch = self.dls
--> 538     self.dls = self.dls.new_empty()
    539     state = self.opt.state_dict() if self.opt is not None else None
    540     self.opt = None

~/anaconda3/envs/dlm/lib/python3.8/site-packages/fastai/data/core.py in new_empty(self)
    144     def __getitem__(self, i): return self.loaders[i]
    145     def new_empty(self):
--> 146         loaders = [dl.new(dl.dataset.new_empty()) for dl in self.loaders]
    147         return type(self)(*loaders, path=self.path, device=self.device)
    148 

~/anaconda3/envs/dlm/lib/python3.8/site-packages/fastai/data/core.py in <listcomp>(.0)
    144     def __getitem__(self, i): return self.loaders[i]
    145     def new_empty(self):
--> 146         loaders = [dl.new(dl.dataset.new_empty()) for dl in self.loaders]
    147         return type(self)(*loaders, path=self.path, device=self.device)
    148 
lgvaz commented 3 years ago

We can patch a new_empty method to our dataloaders inside convert_dataloader_to_fastai.py, what do we need to return in this function?

adamfarquhar commented 3 years ago

I am not completely sure, but I think it should return a copy of the dataloader with an empty dataset. I.e., it is empty except that it retains the transforms. I guess that the idea is that the learner is being exported for inference. It should include the model and a variety of params, but it should not include the data that was present in the training environment. This is implemented by creating dataloaders that are empty of data.

adamfarquhar commented 3 years ago

Even if we temporarily define null values of dataset.new_empty (as lambda: None), we stumble on the inner function splitter in the fastai adaptor. The inner function can't be pickled using the default module. In fastai 2.1.4, the the pickle_module is not available as an argument. This has subsequently been updated. But then providing 'pickle_module=dill' stumbles on PicklingError: Can't pickle typing.AbstractContextManager: it's not found as typing.AbstractContextManager.

So, the best solution will be to define a proper new_empty and then eliminate the inner functions (which don't in this case add any value and could be replaced with regular functions).

dnth commented 2 years ago

Hi everyone.. there is still no solution for this at the moment right? What would be your recommendation as a workaround? Thanks!

FraPochetti commented 2 years ago

@ai-fast-track do we need this one at all? I believe you have figured out the saving+reloading model thing.

FraPochetti commented 2 years ago

Closing as saving+reloading is taken care of here.