Closed AlanAboudib closed 3 years ago
Hi Alan, I had already started some work on this, but was later notified by Nilansh that the PyGrid nodes weren't working properly, so I dropped it temporarily. I'd like to take this one again
It is yours @hershd23
@AlanAboudib, @Nilanshrajput I have already tried pickle and dill, and both throw an error saying the pipeline object is of a type they can't save. Any fixes?
Error:
Can't pickle <class 'syft.frameworks.torch.hook.hook.TorchHook._hook_worker_methods.<locals>.Torch'>: it's not found as syft.frameworks.torch.hook.hook.TorchHook._hook_worker_methods.<locals>.Torch
@hershd23 check the `simplify` function: save what that function returns, and when loading, use the corresponding `detail` function for each object you are trying to restore.
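The `<locals>` in the error message points at the cause: pickle stores classes by importable name, so a class created inside a function cannot be saved directly. Below is a minimal sketch of the suggested workaround using a toy class, not syft itself. The `simplify` and `detail` methods here are illustrative stand-ins and are not PySyft's actual signatures.

```python
import pickle


def make_hooked_class():
    # A class created inside a function has "<locals>" in its qualified name,
    # so pickle cannot look it up by import path when saving an instance.
    class Torch:
        def __init__(self, value):
            self.value = value

        @staticmethod
        def simplify(obj):
            # Hypothetical: reduce the object to plain built-in types.
            return ("Torch", obj.value)

        @staticmethod
        def detail(simple):
            # Hypothetical: rebuild the object from its simplified form.
            _, value = simple
            return Torch(value)

    return Torch


Torch = make_hooked_class()
obj = Torch(42)

try:
    pickle.dumps(obj)
    direct_pickle_ok = True
except (pickle.PicklingError, AttributeError):
    # The exact exception type depends on the Python version.
    direct_pickle_ok = False

# The simplified form is a plain tuple, which pickles without trouble.
payload = pickle.dumps(Torch.simplify(obj))
restored = Torch.detail(pickle.loads(payload))
```

The point of the design is that only built-in types ever reach the serializer, so the choice between pickle, dill, or any other format stops mattering.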
@hershd23 the `Pipeline` object does not contain the states of the individual pipes. You should cache it in the same way it is deployed: the simplified `Pipeline` object plus the simplified `State` objects belonging to that pipeline. If the local worker does not have the necessary permissions to download a state (due to the restricted access property of the `State` object), then that state is not cached.
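The caching layout described in this comment could be sketched roughly as follows. Everything here is an assumption made for illustration: the `cache_pipeline` helper, the file names, and the convention that a restricted state raises `PermissionError` on download are not taken from SyferText or PyGrid.

```python
import pickle
import tempfile
from pathlib import Path


def cache_pipeline(cache_dir, simple_pipeline, states):
    """Write the simplified pipeline plus every downloadable state to disk.

    `states` maps a state name to a zero-argument callable that returns the
    simplified state, or raises PermissionError when access is restricted.
    (Hypothetical convention, chosen only for this sketch.)
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / "pipeline.pkl").write_bytes(pickle.dumps(simple_pipeline))

    cached = []
    for name, download in states.items():
        try:
            simple_state = download()
        except PermissionError:
            continue  # restricted state: leave it out of the cache
        (cache_dir / f"{name}.state.pkl").write_bytes(pickle.dumps(simple_state))
        cached.append(name)
    return cached


def _restricted():
    raise PermissionError("access denied")


with tempfile.TemporaryDirectory() as tmp:
    cached_names = cache_pipeline(
        tmp,
        simple_pipeline=("pipeline", "syfertext_sentiment"),
        states={"tokenizer": lambda: ("state", [1, 2, 3]), "classifier": _restricted},
    )
    files = sorted(p.name for p in Path(tmp).iterdir())
```

Keeping one file per state means a state that could not be downloaded is simply absent from the cache directory, and the loader can tell which pipes still need to be fetched.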
Feature Description
When a pipeline is loaded from PyGrid, we should be able to save the states of the pipe components to which the local worker has access.
Saving can be done with a call to `nlp.save(destination = 'local')` or `nlp.save()`, which uses `destination = 'local'` as the default. The reason we introduce the `destination` argument is that I think we will need other data lake storage types in the future, such as `s3`, to back up pipelines.

Loading is done with `nlp.load(pipeline_name = 'syfertext_sentiment', cache = Union[None, 'local', 's3'])`.

For this issue, only the `'local'` and `None` implementations are required.

Notice that, for this to work properly, each pipeline should have a version and a timestamp in its tags. If the cached version and timestamp do not correspond to the PyGrid ones, the cache is not used.
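The destination check and the version/timestamp rule could look something like the sketch below. The helper names `check_destination` and `cache_is_fresh`, and the idea that tags are a plain dict with `version` and `timestamp` keys, are assumptions made for this example rather than part of the SyferText API.

```python
from typing import Optional

# Only 'local' is in scope for this issue; 's3' is mentioned as a future option.
SUPPORTED_DESTINATIONS = {"local"}


def check_destination(destination: str = "local") -> str:
    """Validate the `destination` argument of a hypothetical save() call."""
    if destination not in SUPPORTED_DESTINATIONS:
        raise NotImplementedError(f"destination {destination!r} is not supported yet")
    return destination


def cache_is_fresh(cached_tags: Optional[dict], remote_tags: dict) -> bool:
    """Return True only if the cached version and timestamp match the remote ones."""
    if cached_tags is None:
        return False  # nothing cached yet
    for key in ("version", "timestamp"):
        if key not in cached_tags or key not in remote_tags:
            return False  # a pipeline without both tags cannot be validated
        if cached_tags[key] != remote_tags[key]:
            return False  # stale cache: fall back to loading from PyGrid
    return True


remote = {"version": "0.1.0", "timestamp": "2020-06-01T12:00:00"}
same = {"version": "0.1.0", "timestamp": "2020-06-01T12:00:00"}
older = {"version": "0.1.0", "timestamp": "2020-05-01T09:30:00"}
```

With these values, `cache_is_fresh(same, remote)` holds while `cache_is_fresh(older, remote)` does not, so an outdated cache would be ignored and the pipeline fetched again.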
Is your feature request related to a problem?
Pipeline loading might be time consuming. It would be impractical to load the same pipeline several times during testing.
Additional Context
Merge to the `syfertext_0.1.0` branch.