Closed samx81 closed 3 years ago
It looks like one of the dependencies is breaking it.
Could you run this to check their versions:
>>> import plkit, cmdy, varname
>>> print(plkit.__version__, cmdy.__version__, varname.__version__)
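If one of these packages does not expose `__version__`, a possible alternative is to read the installed versions from package metadata with the standard library's `importlib.metadata` (Python 3.8+). This is only a sketch; the helper name `installed_versions` is made up for illustration and is not part of plkit.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["plkit", "cmdy", "varname"]).items():
        print(name, version or "not installed")
```

Running it in the same environment that SGERunner uses should show whether an outdated dependency is involved.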
As for the asteroid model, it is possible to run it with SGERunner, but you will need to wrap things a little:
system = System(model, optimizer, loss, train_loader, val_loader)
has to be wrapped into the structure in the boilerplate. Instead of initializing the model and data in advance, you may need to use plkit.DataModule and plkit.Model to wrap them, and use a universal config. The runner will instantiate the model and data objects.
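To illustrate the general shape only, here is a plain-Python sketch of that pattern. It is not plkit's actual API: `ToyData`, `ToyModel`, and `toy_run` are hypothetical stand-ins. The idea is that the runner receives the classes plus one shared config and constructs the objects itself, presumably so that a submitted job can build them on its own.

```python
class ToyData:
    """Hypothetical stand-in for a data wrapper, built from the config."""

    def __init__(self, config):
        self.batch_size = config["batch_size"]


class ToyModel:
    """Hypothetical stand-in for a model wrapper, built from the config."""

    def __init__(self, config):
        self.lr = config["lr"]


def toy_run(config, data_class, model_class):
    """Hypothetical runner: instantiates data and model from the shared config."""
    data = data_class(config)
    model = model_class(config)
    return data, model


# One config drives both objects; nothing is instantiated before the runner is called.
config = {"batch_size": 32, "lr": 1e-3}
data, model = toy_run(config, ToyData, ToyModel)
```

For asteroid, this would mean moving the construction of the model, optimizer, loss, and loaders out of the top-level script and into the wrapped classes.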
@samx81
I believe this issue is solved by the latest varname.
Could you try again after upgrading varname with pip install -U varname?
Thanks for the fix!
SGERunner now runs properly, but I found that if I use GPUs to run a task, regardless of whether I use SGERunner, it returns Segmentation fault (core dumped) at the end of the program, though it doesn't seem to affect the result.
Finally, thanks for the advice on running the Asteroid package. I'll see if I can make it fit in the wrapper.
Hi, I'm trying to get SGERunner working with the minimal example, which runs the MNIST task. With LocalRunner it runs successfully, but I've had no luck with SGERunner.
Do I need to pass additional arguments to make it work?
Error Messages
Process
plkit.run()
Expected result
Jobs are successfully submitted to the grid engine
Possible Fix
I've dug into the "executing" package and found that Source.executing(frame) returns a result with node == None. Maybe I need extra configuration on my lab servers to make it work?
Other Questions
Since plkit is a wrapper for pytorch-lightning, is it possible to train a model from a pytorch-lightning based toolkit (like https://github.com/asteroid-team/asteroid/) with plkit's SGERunner to speed up the process?