jamesdolezal / slideflow

Deep learning library for digital pathology, with both Tensorflow and PyTorch support.
https://slideflow.dev
GNU General Public License v3.0
231 stars, 38 forks

TypeError: __init__() got an unexpected keyword argument 'multi_gpu' #298

Closed ltzg closed 1 year ago

ltzg commented 1 year ago

Training on multiple GPUs fails with the TensorFlow backend; training on a single GPU works fine.

hp = sf.ModelParams(
    tile_px=256,
    tile_um=256,
    model='densenet',
    batch_size=32,
    epochs=[10],
    multi_gpu=True
)

Traceback (most recent call last):
  File "/home/xl/下载/pycharm-community-2023.1.2/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/model/tensorflow.py", line 148, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'multi_gpu'

TensorFlow==2.10

jamesdolezal commented 1 year ago

Hi Itzg,

The multi_gpu=True argument should be used for Project.train(), not ModelParams:

hp = sf.ModelParams(...)
Project.train(..., multi_gpu=True)
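
To illustrate why the original call fails, here is a minimal, self-contained sketch using hypothetical stand-in classes (`DemoParams` and `DemoProject` are illustrative names, not part of slideflow). It assumes only what the traceback and the reply above show: the hyperparameter constructor rejects keywords it does not define, while the training entry point accepts `multi_gpu`.

```python
class DemoParams:
    """Hypothetical stand-in for a hyperparameter container (not slideflow's class)."""

    def __init__(self, tile_px, tile_um, model, batch_size, epochs):
        self.tile_px = tile_px
        self.tile_um = tile_um
        self.model = model
        self.batch_size = batch_size
        self.epochs = epochs


class DemoProject:
    """Hypothetical stand-in for a project object with a train() entry point."""

    def train(self, params, multi_gpu=False):
        # multi_gpu is a training-time option, not a model hyperparameter.
        mode = "multi-GPU" if multi_gpu else "single-GPU"
        return f"training {params.model} in {mode} mode"


settings = dict(tile_px=256, tile_um=256, model="densenet",
                batch_size=32, epochs=[10])

# Passing multi_gpu to the constructor reproduces the same kind of TypeError.
constructor_error = None
try:
    DemoParams(**settings, multi_gpu=True)
except TypeError as err:
    constructor_error = str(err)

# Passing it to train() instead is accepted.
hp = DemoParams(**settings)
result = DemoProject().train(hp, multi_gpu=True)
print(constructor_error)
print(result)
```

The design point is that hardware and distribution settings belong to the training call, so the same hyperparameter object can be reused across single-GPU and multi-GPU runs.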

ltzg commented 1 year ago

Thank you very much for your reply. I got a new error:

Traceback (most recent call last):
  File "/home/xl/下载/pycharm-community-2023.1.2/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/project.py", line 3426, in train
    self._train_hp(
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/project.py", line 713, in _train_hp
    self._train_split(dataset, hp, val_settings, s_args)
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/project.py", line 937, in _train_split
    project_utils._train_worker(
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/project_utils.py", line 147, in _train_worker
    results = trainer.train(train_dts, val_dts, **training_kw)
  File "/home/xl/anaconda3/envs/sftf/lib/python3.9/site-packages/slideflow/model/tensorflow.py", line 1763, in train
    atexit.register(strategy._extended._collective_ops._pool.close)
AttributeError: 'CollectiveAllReduce' object has no attribute '_pool'

jamesdolezal commented 1 year ago

The above error looks like a multiprocessing issue. Restarting the development environment or the Python kernel should fix the problem.
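
For context on the traceback: the `AttributeError` is raised on the `atexit.register(...)` line itself, because Python evaluates the `..._pool.close` attribute chain before `register` is called. Below is a minimal sketch of that mechanic using a hypothetical stand-in object (`DemoCollectiveOps` is not a TensorFlow class). The `getattr` guard is an assumption for illustration only, not slideflow's actual code or fix.

```python
import atexit


class DemoCollectiveOps:
    """Hypothetical stand-in for an object that has no `_pool` attribute."""


ops = DemoCollectiveOps()

# The argument expression is evaluated before atexit.register() runs,
# so a missing `_pool` fails immediately, at registration time.
try:
    atexit.register(ops._pool.close)
    registration_error = None
except AttributeError as err:
    registration_error = str(err)

# Defensive variant: register the cleanup hook only if the pool exists.
pool = getattr(ops, "_pool", None)
registered = pool is not None
if registered:
    atexit.register(pool.close)

print(registration_error)
print(registered)
```

This only shows where the exception originates; it does not explain why the attribute is missing in a given session.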