dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Memory leak with Client object #2480

Open mpeleshenko opened 5 years ago

mpeleshenko commented 5 years ago

For some reason when I try to close and delete a Client object that uses a LocalCluster, I still see that it exists in memory even after gc.collect(). Am I correctly cleaning up the client object?

import gc
import platform
import sys
import weakref

import dask
import distributed
import tornado
from distributed import LocalCluster, Client

def main():
    cluster = LocalCluster(
        diagnostics_port=None,
        ip='127.0.0.1',
        n_workers=1,
        scheduler_port=0,
        threads_per_worker=1,
    )
    client = Client(
        address=cluster,
    )
    wr = weakref.ref(client)
    client.close()
    del client
    gc.collect()
    assert wr() is None

if __name__ == '__main__':
    print('OS:', platform.platform())
    print('Python Version:', sys.version)
    print('Dask Version:', dask.__version__)
    print('Distributed Version:', distributed.__version__)
    print('Tornado Version:', tornado.version)
    main()

Output

OS: Windows-7-6.1.7601-SP1
Python Version: 3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:05:27) [MSC v.1900 64 bit (AMD64)]
Dask Version: 1.0.0
Distributed Version: 1.25.2
Tornado Version: 5.1.1
distributed.comm.tcp - WARNING - Closing dangling stream in <TCP local=tcp://127.0.0.1:55130 remote=tcp://127.0.0.1:55112>
Traceback (most recent call last):
  File "<file>", line 34, in <module>
    main()
  File "<file>", line 26, in main
    assert wr() is None
AssertionError

Process finished with exit code 1
mrocklin commented 5 years ago

You might have to close the cluster as well?

But generally I'm not surprised; there may well be a coroutine somewhere, still running for a while, that holds a reference.

You might be able to help by tracking this down with further use of the gc module, if you felt so inclined.
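A minimal sketch of the gc-based approach suggested above, using a toy `Leaky` class (hypothetical, not part of distributed) standing in for the `Client`: a stray strong reference keeps the object alive past `del` and `gc.collect()`, and `gc.get_referrers` reveals who is holding it.

```python
import gc
import weakref

class Leaky:
    """Hypothetical stand-in for an object we expect to be collected."""
    pass

holder = []                 # simulates a stray reference, e.g. a pending coroutine
obj = Leaky()
holder.append(obj)
wr = weakref.ref(obj)
del obj
gc.collect()

# The object survived collection: the weakref still resolves.
survivor = wr()
assert survivor is not None

# gc.get_referrers reveals what is keeping it alive; here, the holder list.
referrers = gc.get_referrers(survivor)
assert holder in referrers

# Once the stray reference is dropped, collection succeeds.
del survivor, referrers
holder.clear()
gc.collect()
assert wr() is None
```

Applying the same technique to the repro above would mean calling `gc.get_referrers(wr())` after the failed assertion and inspecting the types of the referrers to find which distributed-internal object still points at the closed client.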
