dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Include NumPy BLAS/LAPACK info in client.get_versions() #1827

Open jakirkham opened 6 years ago

jakirkham commented 6 years ago

At the risk of overloading client.get_versions() with info, it would be handy to be able to check the NumPy BLAS/LAPACK linkage in here. This can be really helpful when debugging a slow computation or a very strange segfault that might be BLAS or LAPACK related. One way to get at this info is numpy.__config__.show(), but that might be too heavy for client.get_versions(). Open to other ways to include this info if there are suggestions.

mrocklin commented 6 years ago

I think it's reasonable to include more things. It's fairly cheap. We might also keep get_versions as it is, but make a larger get_info function that has a wider scope

rbubley commented 6 years ago

Where the task is largely about gathering information from workers, I was wondering if the right approach might be to modify Client.run() to be able to return values (or futures). Deciding what information to harvest from the workers would then be in the control of the clients, not reliant on changes to distributed.

mrocklin commented 6 years ago

Yes, that's doable today from user-space and a fine solution.

One reason why get_versions doesn't take this approach (it used to) is that it also gathers information from the scheduler, where we try to avoid depending on pickle. I suspect that in the future, using pickle may be turned off by default in the scheduler.
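
The user-space approach mentioned above can be sketched roughly as follows. This is only an illustration, not how get_versions works: `numpy_build_info` and `gather_numpy_build_info` are hypothetical helper names, and `client` is assumed to be an already connected `distributed.Client`. The sketch captures whatever `numpy.show_config()` prints on each worker and relies on `Client.run` returning a dict keyed by worker address. It covers workers only, not the scheduler.

```python
import contextlib
import io


def numpy_build_info():
    """Return the text printed by numpy.show_config() as a string.

    Hypothetical helper: the import happens inside the function so that
    it runs in the worker's own environment.
    """
    try:
        import numpy
    except ImportError:
        return "numpy is not installed"
    buffer = io.StringIO()
    # show_config() prints its report, so capture stdout instead of
    # relying on a return value.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


def gather_numpy_build_info(client):
    """Run numpy_build_info on every worker.

    Assumes `client` is a connected distributed.Client; Client.run
    returns a dict mapping each worker address to the function's result.
    """
    return client.run(numpy_build_info)
```

With a running cluster, `gather_numpy_build_info(client)` would give one text report per worker, which can then be compared across workers to spot a mismatched BLAS/LAPACK build.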

lalitparate commented 5 years ago

Hi, this is my first time contributing to open source. Can I work on it?

jhamman commented 5 years ago

@lalitparate - yes, dask is a community-driven open-source project. As such, anyone is welcome to work on anything. Let us know if you need help.

moshiba commented 4 years ago

Are we still aiming to show this worker linkage info in client.get_versions()?

I think it's reasonable to include more things. It's fairly cheap. We might also keep get_versions as it is, but make a larger get_info function that has a wider scope

Or should I build get_info() by wrapping client.run()?

quasiben commented 4 years ago

As others have commented, adding to get_versions seems to be a supported idea. You might want to look at https://github.com/dask/distributed/pull/3567 as it has some updates to get_versions as well as tests

moshiba commented 4 years ago

As others have commented, adding to get_versions seems to be a supported idea. You might want to look at #3567 as it has some updates to get_versions as well as tests

Sure, thanks.

May I ask what exactly we want to show in get_versions()? NumPy has many possible BLAS/LAPACK linking options (seven currently), so I'm not sure that showing every piece of build info from numpy.show_config() is the best idea.

moshiba commented 4 years ago

Another question is how we should fit the various library linkage info into client.get_versions(). It seems to me that the current output layout is meant to show version info alone, not a list of sublists about a package. Packing output like this into get_versions() for every worker seems suboptimal:

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

quasiben commented 4 years ago

@HsuanTingLu do you have thoughts on a more optimal layout?

moshiba commented 4 years ago

No, I don't have one, so I'll probably add it anywhere you guys see fit.

Back to the second question, how should the info be fitted into client.get_versions()? I'm thinking about adding a numpy-config sublist under host, or maybe somewhere under package::numpy?

quasiben commented 4 years ago

I am +1 on package::numpy. I understand this to mean something like:

 'packages': {'numpy': {'blas_opt_info': {}}}

Is that right?

moshiba commented 4 years ago

Yeah, something like this: 'packages': {'numpy': '1.18.2', 'blas_opt_info': {}, 'lapack_opt_info': {}}
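
To make the proposed layout concrete, here is a minimal sketch of how the text report quoted earlier in the thread could be folded into that shape. It is not code from distributed: `parse_legacy_show_config` and `numpy_entry` are hypothetical names, and the parser assumes the `section:` / `key = value` / `NOT AVAILABLE` format shown above, which other NumPy versions may not print.

```python
import ast


def parse_legacy_show_config(text):
    """Parse 'section:' / 'key = value' report text into a nested dict.

    Sections reported as NOT AVAILABLE map to None.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw[0].isspace() and line.endswith(":"):
            # Unindented "name:" line starts a new section.
            current = line[:-1]
            sections[current] = {}
        elif current is None:
            continue
        elif line == "NOT AVAILABLE":
            sections[current] = None
        elif " = " in line and sections[current] is not None:
            key, value = line.split(" = ", 1)
            try:
                # Lists and tuples in the report are Python literals.
                sections[current][key] = ast.literal_eval(value)
            except (ValueError, SyntaxError):
                # Bare words such as "c" are kept as plain strings.
                sections[current][key] = value
    return sections


def numpy_entry(version, sections, keep=("blas_opt_info", "lapack_opt_info")):
    """Build the 'packages' layout discussed above (keys are assumptions)."""
    entry = {"numpy": version}
    for name in keep:
        # Missing or unavailable sections become empty dicts.
        entry[name] = sections.get(name) or {}
    return {"packages": entry}
```

Keeping only the two `*_opt_info` sections is one possible answer to the earlier question about what to show; the `keep` parameter is there so a different selection can be tried without changing the parser.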

GenevieveBuckley commented 3 years ago

@HsuanTingLu do you still want to work on this? Did the comments from Ben answer all your questions?

moshiba commented 3 years ago

@GenevieveBuckley I have a few commits lying around; I think I'll need a few weeks to put them together.

GenevieveBuckley commented 3 years ago

Sounds great, thank you!