dask / community

For general discussion and community planning. Discussion issues welcome.

Dask Demo Day 2024-01-18 #359

Closed jrbourbeau closed 5 months ago

jrbourbeau commented 6 months ago

When

Thursday, January 18, at 10am US CT (meeting invite below and also on the Dask calendar)

Agenda

Meeting Invite

Join Zoom Meeting https://us06web.zoom.us/j/89383035703?pwd=WkRJSzNnRTh4T2R1ZjJuVVdJWlMxQT09

jrbourbeau commented 6 months ago

@rjzamora in last month's maintainers meeting you mentioned talking about TPC-H on GPUs -- does the January demo day work for you?

jrbourbeau commented 5 months ago

@cisaacstern @jacobtomlinson just wanted to check in to see if you're still good to go for tomorrow

@rjzamora could you talk for a handful of minutes about TPC-H on GPUs?

cisaacstern commented 5 months ago

@jrbourbeau confirmed for me!

@jacobtomlinson I will finish our presentation notebook today 😊

rjzamora commented 5 months ago

> could you talk for a handful of minutes about TPC-H on GPUs?

I can try to pull something together, but I'd rather not present this month since I would just be re-sharing old plots from the webinar... Dask-expr no longer works with cudf well enough to run TPC-H, so benchmarking has been put on the back burner for now.

jrbourbeau commented 5 months ago

Great, thanks @cisaacstern

@rjzamora no worries, just let me know when a good time would be

@mrocklin said offline that he can talk about some recent dask-expr work for backing Dask Array with the new expression system

@scharlottej13 also wanted to talk about some recent work she did on the one billion row challenge (https://www.reddit.com/r/Python/comments/18zi0o5/one_billion_row_challenge/) with Dask

@phofl do you have bandwidth to talk a little about the recent Dask DataFrame work in dask-expr? For example, what does the migration plan for including in mainline dask look like? How is it going right now? Maybe a computation that we couldn't do a month ago due to lack of API coverage, but can now with all the recent updates?

phofl commented 5 months ago

I don't really have bandwidth to prepare anything

jacobtomlinson commented 5 months ago

@cisaacstern that sounds great. I think this is definitely going to be your demo, but I'm happy to help how I can.

@scharlottej13 I really enjoyed your blog post on the billion row challenge. I also had a play around with this code using RAPIDS, and we managed to do it in 18s on a 3070 gaming GPU using Dask + cudf.

scharlottej13 commented 5 months ago

Recording is up on YouTube: https://youtu.be/wkQzVNQdgW0. Thanks everyone!

mrocklin commented 5 months ago

Thoughts on starting to put these on Reddit as they happen?

scharlottej13 commented 5 months ago

> Thoughts on starting to put these on Reddit as they happen?

Sure! Maybe r/Python w/ the intermediate showcase flair? r/bigdata might work too. I can always post in a few spots and see what happens.

mrocklin commented 5 months ago

> I can always post in a few spots and see what happens

+1

scharlottej13 commented 5 months ago

Cool, posted (successfully) on r/bigdata, r/distributedcomputing, and r/Python.

jrbourbeau commented 5 months ago

Thanks all -- see you next month!

Also, here's the notebook @cisaacstern presented https://github.com/cisaacstern/beam-dask-demo/blob/first-pass/demo.ipynb for those who are interested