Closed jrbourbeau closed 5 months ago
@rjzamora in last month's maintainers meeting you mentioned talking about TPC-H on GPUs -- does the January demo day work for you?
@cisaacstern @jacobtomlinson just wanted to check in to see if you're still good to go for tomorrow
@rjzamora could you talk for a handful of minutes about TPC-H on GPUs?
@jrbourbeau confirmed for me!
@jacobtomlinson I will finish our presentation notebook today 😊
> could you talk for a handful of minutes about TPC-H on GPUs?
I can try to pull something together, but I'd rather not present this month since I would just be re-sharing old plots from the webinar... Dask-expr no longer works with cudf well enough to run TPC-H, so benchmarking has been put on the back burner for now.
Great, thanks @cisaacstern
@rjzamora no worries, just let me know when a good time would be
@mrocklin said offline that he can talk about some recent dask-expr work for backing Dask Array with the new expression system
@scharlottej13 also wanted to talk about some recent work she did on the one billion row challenge with Dask
@phofl do you have bandwidth to talk a little about the recent Dask DataFrame work in dask-expr? For example, what does the migration plan for including it in mainline dask look like? How is it going right now? Maybe a computation that we couldn't do a month ago due to lack of API coverage, but can now with all the recent updates?
I don't really have bandwidth to prepare anything
@cisaacstern that sounds great. I think this is definitely going to be your demo, but I'm happy to help how I can.
@scharlottej13 I really enjoyed your blog post on the billion row challenge. I also had a play around with this code using RAPIDS, and we managed to do it in 18s on a 3070 gaming GPU using Dask + cudf.
Recording is up on youtube: https://youtu.be/wkQzVNQdgW0. Thanks everyone!
Thoughts on starting to put these on Reddit as they happen?
> Thoughts on starting to put these on Reddit as they happen?
Sure! Maybe r/Python w/ the intermediate showcase flair? r/bigdata might work too. I can always post in a few spots and see what happens.
> I can always post in a few spots and see what happens
+1
Cool, posted (successfully) on r/bigdata, r/distributedcomputing, and r/python.
Thanks all -- see you next month!
Also, here's the notebook @cisaacstern presented https://github.com/cisaacstern/beam-dask-demo/blob/first-pass/demo.ipynb for those who are interested
When
Thursday, January 18, at 10am US CT (meeting invite below and also on the Dask calendar)
Agenda
DaskRunner
Meeting Invite
Join Zoom Meeting https://us06web.zoom.us/j/89383035703?pwd=WkRJSzNnRTh4T2R1ZjJuVVdJWlMxQT09