LAION-AI / Big-Interleaved-Dataset

Apache License 2.0

Multi Node support #8

Closed: harry-stark closed this issue 1 year ago

harry-stark commented 2 years ago

Starting point: https://gist.github.com/rom1504/67ada3dedbecc113ae2dbdfd9c642d83

harry-stark commented 2 years ago

https://github.com/LLNL/magpie

harry-stark commented 1 year ago

Closing. I used the gist above to set up the Spark cluster, and the Spark session builder code from rom1504's cc2dataset.
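
For anyone landing here later, this is a minimal sketch of what a multi-node session builder could look like, assuming a standalone Spark cluster whose master is already running (for example, one started as in the gist). It is not the code from cc2dataset: the function names, the default resource values, and the `spark://node-0:7077` master URL are all illustrative assumptions to be adjusted for your cluster.

```python
from typing import Dict, Optional


def spark_conf(executor_cores: int = 16, executor_memory: str = "16G") -> Dict[str, str]:
    """Collect Spark properties for the job (values are assumed defaults)."""
    return {
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": executor_memory,
        "spark.driver.memory": executor_memory,
        # Tolerate flaky workers on long-running multi-node jobs.
        "spark.task.maxFailures": "100",
    }


def resolve_master(master_url: Optional[str]) -> str:
    """Use the cluster master if given, otherwise fall back to local mode."""
    return master_url if master_url else "local[*]"


def build_session(master_url: Optional[str] = None,
                  app_name: str = "big-interleaved-dataset",  # hypothetical app name
                  **conf_kwargs):
    """Build (or reuse) a SparkSession pointed at the given master."""
    # Imported lazily so the helpers above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master(resolve_master(master_url))
    for key, value in spark_conf(**conf_kwargs).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

On the driver node this would be called as `build_session("spark://node-0:7077")`, where the host name is a placeholder for your master node. Calling `build_session()` with no argument runs in local mode, which is handy for checking the pipeline on a single machine before going multi-node.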