hpc-carpentry / hpc-parallel-novice

Introductory material on parallelization using python with a focus on HPC platforms
https://hpc-carpentry.github.io/hpc-parallel-novice

Update Dask Example #21

Open bkmgit opened 3 years ago

bkmgit commented 3 years ago

```python
import argparse
import sys

import numpy as np
import dask.array as da
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Worker jobs: 4 cores, 4 GB of memory, 10 minute walltime each
cluster = SLURMCluster(cores=4, processes=1, memory="4GB", walltime="00:10:00")

np.random.seed(2021)
da.random.seed(2021)


def inside_circle(total_count, chunk_size=-1):
    # Draw points in the unit square and count those inside the unit circle
    x = da.random.uniform(size=(total_count), chunks=(chunk_size))
    y = da.random.uniform(size=(total_count), chunks=(chunk_size))
    radii = da.sqrt(x * x + y * y)
    filtered = da.where(radii <= 1.0)
    indices = np.array(filtered[0])
    count = len(radii[indices])
    return count


def estimate_pi(total_count, chunk_size):
    count = inside_circle(total_count, chunk_size)
    return 4.0 * count / total_count


def main():
    parser = argparse.ArgumentParser(
        description='Estimate Pi using a Monte Carlo method.')
    parser.add_argument('n_samples', metavar='N', type=int, nargs=1,
                        default=10000,
                        help='number of times to draw a random number')
    parser.add_argument('chunk_size', metavar='N', type=int, nargs=1,
                        default=1000, help='chunk size')
    args = parser.parse_args()

    n_samples = args.n_samples[0]
    chunk_size = args.chunk_size[0]
    client = Client(cluster)
    my_pi = estimate_pi(n_samples, chunk_size)

    print("[dask version] pi is %f from %i samples with chunk size %i"
          % (my_pi, n_samples, chunk_size))
    sys.exit(0)


if __name__ == '__main__':
    main()
```
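One detail that may be worth adding: as written, the `SLURMCluster` only defines the worker job template, so no workers are ever requested and the computation can sit waiting indefinitely. A minimal sketch (the job count and sample sizes below are illustrative, not part of the lesson) of requesting workers before calling `estimate_pi`:

```python
# Sketch only: ask SLURM for worker jobs and wait for them before computing.
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(cores=4, processes=1, memory="4GB", walltime="00:10:00")
cluster.scale(jobs=2)            # submit two worker jobs to the SLURM queue
client = Client(cluster)
client.wait_for_workers(2)       # block until both worker jobs have started

# ... then call estimate_pi(n_samples, chunk_size) from the example above ...
```

Adaptive scaling via `cluster.adapt()` would be an alternative to a fixed `cluster.scale()` call.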


- It may also be worth considering [Ray](https://docs.ray.io/en/master/cluster/slurm.html); a rough sketch of what that could look like is included below.
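A very rough sketch (function names such as `count_inside` are illustrative, not from the lesson) of the same Monte Carlo estimate with Ray, assuming a Ray head and workers have already been started inside the SLURM allocation as the linked documentation describes:

```python
import numpy as np
import ray

ray.init(address="auto")  # attach to the Ray cluster started in the SLURM job


@ray.remote
def count_inside(chunk_size, seed):
    # Count the points of one chunk that fall inside the unit circle
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=chunk_size)
    y = rng.uniform(size=chunk_size)
    return int(np.count_nonzero(x * x + y * y <= 1.0))


def estimate_pi(total_count, chunk_size):
    n_chunks = total_count // chunk_size
    futures = [count_inside.remote(chunk_size, seed) for seed in range(n_chunks)]
    return 4.0 * sum(ray.get(futures)) / (n_chunks * chunk_size)


if __name__ == "__main__":
    print("[ray version] pi is %f" % estimate_pi(10_000_000, 100_000))
```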
psteinb commented 3 years ago

It may not be realistic to assume that most clusters will allow setting up a web server for viewing the scheduler.
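For what it's worth, one workaround that is often possible without exposing a public web server (an assumption about typical sites, not something from the lesson) is to leave the dashboard bound to the scheduler node and reach it through SSH port forwarding:

```python
# Sketch: print where the dashboard is being served, then tunnel to it, e.g.
#   ssh -L 8787:<scheduler-node>:8787 user@cluster.example.org   (placeholders)
# and open http://localhost:8787 locally; 8787 is Dask's default dashboard port.
from dask.distributed import Client

client = Client(cluster)        # `cluster` as defined in the example above
print(client.dashboard_link)    # URL where the scheduler dashboard is served
```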

ocaisa commented 3 years ago

I have used dask-jobqueue a lot and have organised some tutorials on it. To me, it is a great way to introduce interactive supercomputing. It really is best used through JupyterHub, though, where you get really nice visualisations, and this can also be made to work well with remote systems. There are great lessons out there in this respect, but of course they use Jupyter notebooks rather than a Carpentries template; for example, see https://github.com/ExaESM-WP4/workshop-Dask-Jobqueue-cecam-2021-02

ocaisa commented 3 years ago

There are solutions for that which would still allow us to stick (mostly) to the Carpentries template; for example, https://jekyllnb.readthedocs.io/en/latest/ used within a GitHub Action could do this.