jrbourbeau closed this pull request 9 months ago.
Note the major change here is that I'm currently only processing the first 1000 directories. This takes ~3.5 minutes with Coiled Functions scaling up to 100 workers (the default adaptive limit). Doing the full 6000+ directories is probably too slow for this example. Some options would be to allow ourselves to adaptively scale to more VMs or to use the new threads_per_worker kwarg.
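For reference, a minimal sketch of what the threads_per_worker approach could look like; `process_directory` and the directory list are hypothetical stand-ins, not the notebook's actual code:

```python
import coiled

# Hypothetical sketch: threads_per_worker lets each VM work on several
# directories concurrently, so fewer workers can cover the full dataset.
@coiled.function(threads_per_worker=8)
def process_directory(path: str) -> str:
    # ... the example's real per-directory processing would go here ...
    return path

directories = [f"dir-{i:04d}" for i in range(6000)]  # placeholder names
results = list(process_directory.map(directories))
```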
I'd go with one of those options. IMO the example becomes a lot less compelling if we restrict its size to what we could easily handle. "Sorry, too big" isn't really the look I want for Coiled.
Let's change the default adaptive maximum for functions.
sounds good
What do people think about also changing the example so that workload runs on single-core ARM workers? That's the best way I've found to run this.
(If we do this, I propose we also tweak something so that the scheduler is not a single-core machine in this case.)
Sounds good -- I've found

```python
arm=True
cpu=1
spot_policy="spot_with_fallback"
```

to be a good setting for churning through lots of small files
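Put together, that could look something like the sketch below; the workload function itself is a hypothetical placeholder:

```python
import coiled

# Sketch combining the settings above; single-core ARM spot workers
# tend to be a cheap way to churn through many small files.
@coiled.function(
    arm=True,                          # ARM instances
    cpu=1,                             # single-core workers
    spot_policy="spot_with_fallback",  # prefer spot VMs, fall back to on-demand
)
def process_file(path: str) -> str:
    # ... per-file processing would go here ...
    return path
```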
I'm going to merge this in -- happy to follow up if folks have additional comments though
Closes https://github.com/coiled/examples/issues/33