Closed: alexkrohn closed this issue 3 years ago.

Is there a way to limit memory usage for tetrad? When I try to run tetrad on a large number of quartets (50e6 of the 74e6 total), I get this error about 8 hours into estimating the full tree (with an average of 3645 quartets per tree). I'm running this on a server with 125 GB of RAM (SSE4.2) and 80 cores running Ubuntu 16.04.

Thanks,
Alex

Error:
Did you try reducing the number of cores? This will reduce the number of running jobs and by extension increase the amount of RAM available to each job. Try cutting the number of cores in half. It'll run longer but at least it won't crash. There's not really a way to limit memory usage for any given job, so limiting the number of concurrent jobs is the only way.
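As a rough back-of-envelope (my arithmetic, assuming memory divides evenly across concurrent jobs), halving the engine count on the 125 GB / 80-core server described above roughly doubles the RAM available to each job:

```python
# back-of-envelope RAM per concurrent job, using the server specs from the issue
total_gb, cores = 125, 80
print(total_gb / cores)   # ~1.6 GB per job with 80 engines
print(total_gb / 40)      # ~3.1 GB per job with 40 engines
```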
I didn't see that flag/option in the cookbook https://ipyrad.readthedocs.io/en/master/API-analysis/cookbook-tetrad.html, so I wasn't sure it existed. What is the command? I'm running this interactively in Jupyter Notebook. Thanks!
Yeah, if you follow the notebook and use tet.run(auto=True), this will auto-launch an ipcluster instance and use all cores by default. If you don't want that, or if you want to control the number of cores, you need to do something like this:
On the command line, launch an ipcluster instance with 40 cores:

```bash
ipcluster start -n 40 --cluster-id=tetrad --daemonize
```
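As an aside (standard ipyparallel usage rather than something from this thread, so double-check against your ipyparallel version): because the cluster is daemonized, it keeps running after tetrad finishes, and it can be stopped by id:

```bash
ipcluster stop --cluster-id=tetrad
```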
In the notebook, tell tetrad to use your externally launched ipcluster:

```python
import ipyparallel as ipp

# connect to the running cluster by its id
ipyclient = ipp.Client(cluster_id="tetrad")

# tet is the tetrad object created earlier in the notebook
tet.run(ipyclient=ipyclient)
```
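For completeness, here is a minimal end-to-end sketch of the notebook side, following the linked cookbook. The analysis name and input path are placeholders, and the nquartets/nboots values just echo the numbers mentioned in this thread:

```python
import ipyparallel as ipp
import ipyrad.analysis as ipa

# connect to the externally launched 40-engine cluster
ipyclient = ipp.Client(cluster_id="tetrad")
print(len(ipyclient), "engines connected")

# placeholder name and data path; substitute your own SNPs HDF5 file
tet = ipa.tetrad(
    name="tetrad-subset",
    data="./analysis/mydata.snps.hdf5",
    nquartets=50e6,   # sample a subset of the ~74e6 possible quartets
    nboots=100,
)
tet.run(ipyclient=ipyclient)
```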
Thanks for that. It's running now -- let's see if it runs out of memory again.
Awesome! It seems to be running. Now, at 8 hours per bootstrap, and 100 bootstraps, it will only be 31 days until it has analyzed this 30 million quartet subset of my dataset 😅 I'll go ahead and close this now.