zoey-rw closed this issue 7 months ago.
Hi, I'm sorry you are experiencing this, it sure sounds frustrating. The multiprocessing module in Python can sometimes be a bit unstable. We sometimes see similar problems, and they are often related to individual processes generating a lot of log output, which can make the thread pool unresponsive. The following might help (I would try them in that order):
import multiprocessing

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")
3. Run some of the later samples with threads=1 to see error messages that may have been missed before (a sketch of this follows below).
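For the threads=1 suggestion, here is a minimal sketch of what that could look like. The file path, the "sample_id" column, and the database name are assumptions rather than details from your setup, and the argument names follow my reading of micom.workflows.build, so adjust as needed:

import pandas as pd

from micom.workflows import build

# Hypothetical taxonomy table; adjust the path and column names to your data.
taxonomy = pd.read_csv("taxonomy.csv")

# Take a few of the samples that show up late in the run, assuming a
# "sample_id" column identifies samples.
late_samples = taxonomy["sample_id"].unique()[-5:]
subset = taxonomy[taxonomy["sample_id"].isin(late_samples)]

if __name__ == "__main__":
    # threads=1 runs the builds serially so tracebacks are not swallowed
    # by the worker pool.
    manifest = build(
        subset,
        model_db="custom_db.qza",  # placeholder for your custom database
        out_folder="debug_models",
        threads=1,
    )

Running serially means a failing sample raises its traceback directly instead of disappearing into the worker pool.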
For RAM you should plan on 1-2 GB per thread, so in your case I would start by allocating 56 GB and then reduce that later based on actual use.
Hopefully one of those will help.
Thanks for these solutions - the first 2 seemed to help for the 16 taxa/100 samples situation, but the build process is still crashing when I scale up to ~50 taxa/600 samples.
Even with the Jupyter notebook flag set to "--NotebookApp.iopub_msg_rate_limit=1.0e10", or when running from the command line, the logging messages are still too frequent to see any progress bar updates. This is the type of warning that seems to generate most of the log output:
WARNING Reaction UF03564_E__pseudogymnoascus seems to be an exchange reaction but its ID does not start with 'EX_'... community.py:323
Is there a way to safely turn off logging for that warning, rather than changing all the reaction IDs? One of the following commands suppresses them when the parallel start method is not "spawn", but the warnings show up anyway when multiprocessing.set_start_method("spawn", force=True) is used:
import logging
logging.getLogger("micom.Community").setLevel(logging.ERROR)
logging.getLogger("micom").setLevel(logging.ERROR)
logging.getLogger("micom.logger").setLevel(logging.ERROR)
Nesting these calls underneath the multiprocessing setup doesn't suppress the warnings either:
import logging
import multiprocessing

if __name__ == '__main__':
    multiprocessing.set_start_method("spawn", force=True)
    logging.getLogger("micom.Community").setLevel(logging.ERROR)
    logging.getLogger("micom").setLevel(logging.ERROR)
    logging.getLogger("micom.logger").setLevel(logging.ERROR)
    manifest = build(...)
The following approach worked for suppressing FutureWarnings from pandas, but I'm not sure if it can apply to the micom warnings:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import pandas as pd
Okay, I think this works for reducing the logging while preserving the multiprocessing:
import logging
import multiprocessing

logging.getLogger("micom").setLevel(logging.ERROR)
logging.getLogger("micom.logger").setLevel(logging.ERROR)

if __name__ == '__main__':
    multiprocessing.set_start_method("spawn", force=True)
    # Route multiprocessing's own log output to stderr, errors only.
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(logging.ERROR)
    manifest = build(...)
I will mark this as closed!
Nice, thanks for investigating. Spawn will become the default method soon. I tested it and there is no notable performance hit. For the logging it should usually work with something like:
from micom.logger import logger
import logging
logger.setLevel(logging.ERROR)
All other modules just import that one logger. But I think this particular warning can be converted to a DEBUG message. The exchange reaction inference in cobrapy is sophisticated enough by now that this is probably not that serious. I will line that up for the next release.
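Putting that together with the spawn start method, a minimal sketch of the pattern might look like the following; I can't confirm from this thread whether the level also needs to be set again inside the spawned workers:

import logging
import multiprocessing

from micom.logger import logger  # the single logger shared by all micom modules

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn", force=True)
    # Only let errors through from micom itself.
    logger.setLevel(logging.ERROR)
    # ... then call build() with the usual arguments, as in the snippets above.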
Problem description
After creating a custom database, the "build" command sometimes stalls at 99% when creating a manifest from 100+ samples and 16 taxa. If I open a new Python session and run the same build command, it occasionally creates the manifest successfully. I'm also encountering stalling at 99% complete with the fix_medium command (same dataset).
I'm using 28 cores so I don't think it's a lack of RAM, but would appreciate any ideas for debugging!
Code Sample
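The original snippet is not preserved here, so the following is only a hedged reconstruction of the kind of call described above. The paths, the taxonomy columns, and the database name are placeholders, and the argument names follow my reading of micom.workflows.build:

import pandas as pd

from micom.workflows import build

# Placeholder taxonomy table: ~1410 rows covering 100+ samples and 16 taxa.
taxonomy = pd.read_csv("taxonomy.csv")

if __name__ == "__main__":
    manifest = build(
        taxonomy,
        model_db="custom_db.qza",  # custom database built earlier (placeholder name)
        out_folder="models",
        threads=28,                # one thread per core on the 28-core machine
    )
    # fix_medium() is then run on the same data and stalls at 99% in the
    # same way (its arguments are omitted here).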
Context
Other session info:
The beginning of the tax file looks like this, with 1410 rows total: