ekirving / qpbrute

Heuristic search algorithm for fitting qpGraph models
MIT License

IOError not caught properly? #13

Closed DinRigtigeFar closed 3 years ago

DinRigtigeFar commented 3 years ago

Hi Evan, I'm getting the following error when running my samples:

INFO: There are 5,040 possible starting orders for the given nodes.
INFO: Performing an exhaustive search.
INFO: Starting list ['Ghana', 'Tanzania', 'Zimbabwe', 'Zambia', 'Namibia', 'Desert', 'RedRiverHog']
(DomesticPig,(Ghana,Tanzania));                                                         nodes=3  admix=0         outliers=0      worst=-0.445    7e0368d02a75
  (DomesticPig,(Tanzania,(Ghana,Zimbabwe)));                                            nodes=4  admix=0         outliers=18     worst=-247.515  080fc282e70e
  (DomesticPig,(Ghana,(Tanzania,Zimbabwe)));                                            nodes=4  admix=0         outliers=1      worst=-4.961    28f1efced581
  (DomesticPig,(Zimbabwe,(Ghana,Tanzania)));                                            nodes=4  admix=0         outliers=18     worst=247.515   843a591cd903
  (DomesticPig,(a1,(Tanzania,(Ghana,(Zimbabwe)a1))));                                   nodes=4  admix=1         outliers=18     worst=-247.515  33e224b45aae
  (DomesticPig,((Ghana,(Zimbabwe)a1),(Tanzania,a1)));                                   nodes=4  admix=1         outliers=1      worst=4.967     1a51f14106bd
  (DomesticPig,(a1,(Ghana,(Tanzania,(Zimbabwe)a1))));                                   nodes=4  admix=1         outliers=0      worst=-0.198    511d7d31c9be
    (DomesticPig,(a1,(Ghana,((Zimbabwe)a1,(Tanzania,Zambia)))));                        nodes=5  admix=1         outliers=36     worst=-62.213   4c4f946df8d6
    (DomesticPig,(a1,(Ghana,(Tanzania,(Zambia,(Zimbabwe)a1)))));                        nodes=5  admix=1         outliers=1      worst=-3.435    b305e1937dbe
    (DomesticPig,(a1,(Ghana,(Zambia,(Tanzania,(Zimbabwe)a1)))));                        nodes=5  admix=1         outliers=36     worst=62.140    c9c283922ad3
    (DomesticPig,(Zambia,(a1,(Ghana,(Tanzania,(Zimbabwe)a1)))));                        nodes=5  admix=1         outliers=44     worst=280.096   4e727820a3fd
    (DomesticPig,((Zambia,a1),(Ghana,(Tanzania,(Zimbabwe)a1))));                        nodes=5  admix=1         outliers=52     worst=253.206   f41ae2a28245
    (DomesticPig,(a1,(Zambia,(Ghana,(Tanzania,(Zimbabwe)a1)))));                        nodes=5  admix=1         outliers=44     worst=280.096   682cf9ee0c20
    (DomesticPig,(a1,(Ghana,(Tanzania,(Zambia,Zimbabwe)a1))));                          nodes=5  admix=1         outliers=25     worst=-17.454   63baa586040e
    (DomesticPig,(a1,((Tanzania,(Zimbabwe)a1),(Ghana,Zambia))));                        nodes=5  admix=1         outliers=44     worst=280.096   78db4267265f
    (DomesticPig,(a1,(Ghana,((Zimbabwe,a2)a1,(Tanzania,(Zambia)a2)))));                 nodes=5  admix=2         outliers=0      worst=-0.178    805c719a971a
    (DomesticPig,(a1,((Ghana,(Zambia)a2),(Tanzania,((Zimbabwe)a1,a2)))));               nodes=5  admix=2         outliers=1      worst=-3.443    ce1f2143834d
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 318, in run_qpgraph
    with open(log_file, "r") as fin:
FileNotFoundError: [Errno 2] No such file or directory: 'allWart/graphs/allWart-8cace67687b1.log'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/pathos/helpers/mp_helper.py", line 15, in <lambda>
    func = lambda args: f(*args)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 335, in run_qpgraph
    log = run_cmd(
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/utils.py", line 63, in run_cmd
    raise RuntimeError(f"ERROR: '{err}'; RETCODE:{proc.returncode}\n" + " ".join(cmd))
RuntimeError: ERROR: 'b'fatalx:\nUnable to allocate 1 unit(s) for item \n''; RETCODE:-6
qpGraph -p ParFile_allWarts -g allWart/graphs/allWart-8cace67687b1.graph -d allWart/graphs/allWart-8cace67687b1.dot
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jfl323/miniconda3/envs/qpbrute/bin/qpBrute", line 8, in <module>
    sys.exit(qpbrute())
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 892, in qpbrute
    permute_qpgraph(
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 808, in permute_qpgraph
    pq.find_graph()
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 718, in find_graph
    self.recurse_tree(root_tree, self.nodes[1], self.nodes[2:])
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 123, in recurse_tree
    node_placed = self.check_results(results, remaining, depth)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 235, in check_results
    self.recurse_tree(new_tree, remaining[0], remaining[1:], depth + 1)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 174, in recurse_tree
    node_placed = self.check_results(results, remaining, depth)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 235, in check_results
    self.recurse_tree(new_tree, remaining[0], remaining[1:], depth + 1)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 171, in recurse_tree
    results = self.test_trees(admix_trees, depth)
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/qpbrute/qpbrute.py", line 206, in test_trees
    results = pool.map(
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/pathos/multiprocessing.py", line 137, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/multiprocess/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/jfl323/miniconda3/envs/qpbrute/lib/python3.8/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
RuntimeError: ERROR: 'b'fatalx:\nUnable to allocate 1 unit(s) for item \n''; RETCODE:-6  
qpGraph -p ParFile_allWarts -g allWart/graphs/allWart-8cace67687b1.graph -d allWart/graphs/allWart-8cace67687b1.dot

I installed qpBrute using conda, as you recommend:

conda env create --name qpbrute --file https://raw.githubusercontent.com/ekirving/qpbrute/master/environment.yaml

ekirving commented 3 years ago

Thanks for reporting this error. From the stack trace I can see that the error is coming from inside of qpGraph itself, and that the earlier FileNotFoundError exception is simply used for control-flow inside qpBrute.
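To illustrate why the FileNotFoundError shows up in the trace even though it is harmless, here is a minimal sketch of the "try the cached log first, otherwise run the command" pattern. The function name `cached_log` and its arguments are hypothetical and this is not the actual qpBrute source; it only mirrors the shape visible in the traceback.

```python
import subprocess
from pathlib import Path


def cached_log(log_file, cmd):
    """Return the log text, running `cmd` only if the log file is missing."""
    try:
        # Fast path: an earlier run already wrote this log.
        with open(log_file, "r") as fin:
            return fin.read()
    except FileNotFoundError:
        # Slow path: no log yet, so run the command now.
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            # Raising inside the `except` block makes Python chain the two
            # errors and print "During handling of the above exception,
            # another exception occurred", even though the missing file
            # was expected and only this error is the real failure.
            raise RuntimeError(f"ERROR: '{proc.stderr}'; RETCODE:{proc.returncode}")
        Path(log_file).write_text(proc.stdout)
        return proc.stdout
```

When reading such a chained trace, the last exception in the chain (here the RuntimeError) is the one to investigate.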

I've never seen this specific error before, but the message ("Unable to allocate") suggests that qpGraph failed to allocate memory. Is it possible that you ran out of RAM during the run? If you have a large dataset, then running many models in parallel can result in a high memory load.
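The `RETCODE:-6` in the error is also informative. On POSIX systems, Python's `subprocess` reports a negative return code `-N` when the child process was terminated by signal `N`, and signal 6 is SIGABRT on Linux. That would be consistent with qpGraph aborting itself after the failed allocation, although that reading of qpGraph's behaviour is an assumption. The helper below is a hypothetical sketch for decoding such codes:

```python
import signal


def describe_returncode(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # Negative values mean "terminated by signal -returncode" (POSIX only).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-returncode} ({name})"
    return f"exited with status {returncode}"


print(describe_returncode(-6))
```

By contrast, a process killed by the Linux out-of-memory killer would normally show up as signal 9 (SIGKILL).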

What happens if you run the problematic command on its own?

qpGraph -p ParFile_allWarts -g allWart/graphs/allWart-8cace67687b1.graph -d allWart/graphs/allWart-8cace67687b1.dot

If it does turn out to be a memory issue, I would recommend you run a single model to determine the peak memory usage. Once you know how much RAM a single model requires, you can divide your available RAM by that figure and use the --threads parameter to limit the number of concurrent models accordingly.
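As a rough sketch of that calculation: the code below runs one command, reads the peak resident memory of child processes from the standard `resource` module, and derives a thread count. The function names and the 80% headroom factor are assumptions made for illustration, not part of qpBrute. Note that `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS, and that it covers the largest child waited for so far in this Python process, so it is best run in a fresh interpreter.

```python
import resource
import subprocess


def peak_child_rss(cmd):
    """Run `cmd` to completion and return the peak RSS of child processes."""
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    # Kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


def max_threads(available, peak_per_model, headroom=0.8):
    """Number of models that fit in `available` memory (same units as the peak).

    `headroom` reserves part of the RAM for everything else; 0.8 is an
    arbitrary illustrative choice.
    """
    return max(1, int(available * headroom // peak_per_model))


# Example: 64 GB available and 6 GB per model, both expressed in KB.
print(max_threads(64 * 1024 * 1024, 6 * 1024 * 1024))
```

Here `cmd` would be the single qpGraph command as a list of arguments, and the resulting number is what you would pass to `--threads`.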

DinRigtigeFar commented 3 years ago

Thanks for the quick reply. It does seem to be a memory problem. I'll limit the threads and run your awesome program again. Thanks!

g-fabbri commented 4 months ago

@DinRigtigeFar how did you figure out it was a memory problem? I got the same error as you. I took the qpGraph command that failed from the log file and ran it on its own, with the same resources I had requested when running qpBrute, and it completed without any problems. I'm not sure whether running this single qpGraph command is what Evan meant when he said: "run a single model to determine the peak memory usage."