cmu-phil / py-tetrad

Makes algorithms/code in Tetrad available in Python via JPype
MIT License
59 stars, 11 forks

Results printing stopped #7

Closed Zarmas closed 1 year ago

Zarmas commented 1 year ago

Hello, I tried running py-tetrad from a Python script on a dataset with 10,000 variables and 50 samples. Py-tetrad starts running and printing results, but after a few hours of running, the results are no longer being printed, while the process keeps running for a few more hours before stopping. Below is the Python script I used; I am wondering whether I am running it incorrectly.

import jpype.imports

try:
    jpype.startJVM(classpath=[f"resources/tetrad-gui-current-launch.jar"])
    # jpype.startJVM("-Xmx40g", classpath=[f"resources/tetrad-gui-current-launch.jar"])
except OSError:
    print("JVM already started")

from edu.cmu.tetrad.util import Params, Parameters
import pandas as pd
import tools.translate as tr
import tools.TetradSearch as search
import edu.cmu.tetrad.search as ts

df = pd.read_csv("resources/dataset.txt", sep="\t")
data = tr.pandas_data_to_tetrad(df)
params = Parameters()

score = ts.SemBicScore(data)
score.setPenaltyDiscount(10.0)
score.setStructurePrior(0)

fges_graph = ts.Fges(score)
fges_graph.setMaxDegree(3)
fges_graph.setParallelized(True)
fges_graph.setVerbose(False)

print('FGES', fges_graph.search())

jdramsey commented 1 year ago

I don't suppose you could share your data with me (by email, not publicly)? I'd love to give it a shot myself and watch it. I did run a 20,000-variable example before, and it finished, so I'm curious what the difference might be.

jdramsey commented 1 year ago

It occurred to me that one thing you could try (which I was going to try) would be to turn parallelization off.

bja43 commented 1 year ago

You say: "Py-tetrad starts running and printing results, but after a few hours of running, the results are no longer being printed, while the process keeps running for a few more hours before stopping."

Is the program still running after printing the graph from print('FGES', fges_graph.search()), or do the algorithm's progress updates stop printing for a few hours before that final print is called?

Zarmas commented 1 year ago

I am sorry for not explaining it properly. The printing stops during the verbose printing part. If I set verbose to false, no results are returned either. I tried setting parallelized to false, but then it stops printing even earlier.

bja43 commented 1 year ago

Odd, the only thing I can think of is maybe it's caught up collecting garbage? Hey @jdramsey, do you know how to specify memory allocation for jpype?

jdramsey commented 1 year ago

Oh hold on, my dyslexia may have gotten the better of me here. Your code says:

jpype.startJVM(classpath=[f"resources/tetrad-gui-current-launch.jar"])
# jpype.startJVM("-Xmx40g", classpath=[f"resources/tetrad-gui-current-launch.jar"])

If you reverse the commenting, it should allow you to use more RAM:

# jpype.startJVM(classpath=[f"resources/tetrad-gui-current-launch.jar"])
jpype.startJVM("-Xmx40g", classpath=[f"resources/tetrad-gui-current-launch.jar"])

I have a 64G Mac, so 40G is fine for me, but you'd need to put a number in that your machine can support.
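As a rough illustration of sizing the heap to the machine, the sketch below derives the `-Xmx` flag from the amount of RAM and passes it to `jpype.startJVM` ahead of the classpath, as in the line above. The helper names `heap_flag` and `start_jvm` and the 0.625 fraction are assumptions made for this example (the fraction simply reproduces 40 GB on a 64 GB machine); they are not part of py-tetrad or JPype.

```python
def heap_flag(total_ram_gb: float, fraction: float = 0.625) -> str:
    """Build a JVM max-heap flag such as "-Xmx40g" from the machine's RAM.

    `fraction` is an illustrative choice, not a recommendation.
    """
    heap_gb = max(1, int(total_ram_gb * fraction))
    return f"-Xmx{heap_gb}g"


def start_jvm(total_ram_gb: float,
              jar: str = "resources/tetrad-gui-current-launch.jar") -> None:
    """Start the JVM once, passing the heap limit before the classpath."""
    # Imported here so heap_flag() can be used without JPype installed.
    import jpype

    if jpype.isJVMStarted():
        # JVM options only take effect when the JVM is first started.
        print("JVM already started; the heap size cannot be changed now")
        return
    jpype.startJVM(heap_flag(total_ram_gb), classpath=[jar])
```

Calling `start_jvm(64)` would request a 40 GB heap. The call has to run before any other code starts the JVM, otherwise the option is ignored.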

On Thu, Jul 6, 2023 at 10:42 PM Bryan Andrews @.***> wrote:

Odd, the only thing I can think of is maybe it's caught up collecting garbage? Hey @jdramsey, do you know how to specify memory allocation for jpype?


jdramsey commented 1 year ago

@Zarmas Is this fixed for you? Are you able to close it? I'm thinking of doing some more experimentation with large variable sets this morning; I can let you know how it goes.

Zarmas commented 1 year ago

I am running tests with the commenting reversed as you suggested. Judging by the verbose output, the search reaches edge additions with scores close to 0 and then starts adding more edges with higher scores, but it runs very slowly: about one edge every 2 hours on a system with 14 threads and 120 GB of RAM, for a dataset with 10,000 variables and 50 samples. Comparing this with the verbose output from before the change, it seems the printing previously stopped right after reaching the near-0-score edge addition mentioned above. With the commenting reversed, it keeps running and printing the verbose output normally, just slowly.

jdramsey commented 1 year ago

Interesting. So that was some progress, at least. You could raise the penalty discount even higher. I was doing a 20,000 variable dataset with 50 samples for someone else, and I raised the penalty discount to 30, and it came back very quickly with 7 edges (way too sparse). The point is, if you can get it to come back quickly, you can lower the penalty discount until it returns at a reasonable time.
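The tuning strategy described here (start with a high penalty discount, then lower it while runs still come back in reasonable time) can be sketched as a small loop. This is only an illustration: `lower_penalty_until_slow`, `run_search`, and `budget_seconds` are hypothetical names, and `run_search` stands for whatever function runs FGES with a given penalty discount. The loop cannot interrupt a search that is already running, so a run that is far too slow still has to be stopped by hand.

```python
import time


def lower_penalty_until_slow(run_search, penalties, budget_seconds,
                             clock=time.perf_counter):
    """Try penalty discounts from high to low.

    Returns (penalty, result) for the lowest penalty whose run finished
    within budget_seconds, or None if even the highest one was too slow.
    """
    best = None
    for penalty in sorted(penalties, reverse=True):
        start = clock()
        result = run_search(penalty)      # blocks until the search returns
        elapsed = clock() - start
        if elapsed > budget_seconds:
            break                         # lower penalties would be slower still
        best = (penalty, result)
    return best
```

For example, `lower_penalty_until_slow(my_fges_run, [30, 20, 12, 10], budget_seconds=3600)` would keep the sparsest-to-densest progression going until a run takes longer than an hour, where `my_fges_run` is a function you supply.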

I just wrote out a script in R to analyze 20,000 variables with N ~ 50; I set a couple of parameters on FGES to speed up the process. Give me a few minutes and I'll translate this into Python. Setting "verbose" to true allows you to watch the progress. If it's too slow, you can stop it and rerun with a higher penalty.

setwd("~/py-tetrad/pytetrad")
library(reticulate)

data <- read.table("data.txt", header=TRUE)

source_python("tools/TetradSearch.py")

ts <- TetradSearch(data)
ts$use_sem_bic(penalty_discount=12)
ts$set_verbose(TRUE)

ts$run_fges(faithfulness_assumed = TRUE, parallelized = TRUE)

print(ts$get_string())
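A Python counterpart of the R script above might look like the following sketch. It calls the same `TetradSearch` methods with the same arguments as the R version and assumes it is run from the `pytetrad` directory of a py-tetrad checkout, as the `setwd` call does. The wrapper name `run_fges_from_file` is made up for this example, and the method signatures may differ in other py-tetrad versions.

```python
def run_fges_from_file(path, penalty_discount=12.0):
    """Run FGES with the SEM BIC score on a whitespace-delimited data file."""
    # Imported lazily: these need pandas and a py-tetrad checkout on the path.
    import pandas as pd
    import tools.TetradSearch as search

    # R's read.table(header=TRUE) splits on whitespace; sep=r"\s+" mirrors that.
    data = pd.read_csv(path, sep=r"\s+")

    ts = search.TetradSearch(data)
    ts.use_sem_bic(penalty_discount=penalty_discount)
    ts.set_verbose(True)  # lets you watch the progress, as noted above

    ts.run_fges(faithfulness_assumed=True, parallelized=True)
    return str(ts.get_string())
```

Usage would be `print(run_fges_from_file("data.txt"))`. If the run is too slow, stop it and call the function again with a higher `penalty_discount`.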

jdramsey commented 1 year ago

I just spent several days fixing Tetrad to help it analyze a problem with 20,000 variables and N = 50. Let me know if you're interested in the results. :-)

Zarmas commented 1 year ago

I am interested.

jdramsey commented 1 year ago

Almost there. We defined a new Restricted BOSS algorithm that runs BOSS (a new permutation algorithm) on a restricted subset of the data. We're testing it now on some data. I'll let you know.

Zarmas commented 1 year ago

I updated py-tetrad 9 days ago, and I am now getting a new error message that I do not get when running an older version of py-tetrad with the same parameters on the same dataset. The new version also appears to stop sooner than the old one: it stopped at about half the edge additions, after printing the following error message multiple times with different variables.

java.util.concurrent.ExecutionException: java.lang.RuntimeException: Singularity encountered when scoring variable1 | variable2variable3
    at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
    at edu.cmu.tetrad.search.Fges.calculateArrowsForward(Fges.java:684)
    at edu.cmu.tetrad.search.Fges.access$500(Fges.java:86)
    at edu.cmu.tetrad.search.Fges$1AdjTask.call(Fges.java:559)
    at edu.cmu.tetrad.search.Fges$1AdjTask.call(Fges.java:507)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
Caused by: java.lang.RuntimeException: Singularity encountered when scoring variable1 | variable2variable3
    at edu.cmu.tetrad.search.score.SemBicScore.localScore(SemBicScore.java:290)
    at edu.cmu.tetrad.search.score.SemBicScore.localScoreDiff(SemBicScore.java:248)
    at edu.cmu.tetrad.search.Fges.scoreGraphChange(Fges.java:1045)
    at edu.cmu.tetrad.search.Fges.insertEval(Fges.java:742)
    at edu.cmu.tetrad.search.Fges.access$600(Fges.java:86)
    at edu.cmu.tetrad.search.Fges$1EvalTask.call(Fges.java:645)
    at edu.cmu.tetrad.search.Fges$1EvalTask.call(Fges.java:626)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinTask.doJoin(ForkJoinTask.java:396)
    at java.base/java.util.concurrent.ForkJoinTask.quietlyJoin(ForkJoinTask.java:1078)
    at java.base/java.util.concurrent.ForkJoinPool.invokeAll(ForkJoinPool.java:2519)
    at edu.cmu.tetrad.search.Fges.calculateArrowsForward(Fges.java:680)
    ... 9 more

jdramsey commented 1 year ago

@Zarmas Oh, I know the answer to this one! I'll fix it! What happened was, we thought it would be a good idea to let the user know of the existence of singularities in their data, so I added exceptions that would tell them this. But what I should have done was return NaN in these cases and just print the singularities to the console.

I will go through and fix this before the next version is released (soon).
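The pattern described in this comment (return NaN and print a note rather than throwing) can be illustrated in Python. This is not Tetrad's Java code: `SingularityError`, `safe_local_score`, and `best_candidate` are hypothetical names used only to show how a search can keep going when one candidate cannot be scored.

```python
import math


class SingularityError(RuntimeError):
    """Hypothetical stand-in for the Java RuntimeException in the trace above."""


def safe_local_score(score_fn, node, parents):
    """Return score_fn(node, parents), or NaN (with a console note) on a singularity."""
    try:
        return score_fn(node, parents)
    except SingularityError:
        print(f"Singularity encountered when scoring {node} | {', '.join(parents)}")
        return math.nan


def best_candidate(score_fn, node, candidate_parent_sets):
    """Pick the highest-scoring parent set, skipping candidates whose score is NaN."""
    scored = [(safe_local_score(score_fn, node, ps), ps)
              for ps in candidate_parent_sets]
    usable = [(s, ps) for s, ps in scored if not math.isnan(s)]
    return max(usable, key=lambda pair: pair[0], default=None)
```

With this shape, a singular candidate is reported once on the console and then ignored, so the remaining candidates are still evaluated instead of the whole search aborting.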

jdramsey commented 1 year ago

@Zarmas By the way, the new method I was telling you about--we don't have an actual result to report yet, but it looks promising for the 50 x 20000 case. It works well in simulation; we were just hoping to have an analysis finished for a real dataset for which we could give the results, but that may have to wait a few months. (It's nearly there.)

jdramsey commented 1 year ago

I fixed the singularity issue in my branch. I'll try to move that fix over to py-tetrad soon.

jdramsey commented 1 year ago

@Zarmas I just updated the Tetrad jar in py-tetrad to fix the singularity issue for you. If you update py-tetrad you should get the fix.

jdramsey commented 1 year ago

Hi @Zarmas, let me know if you still have more issues. I'm focusing on fixing issues this week and next, from now through October 15.

jdramsey commented 1 year ago

Hi @Zarmas, let me know when you have that new analysis ready to look at. I'm going to close this issue for bookkeeping, but you can open a new one anytime or email me.