bd2kccd / causal-cmd

16 stars 8 forks source link

"json-graph" option makes memory overflow after normal search ended at grasp #89

Closed yasu-sh closed 1 year ago

yasu-sh commented 1 year ago

I feel the great progress on the grasp algorithm from 1.5.0 based on TETRAD 7.2.0 since apatche's common FastMath library. In my case the 50 times shorter at the hailfinder open dataset. (500secs to ca. 10secs as below)

By the way, json-graph output looks faling at the 90% probability even the memory is not cosumed at the search stage. Could you check if the json-output would work well?

The dataset is the same I attached at the other issue the other day. But I attached again for your time.

PS E:\Tetrad> java -Xmx25G -jar "causal-cmd-1.6.1-jar-with-dependencies.jar" --data-type discrete --delimiter comma --score bdeu-score --algorithm grasp --dataset dt.hailfinder.csv --test g-square-test --verbose --json-graph
Initial knowledge sort order = [VISCloudCov, AMInsWliScen, MorningCIN, SubjVertMo, CapInScen, LLIW, AMInstabMt, AreaMesoALS, (cut)
SatContMoist, N34StarFcst, TempDis, WindAloft, Dewpoints, QGVertMotion, MorningBound, MeanRH, LatestCIN]
new thread
Edges: 101      |        Score Improvement: 10.562570   |        Tucks Performed: [[AMDewptCalPl, AMInsWliScen], [AreaMoDryAir, CombVerMo], [CurPropConv, LLIW]] [MorningCIN, CapInScen]
Edges: 102      |        Score Improvement: 78.000100   |        Tucks Performed: [] [WindFieldPln, Date]
(cut)
Edges: 69       |        Score Improvement: 4.727821    |        Tucks Performed: [] [ScenRelAMIns, ScnRelPlFcst]
Edges: 69       |        Score Improvement: 0.000000    |        Tucks Performed: [[ScenRel34, ScnRelPlFcst], [LowLLapse, Scenario], [InsChange, CompPlFcst]] [Scenario, RHRatio]
# Edges = 69 Score = -988112.904869999 (GRaSP) Elapsed 11.015 s
Final order = [AMDewptCalPl, RHRatio, Scenario, LowLLapse, ScnRelPlFcst, ScenRel34, ScenRelAMIns, AMInsWliScen, LIfr12ZDENSd, WindFieldPln, MidLLapse, ScenRelAMCIN, MorningCIN, AMCINInScen, N07muVerMo, SubjVertMo, QGVertMotion, CombVerMo, AreaMesoALS, RaoContMoist, SatContMoist, CombMoisture, AreaMoDryAir, IRCloudCover, VISCloudCov, CombClouds, CldShadeOth, (cut)
Date, SynForcng, MvmtFeatures, WindFieldMt, Dewpoints, WindAloft, TempDis, MeanRH]
Elapsed time = 12.578 s
Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
        at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
        at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
        at java.base/java.lang.StringBuffer.append(StringBuffer.java:317)
        at java.base/java.io.StringWriter.write(StringWriter.java:106)
        at com.google.gson.stream.JsonWriter.newline(JsonWriter.java:654)
        at com.google.gson.stream.JsonWriter.beforeName(JsonWriter.java:669)
        at com.google.gson.stream.JsonWriter.writeDeferredName(JsonWriter.java:401)
        at com.google.gson.stream.JsonWriter.value(JsonWriter.java:537)
        at com.google.gson.internal.bind.TypeAdapters$7.write(TypeAdapters.java:259)
        at com.google.gson.internal.bind.TypeAdapters$7.write(TypeAdapters.java:241)
        at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:70)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.write(ReflectiveTypeAdapterFactory.java:196)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.write(ReflectiveTypeAdapterFactory.java:368)
        at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:70)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.write(ReflectiveTypeAdapterFactory.java:196)
   (intermediate loop stacks are omitted)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.write(ReflectiveTypeAdapterFactory.java:368)
        at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:70)

PS E:\Tetrad> java -version openjdk version "11.0.18" 2023-01-17 OpenJDK Runtime Environment Temurin-11.0.18+10 (build 11.0.18+10) OpenJDK 64-Bit Server VM Temurin-11.0.18+10 (build 11.0.18+10, mixed mode)

OS Name: Microsoft Windows 10 Pro OS Version: 10.0.19044 N/A Build 19044 Name : Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz Total Physical Memory: 32,381 MB dt.hailfinder.csv

yasu-sh commented 1 year ago

image This the folder contents after execution. json file size has 0 kB whereas .txt has 5kB.

kvb2univpitt commented 1 year ago

@yasu-sh Thanks for attaching the dataset again for convenient. I'll take a look at it and get back to you on this.

yasu-sh commented 1 year ago

@kvb2univpitt Thanks for your action on this. I also checked at GUI, tetrad-gui-7.2.2-launch.jar. It also made the similar error when it made json-output. I wonder if the cause would be located in tetrad-lib.

Steps:

  1. starts GUI.
  2. places Data box.
  3. loads data: hailfinder with discrete mode.
  4. places Search box and connects from Data.
  5. searches with algorithm "GRaSP".
  6. places Graph box and connects from Search.
  7. From Graph box menu, exports at json format.
  8. OutOfMemory error displays on the console window.
kvb2univpitt commented 1 year ago

@yasu-sh Yes, you are correct! This is a tetrad-lib issue. I was having the same problem in the tetrad-gui.

jdramsey commented 1 year ago

I'll try to have a look at this soon. J-Joe

yasu-sh commented 1 year ago

@kvb2univpitt @jdramsey Thanks for your respose.

@yasu-sh Yes, you are correct! This is a tetrad-lib issue. I was having the same problem in the tetrad-gui.

I tried at different versions with the same dataset, hailfinder. I think this comes from Tetrad versions between 7.1.3 and 7.2.2(or 7.2.0?) causal-cmd versions

You might already notice this. I just expected this could be a clue for further findings.

espinoj commented 1 year ago

I need this function so I will attempt to fix @jdramsey

jdramsey commented 1 year ago

@espinoj Oh, thanks!

yasu-sh commented 1 year ago

@espinoj Thanks for your words on this issue! I hope you make this successfully solved.

yasu-sh commented 1 year ago

@espinoj @jdramsey Now I understand what happend when toJson runs. circular refrences happens in graph class. you cannot serialize due to the character-array-size limitation of byte in gson.

Paths and Underlines class are defines in graph class, and Paths and Underlines classes define graph class. After I addedtransient modifiers to graph, the error disappears and I can use the json data.

image

But I faced I cannot import json graph in tetrad UI. The root cause is the change of graph class since Endpoint class does not fit for importing json.

public enum Endpoint implements TetradSerializable {
    TAIL, ARROW, CIRCLE, STAR, NULL;
    static final long serialVersionUID = 23L;
}

So far I propose we made temporal solution, transient modifiers to two classes, Paths and Underlines. Could you give me some comments? I am not professional on Java, so I could make some overlooks on this.

jdramsey commented 1 year ago

We had someone who was going to look into the JSON issue; let me forward this to her.

jdramsey commented 1 year ago

I see your issue though--let me think a second...

jdramsey commented 1 year ago

You were right about the cyclic reference, my bad. I fixed that in my branch. The other problem I don't know how to deal with yet because I'm not a JSON expert. The issue is JSON doesn't seem to be able to read enums. back in, even though they're saved correctly in the JSON file. Hmm.... someone must have figured this out on google... I will look around for a solution...

jdramsey commented 1 year ago

@yasu-sh I fixed the JSON graph saving and loading bugs in Tetrad development branch and put the revised jar into py-tetrad (so rpy-tetrad) but in order to get it into causal-cmd Kevin needs to make a new version of causal-cmd. I think he's on vacation, but I'll ping him.

@kvb2univpitt I fixed the JSON saving and loading bug. Should we make an update to causal-cmd?

yasu-sh commented 1 year ago

@jdramsey I deeply appreciate your quick actions and fix this issue. I understand some of what you did. Underlines class is not refered by graph and related member variables are pushed into graph class... Many changes are required at derived classes.

I will try it at development branch and wil get back to you. I hope this fix is reflected to causal-cmd soon.

jdramsey commented 1 year ago

@yasu-sh Thanks!

yasu-sh commented 1 year ago

@jdramsey It works! You're so skilled! Json serialization did not work even small size at recent releases. It is helpful for using structured outputs at python, R and other languages.

By the way, I noticed that the json export method like Gsonbuilder(toJson i.e. serialization) works even if the cancel is pressed in file selection window in checking the phenomena in debug mode. Finally try-catch must work, but more sophiscated way could be if then like the code as below: image

jdramsey commented 1 year ago

@yasu-sh Oh wow! This is all new territory for me, but this looks great! I haven't done a lot of JSON saving and loading but it looks easy to do. Maybe I can do it for more types of objects in Tetrad.

jdramsey commented 1 year ago

@yasu-sh How does this help you from the point of view of R?

yasu-sh commented 1 year ago

@jdramsey Thanks for your comment!

Json serialization / import to R can omit troublesome text parsing. This is the screenshot of my sample project as below. every node name is anonymized. image

In my case, edgelist or bootstrapping result in edgesSet can be loaded by jsonlite package in R. R has dedicated functions for handle table data in one-line, ex. sapply, apply, mapply. So I said, structured data is beneficial for R, Python.

yasu-sh commented 1 year ago

I note that the specification/strucure of classes change always affects the json structure. So far json importing method works for me.

jdramsey commented 1 year ago

@yasu-sh Well, my goal is to have the class structures make sense so there will no urge to change them, of course. So it's nice to know about thing like reference cycles--thanks!

yasu-sh commented 1 year ago

Everything solved. Thank you very much!

yasu-sh commented 1 year ago

@kvb2univpitt I would appreciate if causal-cmd could be updated. Not urgent, but in coming weeks.

kvb2univpitt commented 1 year ago

@kvb2univpitt I would appreciate if causal-cmd could be updated. Not urgent, but in coming weeks.

Sure. I can update causal-cmd. @jdramsey Are you going to make an official release that includes this change?

jdramsey commented 1 year ago

Sure. I have it all prepared and have to do the release. Maybe later today? I need to back and find my checklist so I don't mess up. I think it's on Slack...

kvb2univpitt commented 1 year ago

@jdramsey Sounds good. Please let me know if I can help. Sorry, I have been busy with other things and haven't done anything with Tetrad.

jdramsey commented 1 year ago

@kvb2univpitt No worries :) 7.3.4 is now published! :) So anytime you have time to publish causal-cmd I'll push the button.

yasu-sh commented 1 year ago

@kvb2univpitt @jdramsey Thanks a lot. New tetrad version 7.4.0 is available and I downloaded, it works! I have noticed a small issue on GUI on Bootstrapping(probably just cosmetic). I will report on it on another issue in tetrad repos.

jdramsey commented 1 year ago

Great, glad it worked! We'll let you know when causal-cmd is updated as well.

yasu-sh commented 1 year ago

I have noticed a small issue on GUI on Bootstrapping(probably just cosmetic)

I failed in reproducing. I can report it in future when I face it. I've had no problems. Bootstrapping dialog disappeared when you secondly use search box with added knowledge box.

jdramsey commented 1 year ago

Let me know if you can reproduce it...

kvb2univpitt commented 1 year ago

@jdramsey @yasu-sh I just make a release of causal-cmd-1.9.0. @jdramsey I created a release branch for you to push to maven public repo: https://github.com/bd2kccd/causal-cmd/tree/release-v1.9.0

jdramsey commented 1 year ago

Just published it :)

https://s01.oss.sonatype.org/content/repositories/releases/io/github/cmu-phil/causal-cmd/1.9.0/

yasu-sh commented 1 year ago

@kvb2univpitt @jdramsey Thanks for your release! I confirmed on two algorithms: fges, grasp and fas(I thought it must have been simplest graph). fges and graps were okay after some modification(ex. endpoint's contents became 0 to character-TAIL)

A error came up with me when I just tried fas for simplest case. I am not sure why error happens.

I thought I need to try different algorthms and arguments. I did not try omit --json-graph this time. I will get back to this issue.

javaw -Xmx20G -jar \"causal-cmd-1.9.0-jar-with-dependencies.jar\" --data-type discrete --json-graph --delimiter comma --quote-char \\" --algorithm fas --dataset \"dt.tetrad.csv\" --prefix c-tetrad --knowledge prior.txt --seed 42 --numberResampling 0 --test g-square-test --alpha 0.05 --depth 10 --fasHeuristic 3 --stableFAS" java.lang.UnsupportedOperationException: Print to out for FAS is not used. at edu.cmu.tetrad.search.Fas.setOut(Fas.java:342) at edu.cmu.tetrad.algcomparison.algorithm.oracle.cpdag.Fas.search(Fas.java:64) at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.lambda$runSearch$3(TetradRunner.java:214) at java.base/java.lang.Iterable.forEach(Iterable.java:75) at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.runSearch(TetradRunner.java:213) at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.runAlgorithm(TetradRunner.java:130) at edu.pitt.dbmi.causal.cmd.CausalCmdApplication.runTetrad(CausalCmdApplication.java:152) at edu.pitt.dbmi.causal.cmd.CausalCmdApplication.main(CausalCmdApplication.java:116)

jdramsey commented 1 year ago

Oh I see... @kvb2univpitt Is causal-cmd calling this method for all algorithms?

kvb2univpitt commented 1 year ago

@jdramsey Yes, causal-cmd does call the method void setOut(PrintStream out) for all algorithm. This is how algorithms write out to log file. However, causal-cmd doesn't call that method directly. It set the parameter printStream with a writer. The class edu.cmu.tetrad.algcomparison.algorithm.oracle.cpdag.Fas DIRECTLY calls the void setOut(PrintStream out) method inside the method public Graph search(DataModel dataSet, Parameters parameters):

@Override
public Graph search(DataModel dataSet, Parameters parameters) {
    if (parameters.getInt(Params.NUMBER_RESAMPLING) < 1) {
        ...

        Object obj = parameters.get(Params.PRINT_STREAM);
        if (obj instanceof PrintStream) {
            search.setOut((PrintStream) obj);
        }

        return search.search();
    } else {
        ...
    }
}

The class edu.cmu.tetrad.algcomparison.algorithm.oracle.cpdag.Fas needs to be fixed by removing the following:

 Object obj = parameters.get(Params.PRINT_STREAM);
  if (obj instanceof PrintStream) {
      search.setOut((PrintStream) obj);
  }
jdramsey commented 1 year ago

@kvb2univpitt Oh, I see. Let me fix that, then.

yasu-sh commented 1 year ago

@kvb2univpitt @jdramsey Thanks for quick resposne and confirmation. The Error message directly indicated the error happening. the error happens even at simple case. I confirmed several constraint algorithms. it looks working except fas.

java '-Duser.language=en' -jar causal-cmd-1.9.0-jar-with-dependencies.jar --algorithm fas --data-type continuous --dataset Retention.txt --delimiter tab --test sem-bic-test
java.lang.UnsupportedOperationException: Print to out for FAS is not used.
        at edu.cmu.tetrad.search.Fas.setOut(Fas.java:342)
        at edu.cmu.tetrad.algcomparison.algorithm.oracle.cpdag.Fas.search(Fas.java:64)
        at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.lambda$runSearch$3(TetradRunner.java:214)
jdramsey commented 1 year ago

@yasu-sh Thanks so much! I know exactly what happened. :-)

jdramsey commented 1 year ago

By the way, thanks so much for checking these things. I would like to have a version of Tetrad soon with no known bugs for any search algorithms using any method of accessing them. So your comments are so helpful.

jdramsey commented 1 year ago

Also btw there is a but I know about on Windows, which I discovered using my new windows machine. In the interface, if you start a long process and try to use the cancel button to stop it, Tetrad won't stop. It may take a little work to fix that bug; I know that the bug it though.

yasu-sh commented 1 year ago

@jdramsey Yes. I sometimes experience slow or no response in search box as you noticed. After the cancellation sometimes makes the some UI parts missing or needed to restart tetrad since the search does not work any more. If the symptons become fixed, it will be great news for me.

This is my trial code to check different algorithms. Not covered everything but no needed to input everything. Powershell code: this is just a sample code "grasp", "ccd","cpc","cstar","fas","fci","fcimax","pc","pc-mb","pcmax","rfci","svar-fci" | ForEach-Object{try{java '-Duser.language=en' -jar causal-cmd-1.9.0-jar-with-dependencies.jar --algorithm $_ --data-type continuous --dataset Retention.txt --delimiter tab --test sem-bic-test} catch { Write-Output "Error at: $_"} finally { Write-Output "Executed Algorithm: $_, Last Exit Code: $LASTEXITCODE"}} | Tee-Object -FilePath "./test-algorithms-checked.txt" Indication of screen: algorithm-check-verbose.txt

jdramsey commented 1 year ago

Nice. Thanks--I'll look at that stuff...

yasu-sh commented 1 year ago

@jdramsey @kvb2univpitt Thanks a lot. I hope you're enjoying.