@yasu-sh Good point about t-depth. I removed this (old) parameter from the Tetrad manual in my branch, sundry_fixes_7_3_0, which will be merged into development after review. In any case, FGES is no longer using that parameter, though it is using the maxDegree parameter.
@yasu-sh By the way, I'm going to take a break from py-tetrad and look at some of these other issues that have been reported again...
@yasu-sh I'll pull down Hailfinder again and see if I can figure out what's going on with the times, though I guess I want to know what's happening in Tetrad first...
@yasu-sh Oh, that is interesting--I put this example in py-tetrad, and FGES definitely hangs, but BOSS gets it almost immediately. I commented on this before and recommended removing two variables, but let me think about it some more: why is FGES hanging? This doesn't seem to be specific to causal-cmd...
@yasu-sh (Sorry I'll move this to another project, trying to make py-tetrad more stable...)
@yasu-sh I built this in a py-tetrad project and once again observed that BOSS and GRaSP get it almost right away, but, as you say, FGES hangs if you include the variables with 11 categories.
@jdramsey Thanks for your reply and investigation. I am satisfied to know that the tDepth issue will be solved soon. I have seen a comment somewhere that "FGES is good for sparse networks, like 2-3 average degrees." Specs on v1.6.1: maxDegree = 1000 (default, configurable), tDepth = 10000 (fixed).
You found that data columns with many categories cause a combinatorial explosion in the recent release. That makes sense. I think the symptom is reasonable, and I have decided to move on to GRaSP. This issue would be time-consuming and is not urgent, so let's move forward in small steps. I might be overlooking something. Our topic here is py-causal, isn't it?
Yeah--py-tetrad. Anyway I'll keep the FGES problem in the back of my mind...
@jdramsey I ran debug executions and compared releases v6.8.0 and v6.8.1.
The key code is insertEval: `double bump = insertEval(a, b, T, naYX, hashIndices);`
I wonder whether you found some problematic behavior in skipping already-found cliques and therefore changed to exhaustive execution of the insertEval function in v6.8.1.
ex. 1: the for loop may try to access outside the bounds of TNeighbors at `for (int i = 0; i <= TNeighbors.size(); i++)`.
ex. 2: `break FOR;` inside the while loop stops everything in the function.
I think this is the main cause of the many log-gamma executions: in the v6.8.0 code, the for loop exits early via `break FOR;` (ex. 2 above).
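To show what I mean by ex. 1 and ex. 2, here is a minimal standalone sketch (plain Java, not the actual Tetrad source; `TNeighbors` and the early-exit condition are only stand-ins):

```java
import java.util.List;

// Hypothetical illustration of the two patterns above, not Tetrad's real code.
public class LoopPatterns {

    // ex. 1: with "i <= size()" the last iteration has i == size().
    // That is fine if i is used only as a subset *size*, but it would throw
    // IndexOutOfBoundsException if i were used as an element *index*.
    static void boundExample(List<String> tNeighbors) {
        for (int i = 0; i <= tNeighbors.size(); i++) {
            // tNeighbors.get(i);  // would throw when i == tNeighbors.size()
            System.out.println("subset size = " + i); // safe: i is only a count here
        }
    }

    // ex. 2: a labeled break ("break FOR;") exits the *outer* loop from inside
    // the inner while loop, so the remaining candidates are never evaluated.
    // Without it, every candidate would be scored (the "exhaustive" behavior,
    // and therefore many more score / log-gamma calls).
    static void labeledBreakExample(int candidates) {
        int evaluated = 0;
        FOR:
        for (int i = 0; i < candidates; i++) {
            while (true) {
                evaluated++;
                boolean stopEarly = (i == 2); // stand-in for the early-exit condition
                if (stopEarly) {
                    break FOR; // stops everything in the enclosing for loop
                }
                break; // leave only the while loop, continue with the next i
            }
        }
        System.out.println("evaluated " + evaluated + " of " + candidates);
    }

    public static void main(String[] args) {
        boundExample(List.of("A", "B", "C"));
        labeledBreakExample(10); // prints "evaluated 3 of 10"
    }
}
```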
This change would also affect the parallelization design of FGES.
For the dt.hailfinder.csv (bnlearn) dataset, the chunk-based parallel division effectively works on one thread only (see the sketch below).
But your implementation is the reference implementation of FGES, and I may be mistaken; other datasets may give other outcomes. I hope you remember your intention behind this change. I think your improvement was probably in the right direction.
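Here is a minimal sketch of what I mean by the one-thread behavior. This is only my reading of the symptom, assuming work is split into fixed-size chunks and each chunk becomes one parallel task; it is not a copy of the Fges code, and the chunk size is a hypothetical number:

```java
// Minimal sketch: if the chunk size is at least as large as the number of
// items to process, only one chunk (one parallel task) is ever created,
// so only one thread does the work regardless of how many are available.
public class ChunkSketch {
    public static void main(String[] args) {
        int numVariables = 56;   // hailfinder has 56 variables
        int chunkSize = 100;     // hypothetical chunk size >= numVariables
        int numChunks = (numVariables + chunkSize - 1) / chunkSize;
        System.out.println("parallel tasks created: " + numChunks); // prints 1
    }
}
```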
Thanks! I will look into that soon.
I am glad to hear that. I tried to imagine the intention and trace through the execution step by step, but gave up. For example, I checked the choice generator's results; 0-based and 1-based indexing can create inconsistencies (this often happens in robotics).
The function referenced in the screenshot in the previous comment is edu.cmu.tetrad.search.Fges#calculateArrowsForward.
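A minimal sketch of the 0-based vs 1-based mix-up I have in mind (generic Java, not the Tetrad choice generator itself; the producer/consumer split is hypothetical):

```java
import java.util.List;

// If one part of the code produces 1-based positions and another consumes
// them as 0-based indices, the first element is silently skipped and the
// last access runs past the end of the list.
public class IndexBaseSketch {
    public static void main(String[] args) {
        List<String> nodes = List.of("A", "B", "C");

        // Producer reports positions 1..n (1-based)...
        for (int pos = 1; pos <= nodes.size(); pos++) {
            int index = pos; // ...but the consumer treats the value as 0-based.
            if (index < nodes.size()) {
                System.out.println("position " + pos + " -> " + nodes.get(index)); // "A" is never printed
            } else {
                System.out.println("position " + pos + " -> out of bounds"); // unguarded, this would throw
            }
        }
    }
}
```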
@jdramsey I think it is time to close this; we can move on to the new algorithms.
Could you consider removing the tDepth option and its manual description? The tDepth option is shown by --help-all in causal-cmd v1.6.1, but it did not work, as the table below shows. I noticed this while checking what changed between v1.2.1 and v1.2.2.
Also, I noticed that both Tetrad 6.8.1 (in the UI) and causal-cmd 1.2.2 accept tDepth, and tDepth = 1 gives a similar calculation time for hailfinder, as in #83. If you set tDepth = 3 in causal-cmd v1.2.2, the time becomes considerably worse, since the algorithm counts combinations of T or H in the depth direction. I guess this may be for avoiding local optima (I read the code a little). I definitely noticed that @jdramsey put a lot of effort into this and finally concluded that tDepth = 10000 cannot be changeable (e.g., the hashmaps: https://www.jetbrains.com/help/idea/tutorial-find-a-memory-leak.html).
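To show roughly why a larger tDepth gets so much more expensive, here is a minimal sketch. This is only my reading of the behavior, not the actual Tetrad code, and the T-set size of 20 is a hypothetical example:

```java
// If the algorithm scores every subset of the T (or H) set up to the given
// depth, the number of scored subsets grows as sum_{k=0..depth} C(|T|, k).
public class TDepthCount {
    static long binomial(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - i + 1) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        int tSize = 20; // hypothetical size of the T set for a dense variable
        for (int depth : new int[]{1, 3, 10}) {
            long subsets = 0;
            for (int k = 0; k <= Math.min(depth, tSize); k++) {
                subsets += binomial(tSize, k);
            }
            // depth 1 -> 21 subsets, depth 3 -> 1351, depth 10 -> far more
            System.out.println("depth " + depth + ": " + subsets + " subsets to score");
        }
    }
}
```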
Some other users may expect FGES to be ultra-fast, so I wanted to report the current figures. Thanks for your improvements to Tetrad and for enabling new scientific findings.
tDepth acceptance table:
Command-line help description:
In the Tetrad manual:
https://cmu-phil.github.io/tetrad/manual/#:~:text=Value%20Type%3A%20String-,tDepth,-Short%20Description%3A
Execution code example:
PS > (Measure-Command { java -Xmx25G -jar ".\causal-cmd-1.4.1-jar-with-dependencies.jar" --data-type discrete --delimiter comma --algorithm fges --score bdeu-score --dataset dt.hailfinder.csv --verbose --default | Out-Default }).TotalSeconds
(The --default option is used where possible.) Dataset: hailfinder (56 variables x 20,000 samples, from bnlearn).