cmu-phil / py-tetrad

Makes algorithms/code in Tetrad available in Python via JPype
MIT License

How can I get only the stable graph when running the LiNG-D algorithm? #21

Closed wangzhuofan closed 3 months ago

wangzhuofan commented 4 months ago

I generated data from a linear non-Gaussian structural equation model and ran run_ica_lingd() on it. LiNG-D prints several candidate models, only one of which is stable, and that stable model is exactly what I want. I'm running simulations with many replicates, so picking the stable model out of all the printed candidates for every replicate is impractical. Is there a way to return only the stable model? Thanks!

jdramsey commented 4 months ago

Hold on, I may have broken something... let me fix the bug first and then I'll adjust that code, sorry...

jdramsey commented 4 months ago

I've fixed the bug and adjusted the code, but I want to think about the LiNG-D algorithm a bit more.

jdramsey commented 4 months ago

I've made several improvements in LiNG-D, which I'm going to post soon for py-tetrad. However, after reading your comment, I realized that I didn't do exactly what you suggested, which was to print only the stable models. Let me do that before posting.

I changed the parameterization a little. Now, there's a W threshold and a B threshold. The W threshold sends small values in the W matrix to zero for running the N Rooks procedure, and the B threshold sends small coefficients to zero for the final graphs. This makes more sense and is pretty intuitive.
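As a rough illustration of what the two thresholds do, here is a minimal sketch in plain Python. It is not Tetrad's actual code: the helper name apply_threshold, the example matrices, and the choice of a strict "less than" comparison are all assumptions made for illustration.

```python
def apply_threshold(matrix, threshold):
    """Return a copy of `matrix` with entries of magnitude below `threshold` set to 0.0.

    Hypothetical helper for illustration; whether Tetrad compares with
    "<" or "<=" is an assumption here.
    """
    return [[0.0 if abs(x) < threshold else x for x in row] for row in matrix]


# Hypothetical 3x3 unmixing matrix W, as an ICA step might estimate it.
W = [
    [1.00, 0.00003, -0.52],
    [0.31, 1.00, 0.00008],
    [-0.00002, 0.47, 1.00],
]

# Hypothetical coefficient matrix B for a candidate model.
B = [
    [0.00, 0.04, 0.52],
    [-0.31, 0.00, 0.02],
    [0.07, -0.47, 0.00],
]

# W threshold: prune near-zero entries of W before the N Rooks step.
W_pruned = apply_threshold(W, 1e-4)

# B threshold: prune small coefficients before building the final graph.
B_pruned = apply_threshold(B, 0.1)
```

The nonzero pattern that survives in B_pruned is what would determine the edges of the final graph in this sketch.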

I had stupidly introduced a bug in the ICA procedure used by ICA-LiNGAM and ICA-LiNG-D. I fixed that bug and added unit tests to prevent it from happening again.

Both procedures work well in simulation for acyclic models, and LiNG-D produces stable models.
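For readers who want to check stability on their own coefficient matrices, here is a minimal sketch using NumPy. It assumes the usual LiNG-D criterion that a model is stable when every eigenvalue of its coefficient matrix B has modulus strictly less than 1. The function name is_stable and the example matrices are hypothetical, and this is not a copy of Tetrad's implementation.

```python
import numpy as np


def is_stable(B):
    """Return True if every eigenvalue of B has modulus strictly below 1.

    Hypothetical helper; assumes the eigenvalue-modulus criterion for stability.
    """
    eigenvalues = np.linalg.eigvals(np.asarray(B, dtype=float))
    return bool(np.all(np.abs(eigenvalues) < 1.0))


# Two-variable cycle with weak feedback: eigenvalues are +/- sqrt(0.5 * 0.4).
stable_B = [[0.0, 0.5],
            [0.4, 0.0]]

# Same cycle with strong feedback: eigenvalues are +/- sqrt(2.0 * 1.5).
unstable_B = [[0.0, 2.0],
              [1.5, 0.0]]

# Acyclic model: B is nilpotent, so all eigenvalues are zero.
acyclic_B = [[0.0, 0.0],
             [0.8, 0.0]]

stable_result = is_stable(stable_B)
unstable_result = is_stable(unstable_B)
acyclic_result = is_stable(acyclic_B)
```

Under this criterion an acyclic model is always stable, since its coefficient matrix can be permuted to strictly triangular form.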

Here's what I'll do to fix your issue. If the verbose flag is set to false, I'll print only the stable models, and if it is set to true, I'll additionally print the unstable models. I also added some code to make sure duplicate models don't get printed, which should help cut down on the output length, even if verbose is true.
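The selection logic described here can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function select_models, the representation of a model as a nested list of coefficients, and the toy stability predicate are all hypothetical, and the real logic lives in the Java code.

```python
def select_models(models, is_stable, verbose=False):
    """Drop duplicate models; keep stable ones, plus unstable ones if verbose.

    `models` is a list of coefficient matrices (nested lists) and `is_stable`
    is a predicate on one matrix. Both are assumptions for this sketch.
    """
    seen = set()
    selected = []
    for model in models:
        key = tuple(tuple(row) for row in model)  # hashable form for duplicate detection
        if key in seen:
            continue
        seen.add(key)
        if verbose or is_stable(model):
            selected.append(model)
    return selected


def two_cycle_is_stable(B):
    """Toy stand-in predicate, valid only for 2x2 matrices with a zero diagonal.

    The eigenvalues of [[0, a], [b, 0]] are +/- sqrt(a * b), so their modulus
    is below 1 exactly when |a * b| < 1.
    """
    return abs(B[0][1] * B[1][0]) < 1.0


stable_model = [[0.0, 0.5], [0.4, 0.0]]
unstable_model = [[0.0, 2.0], [1.5, 0.0]]
candidates = [stable_model, [[0.0, 0.5], [0.4, 0.0]], unstable_model]

quiet = select_models(candidates, two_cycle_is_stable, verbose=False)
loud = select_models(candidates, two_cycle_is_stable, verbose=True)
```

Duplicates are detected by exact equality of coefficients in this sketch; a real implementation might compare graphs or rounded coefficients instead.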

Let me fix that, test, and post a new 'current' jar to py-tetrad, test there, and push to GitHub. I'll also update the documentation.

jdramsey commented 4 months ago

OK, it's working on the Java end. Let me put it into py-tetrad now and test it.

jdramsey commented 4 months ago

OK, if you do a git pull, you should get the updated versions of ICA-LiNGAM and ICA-LiNG-D. They can be used as indicated in the run_continuous.py module:

print('ICA-LiNGAM')
search.run_ica_lingam(threshold_b=0.1)
print(search.get_string())

## Set verbose to True to print unstable models; otherwise, only stable models will be printed.
print('ICA-LiNG-D')
search.set_verbose(False)
search.run_ica_lingd(threshold_b=1, threshold_w=1e-4)
## The algorithm will return one of the stable models, or an empty graph if there is none. But the above should
## print all of the stable models if verbose is set to False.
# print(search.get_string())

Let me know if you have trouble

jdramsey commented 4 months ago

I also updated the Javadoc documentation. Here's the link for LiNG-D:

https://www.phil.cmu.edu/tetrad-javadocs/7.6.4-snapshot/edu/cmu/tetrad/search/IcaLingD.html

wangzhuofan commented 4 months ago

Thank you so much! It works well in my simulations to get the stable model. This helps me a lot!

jdramsey commented 4 months ago

Awesome! :-)

Are you able to close the issue now?

jdramsey commented 4 months ago

Oh maybe not--hold on--this isn't in the published version of Tetrad yet--just in py-tetrad. We may publish a new version soon.

jdramsey commented 4 months ago

By the way, someone else had asked about LiNG-D, and I updated the API a bit further:

https://github.com/cmu-phil/py-tetrad/issues/22

jdramsey commented 3 months ago

This is done, closing.