SystemsGenetics / KINC

Knowledge Independent Network Construction
MIT License
11 stars 4 forks source link

138 cond test mpirun #139

Closed spficklin closed 4 years ago

spficklin commented 4 years ago

Description

This PR fixes issue #138. To summarize, there was a bug if the cond-test analytic was used with MPI. All of the clusters would have significant edges. This is because the p-value tests never ran because the class was not properly instantiated for the MPI slaves. Since the p-value defaults to 0 it looked like everything was significant. This only affected results if MPI was used.

To fix this, I updated the initialize function of the ConditionTest class so that the slaves could initialize their instance of the class.

Also, in the process of debugging this, I fixed a lot of weirdness in the code. It's a bit more streamlined in terms of layout now. This weirdness did not affect the results. It just made the code harder to figure out. I think it's a bit easier now.

Testing

To test this PR, do the following prior to pulling this branch:

  1. Run the example/kinc-gmm-run.sh script to get correct results. Save these results in a temp directory for later comparison
  2. Run the example/kinc-gmm-mpirun.sh script. You'll see that all of pairs have signficant tests with a p-value of 0.
  3. Next pull this branch of the code and re-run step #2.
  4. Compare the results of step #3 with step #1. There will be a little bit of discrepancy but not much. The differences are due to issue #137 and all should be fine.

Additional Info

@4ctrl-alt-del or @bentsherman would one of you look over the code to make sure I'm following design standards and I didn't do anything inappropriate with my edits.

bentsherman commented 4 years ago

@spficklin Is is possible for the cluster_size to ever be zero when reading from the ccm/cmx? If an output pair has no clusters in similarity or corrpower it isn't saved, so I don't think we need to even worry about that condition.

bentsherman commented 4 years ago

@spficklin You're code looks good and ConditionalTest looks way cleaner than before. My goodness. I removed the check for zero clusters because that should never happen, both similarity and corrpower (and cond-test as well for that matter) are designed to not save pairs with zero clusters so we don't have to worry about that possibility. Ready to merge!

spficklin commented 4 years ago

Hey @bentsherman that check was necessary. I agree there shouldn't be clusters that are size zero but there were. Without that check the analytic was failing.

spficklin commented 4 years ago

Nevermind... looks like all is good and zero-sized clusters was some side effect of the problem. It seems fine now.