86 1 multidms softplus clipping

jgallowa07 commented 12 months ago

This PR implements the changes outlined in #86 and is running on the test_example data on my end. It is still a work in progress (and thus draft PR) until @Caleb-Carr and I get a chance to review the results of this branch on some data other than the test_example. After that we will outline thoughts, bugs, and feedback and re-iterate for a merga-able PR submission.

Notes to expand upon:

This updates fixes a logical bug in the func effect shifts notebook - this means that folks results will inevitably be different.
A small ridge penalty on the beta's keeps the GE fits more stable - exposed a set of kwargs for multidms fitting in func effect config
The output format of the averaged func effects notebook has changes as we're using the new interface. This still needs to be clarified
The output of the individual shifts inference notebook now includes the pickle file encoding the sweep of fits across lasso shift coefficients
Could potentially add more heatmaps in all notebooks for shifts, predicted func scores, and beta's alike. The infrastructure is simple.

jbloom commented 10 months ago

@jgallowa07, is this something I should review now, or are you still working on it?

Can you just tag me either way when it is ready for me to look at?

jgallowa07 commented 10 months ago

@jbloom Still a work in progress. I'll convert it to a full-fledged PR and request your review when it's ready for a look over. Thanks

jbloom commented 7 months ago

@jgallowa07, I was curious on the status of this pull request, and if it is a long way from merging if there is just something we can do about the issue with the current version that causes problems with newer version of pandas: https://github.com/matsengrp/multidms/issues/128

jgallowa07 commented 7 months ago

Sorry @jbloom - I should have checked in with you.

I'm guessing you have not talked with Hugh or Will about the convergence issues we're currently trying to fix by implementing a generalized fused-lasso approach? I've just validated that a prototype of this model produces very similar results on spike & simulated data, but am just now starting to hack at the actual implementation into the multidms source code. Inevitably this will require more changes the workflow here once that's finished.

So I see two options:

I could get this PR working now with a version of multidms that fixes the pandas issue if you don't want to wait on the changes above. It's just a bit of duplicated effort but I'm okay with that.
Wait till we've finished the above (~ 3ish weeks is my guess given the amount of changes needed), and disable the multidms workflow from this pipeline in the meantime.

Happy to do whatever you think is best. Also happy to set up a meeting if you have time and would like to discuss these changes.

jbloom commented 7 months ago

If it is only ~3 weeks, that is fine for now.

jgallowa07 commented 6 months ago

@jbloom Update: We've settled on an approach and I'm in the process of updating multidms. I'm hoping to have it done by EOW, and then have this finished by the end of next week.

jgallowa07 commented 6 months ago

Sorry @jbloom - I'm the only one doing the coding for multidms, and with this major refactor and balancing other projects I've fallen behind schedule a bit. I'm hopeful that I'll have a stable version by the end of this week so I can update this pipeline for you.

jgallowa07 commented 5 months ago

multidms 1.0.0 was just released. Working on this PR now.

dms-vep / dms-vep-pipeline-3

86 1 multidms softplus clipping #91