BasisResearch / collab-creatures

Analyzing animal collaboration with Bayesian and causal inference.

added the locust pipeline notebooks #39

Closed rfl-urbaniak closed 4 months ago

rfl-urbaniak commented 6 months ago

...together with everything needed to make it work in the refactored setting. Notebook formatting and testing are now streamlined: `make format` also runs black and isort on the notebooks and removes unused imports; `make lint` runs the analogous checks; and `make tests` not only runs whatever tests are in the `tests` folder, but also executes all notebooks in the `docs` folder in continuous integration (CI) mode, effectively downsizing the number of samples, iterations, etc. for the test runs.
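One common way to wire up the CI-mode downsizing described above is an environment-variable switch; here is a minimal sketch of the pattern (the function name `ci_params` and the specific defaults are hypothetical illustrations, not code from this repo):

```python
import os

def ci_params(num_samples=1000, num_iterations=2000, smoke=10):
    """Return sampling parameters, downsized when running under CI.

    CI services such as GitHub Actions set the CI environment variable,
    so notebooks can check it and shrink sample/iteration counts for
    fast smoke-test runs while keeping full-size runs locally.
    """
    if os.environ.get("CI"):
        return {"num_samples": smoke, "num_iterations": smoke}
    return {"num_samples": num_samples, "num_iterations": num_iterations}
```

A notebook would then read, e.g., `ci_params()["num_samples"]` when configuring its sampler, so the same notebook code runs full-size locally and downsized under `make tests` in CI.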

Also, the model has now been refactored, and pending further discussion the discretized modeling has been removed, since the continuous regression now gives the expected result and is much faster.

emackev commented 4 months ago

I took another look, after our discussion about possibly changing the paper figure to a new figure version.
This is the new version (in the notebook):

[image: new figure version from the notebook]

And this is the version in the overleaf:

[image: figure version from the overleaf]

I notice the proximity and trace distributions are now slightly negative, whereas before they were ~0 or slightly positive. I want to better understand the differences between the models. Is it correct that the new model is simpler and more linear, rather than compartmentalized?

Stylistically, I prefer when the three distributions are all on the same axis (or at least on axes with the same scale), so they are easy to compare.
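For reference, one generic way to get directly comparable panels in matplotlib is `sharex`/`sharey` on `plt.subplots`; this is a standalone sketch with made-up placeholder data, not code from the locust_pipeline notebook:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe for headless/CI runs
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
samples = {  # placeholder posterior samples, one array per coefficient
    "proximity": rng.normal(-0.05, 0.02, 1000),
    "trace": rng.normal(-0.03, 0.02, 1000),
    "communication": rng.normal(0.4, 0.05, 1000),
}

# sharex/sharey force all three panels onto identical scales,
# so the distributions can be compared at a glance
fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize=(9, 3))
for ax, (name, s) in zip(axes, samples.items()):
    ax.hist(s, bins=40, density=True)
    ax.set_title(name)
fig.savefig("coef_distributions.png")
```

Using a light default style (no dark background) also keeps the figure printing-friendly for the paper.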

I also went through the notebooks in ru-dynamical-locust (data, class, validate, interpret), and I want to add some documentation -- I can take a first pass at this. It's a good way to make sure I really understand it. Let me know if/when I should make a PR for this, and/or contribute to a PR you're making.

rfl-urbaniak commented 4 months ago
  • the model is not compartmentalized; it's still linear, with the sd also depending linearly on the predictors. I can talk you through the model code if you want. In principle, I think there still might be some improvements to how we perform the inference downstream from predictor derivation, but this is good enough for now and we can revisit it at a later point.
  • I also modified the figure so that it's three in one. If you have the time, you could revise your visualizations in the locust_pipeline notebook to be light-themed and printing-friendly, but I don't think this is urgent. I think the coefficients are slightly negative because, in the presence of communication, the proximity of others is actually slightly negatively predictive of where they would go (as they'd prefer communicators). Again, something to pay attention to in later modeling improvements.
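As a concrete illustration of the kind of model described in the first bullet (a linear mean with the sd also linear in the predictors, i.e. heteroskedastic linear regression), here is a minimal pure-Python sketch of one observation's log-density; the function names and the softplus link keeping the sd positive are illustrative assumptions, not the repo's actual code:

```python
import math

def softplus(x):
    """Smooth positive transform, so the sd stays > 0."""
    return math.log1p(math.exp(x))

def normal_logpdf(y, mu, sd):
    """Log-density of y under Normal(mu, sd)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((y - mu) / sd) ** 2

def loglik_point(y, x, beta, gamma, beta0=0.0, gamma0=0.0):
    """Log-likelihood of one observation under a linear model whose
    mean AND sd both depend linearly on the predictors x."""
    mu = beta0 + sum(b * xi for b, xi in zip(beta, x))
    sd = softplus(gamma0 + sum(g * xi for g, xi in zip(gamma, x)))
    return normal_logpdf(y, mu, sd)
```

With all `gamma` coefficients at zero this collapses to ordinary homoskedastic linear regression, which is why the model stays "simple and linear" rather than compartmentalized.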

emackev commented 4 months ago

Thanks, yes it'd be helpful to walk through it together, and/or to point me to which functions/versions to focus on. I'm especially trying to understand why some of the coefficients are now inferred to be negative when they were positive in the previous version of the model. Either way seems reasonable; I just want to better understand the change in the model that led to this.