BasisResearch / collab-creatures

Analyzing animal collaboration with Bayesian and causal inference.

ensure NA handling is clear and explain how nans arise #67

Open rfl-urbaniak opened 3 months ago

rfl-urbaniak commented 3 months ago
> Why are some values missing in the simulated data? (Missing values are subsequently dropped; I want to explain why they are missing in the first place.)
  • If you're at the last frame, there is no how_far, because it is supposed to be generated from the +1 frame, which doesn't exist.
  • In some cases standardization is applied to a predictor that is 0 (such as communication for birds that don't communicate); naive standardization then involves division by 0.
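
The two mechanisms above can be reproduced on toy data. This is a minimal sketch, not the project's code: the arrays `positions` and `communication` are made-up illustrations, `how_far` is computed here as a simple frame-to-frame distance (an assumption about its meaning), and the predictor is assumed to be zero in every frame.

```python
import numpy as np

# Hypothetical toy data: 1-D positions of one bird over four frames.
positions = np.array([0.0, 1.0, 3.0, 6.0])

# Mechanism 1: a forward-looking quantity needs frame t+1.
# The last frame has no successor, so its entry stays NaN.
how_far = np.full(positions.shape, np.nan)
how_far[:-1] = np.abs(positions[1:] - positions[:-1])

# Mechanism 2: naive standardization of a predictor that is zero everywhere.
# Mean and standard deviation are both 0, so every entry becomes 0/0 = NaN.
communication = np.zeros(4)
with np.errstate(invalid="ignore"):  # silence NumPy's 0/0 warning
    standardized = (communication - communication.mean()) / communication.std()

print(how_far)
print(standardized)
```

Only the last entry of `how_far` is NaN, whereas every entry of `standardized` is NaN, so dropping missing values would remove a single frame in the first case and the whole predictor in the second.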

Thanks, made edits to reflect all this, and will push them. The one thing I'm still concerned about is the exclusion of points where the predictor is zero, since I don't want data exclusions to influence the results. Standardization typically has a special case to avoid nans when dividing by zero. I'm also a bit confused because, even if the simulated birds aren't communicators, the communication predictor score should sometimes be non-zero. Happy to jump on a call to discuss.

Originally posted by @emackev in https://github.com/BasisResearch/collab-creatures/issues/66#issuecomment-1992039591
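
For reference, the kind of special case mentioned in the comment above could look like the following. It is a sketch under stated assumptions: `safe_standardize` is a hypothetical helper, not a function from this repository, and it uses one common convention (leave a zero-variance predictor centered but unscaled) rather than whatever the project actually does.

```python
import numpy as np

def safe_standardize(values):
    """Standardize to zero mean and unit variance, guarding against zero variance.

    Hypothetical helper for illustration; not taken from collab-creatures.
    """
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    scale = values.std()
    if scale == 0:
        # Constant predictor: skip the division so no 0/0 NaNs appear.
        return centered
    return centered / scale

# A constant (all-zero) predictor stays all zeros instead of turning into NaNs.
print(safe_standardize(np.zeros(4)))
```

With a guard like this, no rows are lost to NaNs from a constant predictor, which addresses the concern that data exclusions could influence the results.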

rfl-urbaniak commented 3 months ago

I think the NaNs are worth a short discussion once I've spent a bit more time reminding myself exactly how they arise and double-checking that the special case is present (I vaguely recall introducing it). For now I've just opened an issue about it, as I don't think this should block progress on the documentation.