Adding documentation to rhf notebook

emackev commented 6 months ago

Adding documentation. Questions along the way:

Why are some values missing in the simulated data? (missing values are subsequently dropped -- I want to provide an explanation why they are missing in the first place)
Can you confirm that how_far_score is highest at the location where the agent actually moved, drops off exponentially from there, and is only computed at locations close enough to the agent's current location that the agent is able to jump there?

emackev commented 6 months ago

To do: consider adding some explanation of how to interpret the coefficient on the visibility parameter

rfl-urbaniak commented 6 months ago

Why are some values missing in the simulated data? (missing values are subsequently dropped -- I want to provide an explanation why they are missing in the first place)

If you're at the last frame, there will be no how_far, as it is supposed to be generated from the +1 frame, which doesn't exist.
In some cases standardization is applied to a predictor that is always 0 (such as communication for birds that don't communicate); if you just standardize, this involves division by 0.
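Both sources of missing values can be reproduced on a toy data frame. This is only a sketch: the column names `x_next` and `communication_std` and all the numbers are made up for illustration and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical toy data: one bird over three frames (values are invented).
df = pd.DataFrame(
    {
        "time": [0, 1, 2],
        "x": [0.0, 1.0, 3.0],
        "communication": [0.0, 0.0, 0.0],  # this bird never communicates
    }
)

# (1) Anything derived from the next frame is undefined at the last frame,
#     because there is no frame t+1 to look at.
df["x_next"] = df["x"].shift(-1)

# (2) Naive standardization of a constant column computes 0 / 0, which is NaN.
col = df["communication"]
df["communication_std"] = (col - col.mean()) / col.std()

print(df)
print(df.dropna())  # every row contains a NaN here, so nothing survives
```

In this toy case the constant predictor alone is enough to make `dropna()` discard every row, which is the kind of exclusion worth checking for in the real data.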

Can you confirm that how_far_score is highest at the location where the agent actually moved,

Correct for the transformed score, not correct for just how_far, read below.

and drops off exponentially from there,

Not exactly; take a look at the formula in the paper. Also take a look at the data frame: there should be a few columns whose names start with how_far. If I remember well, how_far itself is a squared distance, which is then transformed and scaled:

_hf["how_far_squared"] = (_hf["x"] - x_new) ** 2 + (_hf["y"] - y_new) ** 2
_hf["how_far_squared_scaled"] = (
    -_hf["how_far_squared"]
    / (2 * (sim.step_size_max + sim.visibility_range) ** 2)
    + 1
)
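As a rough, self-contained illustration of that transformation, here is the same arithmetic on three invented candidate points. The values of `step_size_max` and `visibility_range` are assumptions standing in for the `sim` attributes, and `candidates` is a hypothetical stand-in for `_hf`.

```python
import pandas as pd

# Assumed stand-ins for sim.step_size_max and sim.visibility_range.
step_size_max = 4.0
visibility_range = 6.0

# Hypothetical candidate points the bird considers, and the point it moved to.
candidates = pd.DataFrame({"x": [0.0, 3.0, 10.0], "y": [0.0, 4.0, 0.0]})
x_new, y_new = 0.0, 0.0

# Squared Euclidean distance from each candidate to the chosen location.
candidates["how_far_squared"] = (candidates["x"] - x_new) ** 2 + (
    candidates["y"] - y_new
) ** 2

# Same scaling as in the snippet above: 1 at the chosen point, decreasing
# quadratically (not exponentially) with distance.
candidates["how_far_squared_scaled"] = (
    -candidates["how_far_squared"]
    / (2 * (step_size_max + visibility_range) ** 2)
    + 1
)

print(candidates)
```

With these numbers the squared distances are 0, 25 and 100, and the scaled scores are 1.0, 0.875 and 0.5, so the transformed score peaks at the location the agent moved to.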

and is only computed at locations close enough to the agent's current location that the agent is able to jump there?

This is computed for all the (sampled) points that the bird considers for the next move, within its visibility range. Whether the agent is capable of moving everywhere it can see is a separate issue.

consider adding some explanation of how to interpret the coefficient on the visibility parameter

Yes, so visibility is predictive because all agents in this notebook move within their visibility range, and move to closer (= more visible) points more often than to ones that are further away.

review-notebook-app[bot] commented 6 months ago

View / edit / reply to this conversation on ReviewNB

rfl-urbaniak commented on 2024-03-12T09:59:11Z ----------------------------------------------------------------

not sure whether double dash shouldn't be replaced with a single one.

interpreted as indicative of how predictive each factor is of agents' ...

emackev commented on 2024-03-12T14:00:32Z ----------------------------------------------------------------

fixed, will push

review-notebook-app[bot] commented 6 months ago

View / edit / reply to this conversation on ReviewNB

rfl-urbaniak commented on 2024-03-12T09:59:11Z ----------------------------------------------------------------

simulate

emackev commented on 2024-03-12T14:00:58Z ----------------------------------------------------------------

fixed, will push

review-notebook-app[bot] commented 6 months ago

View / edit / reply to this conversation on ReviewNB

rfl-urbaniak commented on 2024-03-12T09:59:12Z ----------------------------------------------------------------

Please make sure you re-run the whole notebook with all the cells so that all outputs exist when you're done.

emackev commented on 2024-03-12T16:21:06Z ----------------------------------------------------------------

done

emackev commented 6 months ago

Why are some values missing in the simulated data? (missing values are subsequently dropped -- I want to provide an explanation why they are missing in the first place)

If you're at the last frame, there will be no how_far, as it is supposed to be generated from the +1 frame, which doesn't exist.

In some cases standardization is applied to a predictor that is always 0 (such as communication for birds that don't communicate); if you just standardize, this involves division by 0.

Thanks, made edits to reflect all this, and will push them. The one thing I'm still concerned about is the exclusion of points where the predictor is zero, since I don't want data exclusions to influence the results. Standardization typically has a special case to avoid NaNs when dividing by zero. I'm also a bit confused because, even if the simulated birds aren't communicators, the communication predictor score should sometimes be non-zero. Happy to jump on a call to discuss.
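For reference, the kind of special case I have in mind looks roughly like the sketch below. `standardize_safe` is a hypothetical helper written for this comment, not a function from the repo, and whether centering without scaling is the right choice for these predictors is an assumption that would need discussion.

```python
import pandas as pd


def standardize_safe(col: pd.Series) -> pd.Series:
    """Standardize a column, leaving constant columns centered instead of NaN.

    Hypothetical helper: if the standard deviation is zero (or undefined),
    skip the division so no NaNs are introduced.
    """
    std = col.std()
    if pd.isna(std) or std == 0:
        return col - col.mean()
    return (col - col.mean()) / std


# A constant predictor (e.g. no communication at all) stays at 0.
zeros = standardize_safe(pd.Series([0.0, 0.0, 0.0]))

# A varying predictor is standardized as usual.
varied = standardize_safe(pd.Series([1.0, 2.0, 3.0]))

print(zeros.tolist())
print(varied.tolist())
```

With a guard like this, rows where the predictor is constant would be kept rather than dropped, so the exclusion would no longer affect which data points enter the fit.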

rfl-urbaniak commented 6 months ago

I think the NaNs are worth a short discussion once I've spent a bit more time reminding myself how exactly they arise and double-checking whether the special case is present (I vaguely recall introducing it). For now I've just opened an issue about it, as I don't think this should block progress on the documentation.

BasisResearch / collab-creatures

Adding documentation to rhf notebook #66