xgi-org / xgi

CompleX Group Interactions (XGI) is a Python package for higher-order networks.
https://xgi.readthedocs.io

Add stats features #518

Closed nwlandry closed 7 months ago

nwlandry commented 7 months ago

Partially addresses #514.

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.05%. Comparing base (0f6b389) to head (75bdd77). Report is 2 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #518      +/-   ##
==========================================
+ Coverage   92.03%   92.05%   +0.01%
==========================================
  Files          60       60
  Lines        4370     4378       +8
==========================================
+ Hits         4022     4030       +8
  Misses        348      348
```


maximelucas commented 7 months ago

Looks good thanks Nich! Do you think there's a way to use numpy's argmin/max to get all indices rather than just the first? Would we want that?

nwlandry commented 7 months ago

> Looks good thanks Nich! Do you think there's a way to use numpy's argmin/max to get all indices rather than just the first? Would we want that?

That could be good, but unfortunately numpy does the same thing (returns the first max/min). I can look more into it though!
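
For context, here is a minimal sketch of the behaviour under discussion: `np.argmax` returns only the first index of the maximum, but every tied index can be recovered by comparing against the extreme value. The `degrees` array is made-up illustration data, not output from XGI.

```python
import numpy as np

# Made-up degree values with ties at both the maximum and the minimum.
degrees = np.array([3, 1, 4, 4, 2, 1])

# argmax stops at the first occurrence of the maximum.
first_max = int(np.argmax(degrees))

# Comparing against the extreme value recovers every tied position.
all_max = np.flatnonzero(degrees == degrees.max())
all_min = np.flatnonzero(degrees == degrees.min())

print(first_max)  # 2
print(all_max)    # [2 3]
print(all_min)    # [1 5]
```

The positions returned here are array indices, so they would still need to be mapped back to node or edge IDs.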

nwlandry commented 7 months ago

I found this: https://stackoverflow.com/questions/25762332/how-to-get-all-the-keys-with-the-same-highest-value, but I'm not sure if we want to return an iterable. Let me know what you think!
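
A rough sketch of that idea applied to a plain dictionary of ID-to-value pairs. The helper name `all_argmax` and the `degree` dict are hypothetical, chosen only for illustration; this is not the implementation in this PR.

```python
def all_argmax(stat):
    """Return a list of every key in `stat` whose value equals the maximum.

    `stat` is assumed to be a plain dict mapping IDs to comparable values.
    """
    if not stat:
        return []
    best = max(stat.values())
    # Dicts preserve insertion order, so tied keys come back in that order.
    return [key for key, value in stat.items() if value == best]


# Hypothetical node-degree mapping with a tie for the maximum.
degree = {"a": 2, "b": 5, "c": 5, "d": 1}
ids = all_argmax(degree)
print(ids)  # ['b', 'c']
```

Returning a list rather than a single ID is what raises the consistency question: a caller would have to handle one ID from `argmax` but a collection from a function like this.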

maximelucas commented 7 months ago

> That could be good, but unfortunately numpy does the same thing (returns the first max/min). I can look more into it though!

Ah true, my bad.

> but I'm not sure if we want to return an iterable. Let me know what you think!

Yea I'm not sure either honestly. Maybe let's keep it as it is for now (it's pretty standard to return just the first occurrence), and see later? You documented it so it should be okay for now.

Ultimately, say we're looking for the nodes that have minimum degree. We might want to get all of them, not just the first. If we don't implement this by default, would there be a way to do it by hand by iterating for the user? And if we return all of them by default, we should check for format consistency with other stats to avoid potential problems.

nwlandry commented 7 months ago

> > That could be good, but unfortunately numpy does the same thing (returns the first max/min). I can look more into it though!
>
> Ah true, my bad.
>
> > but I'm not sure if we want to return an iterable. Let me know what you think!
>
> Yea I'm not sure either honestly. Maybe let's keep it as it is for now (it's pretty standard to return just the first occurrence), and see later? You documented it so it should be okay for now.
>
> Ultimately, say we're looking for the nodes that have minimum degree. We might want to get all of them, not just the first. If we don't implement this by default, would there be a way to do it by hand by iterating for the user? And if we return all of them by default, we should check for format consistency with other stats to avoid potential problems.

I think that we can accomplish this in two ways: (1) add the ability to argsort() and (2) make a recipe to show how to get all max IDs. I added a note to #514 and opened Issue #520 corresponding to the recipe.
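
To make the two options concrete, here is a small sketch of how an argsort over ID-to-value pairs could be combined with a tie-collecting step. It works on a plain dict standing in for a stat; the function names `argsort` and `leading_ties` are hypothetical and are not the API that was eventually added.

```python
from itertools import takewhile


def argsort(stat, reverse=False):
    """Return the IDs of `stat` ordered by value.

    `sorted` is stable, so tied IDs keep their insertion order.
    """
    return sorted(stat, key=stat.__getitem__, reverse=reverse)


def leading_ties(stat, order):
    """Return the IDs at the front of `order` that share the first ID's value."""
    if not order:
        return []
    top = stat[order[0]]
    return list(takewhile(lambda node: stat[node] == top, order))


# Hypothetical degree mapping with ties at both extremes.
degree = {1: 3, 2: 1, 3: 4, 4: 4, 5: 1}

ascending = argsort(degree)
descending = argsort(degree, reverse=True)

all_min_ids = leading_ties(degree, ascending)
all_max_ids = leading_ties(degree, descending)

print(ascending)    # [2, 5, 1, 3, 4]
print(descending)   # [3, 4, 1, 2, 5]
print(all_min_ids)  # [2, 5]
print(all_max_ids)  # [3, 4]
```

With this split, `argmax`/`argmin` can keep returning a single ID, and users who want every tied ID can opt in through the recipe.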