computationalprivacy / bandicoot

an open-source python toolbox to analyze mobile phone metadata
MIT License

individual.py percent_pareto_interactions formula question #28

Closed paulvercoustre closed 3 years ago

paulvercoustre commented 3 years ago

In individual.py, the return value of the function percent_pareto_interactions is computed with: return (len(user_count) - len(user_sort)) / len(records)

I do not understand why we divide by len(records). Shouldn't it be divided by len(user_count) given that we are computing the percentage of the user's contacts?

The line would thus be: return (len(user_count) - len(user_sort)) / len(user_count)
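To make the difference concrete, here is a minimal sketch of the two denominators. It is not the library's code: the helper name pareto_contacts, the plain list of contact IDs, and the 0.8 threshold are assumptions for illustration, and the real function works on bandicoot record objects.

```python
import math
from collections import Counter


def pareto_contacts(contact_ids, percentage=0.8):
    """Hypothetical helper: count the top contacts needed to cover
    `percentage` of all interactions.

    Returns (needed, n_contacts, n_records).
    """
    # Number of interactions per contact.
    user_count = Counter(contact_ids)
    # Number of interactions that must be covered.
    target = int(math.ceil(sum(user_count.values()) * percentage))
    # Contacts sorted by ascending interaction count, so pop() takes the busiest.
    user_sort = sorted(user_count, key=lambda c: user_count[c])
    while target > 0 and user_sort:
        target -= user_count[user_sort.pop()]
    # Contacts removed from user_sort are the ones that were needed.
    needed = len(user_count) - len(user_sort)
    return needed, len(user_count), len(contact_ids)


# Illustrative data: 10 interactions spread over 4 contacts.
contact_ids = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
needed, n_contacts, n_records = pareto_contacts(contact_ids)

per_record = needed / n_records    # denominator currently in the code
per_contact = needed / n_contacts  # denominator proposed in this issue
print(per_record, per_contact)
```

With this toy data, two of the four contacts account for 80% of the interactions, so dividing by the number of contacts gives 0.5, while dividing by the number of records gives 0.2, a value that shrinks as more records are added even when the contact distribution stays the same.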

Apologies if I have missed something. P

cynddl commented 3 years ago

Hi @paulvercoustre, sorry for the late response. The code should indeed use len(user_count), thanks a lot for catching this!

This has now been fixed in the master branch by @ana-mariacretu (fix here: https://github.com/computationalprivacy/bandicoot/commit/16227480e1a89b4c08164e65516ee2ac20149c98, tests here: https://github.com/computationalprivacy/bandicoot/commit/86e5f192a67dff690a06f28ed2f7b1ffdd141efb).

Please let me know if you see any other issues that need to be fixed; it is never too late to catch bugs.