hetio / hetnetpy

Hetnets in Python (relocated from dhimmel/hetio)
https://het.io/software
Other
93 stars 28 forks source link

hetio.permute.permute_pair_list modifies pair_list in place #28

Closed zietzm closed 5 years ago

zietzm commented 5 years ago

I recently discovered that permute_pair_list modifies the pair list it is given, in place. The returned pair list is then equivalent to the pair list that was passed as an argument to the function itself. I had not expected this behavior. There is no bug in the code that causes this. I am simply wondering if this is in fact the function's desired behavior. I could imagine that one wouldn't always want to modify an edge list in place.

image

dhimmel commented 5 years ago

I agree this is dangerous.

I searched GitHub for permute_pair_list, which returned 12 code files (all related to or using hetio). Most files are based on this notebook from Project Rephetio. Cell 5 defines treatments as a list of node pairs:

treatments = list(zip(treatment_df.compound_id, treatment_df.disease_id))

The permutation shuffles treatments in place. Cell 26 appears to assume treatments hasn't been permuted. So yes, this dangerous behavior may have caused some downstream damage. Specifically, the empiric facets of the degree-grouped edge existence probability plots here may be incorrect.

I think it makes sense for permute_pair_list to initially copy the list, unless in_place=True is specified. @zietzm want to make the PR?