pik-copan / pyunicorn

Unified Complex Network and Recurrence Analysis Toolbox
http://pik-potsdam.de/~donges/pyunicorn/
Other
195 stars 87 forks source link

BUG: dtype mismatch error in vg.retarded_local_clustering() #163

Closed pucicu closed 1 year ago

pucicu commented 2 years ago

When calling vg.retarded_local_clustering() a dtype mismatch error occurs:

ValueError: Buffer dtype mismatch, expected 'ADJ_t' but got 'short'

(the same for vg.advanced_local_clustering())

lenas95 commented 2 years ago

Hi @pucicu ,

could you please elaborate on which data you used and how your code snippet looks like, so that we can reproduce it?

Greetings

pucicu commented 2 years ago

The analysis here https://zenodo.org/record/5578298 worked in the past, but now pyunicorn now throws the error ValueError: Buffer dtype mismatch, expected 'ADJ_t' but got ‘short’

Simply use the data and analysis from this repository and let me know if it works for you.

Thanks and best wishes Norbert

Am 24.06.2022 um 13:36 schrieb lschmidt @.***>:

Hi @pucicu https://github.com/pucicu ,

could you please elaborate on which data you used and how your code snippet looks like, so that we can reproduce it?

Greetings

— Reply to this email directly, view it on GitHub https://github.com/pik-copan/pyunicorn/issues/163#issuecomment-1165487732, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHNNEGPBTXJN2N4VKD24BTVQWMTPANCNFSM5WB77YAQ. You are receiving this because you were mentioned.

fkuehlein commented 1 year ago

Update:

Problem appears to be within cython modules

error message:

src/pyunicorn/timeseries/visibility_graph.py in retarded_local_clustering(self)
    254         norm = retarded_degree * (retarded_degree - 1) / 2.
    255 
--> 256         _retarded_local_clustering(N, A, norm, retarded_clustering)
    257         return retarded_clustering
    258 

src/pyunicorn/timeseries/_ext/numerics.pyx in pyunicorn.timeseries._ext.numerics._retarded_local_clustering()

ValueError: Buffer dtype mismatch, expected 'ADJ_t' but got 'short'
fkuehlein commented 1 year ago

problem:

pyunicorn/timeseries/visibility_graph.py in self.retarded_local_clustering()

quick fix/workaround

pyunicorn/timeseries/visibility_graph.py in self.retarded_local_clustering()

thorough fix:

_[resolved] dtype of norm and retarded_clustering:_

initialize as np.zeros(self.N, dtype="float32") within VisibilityGraph (lines 219 and 244)

[not resolved] dtype of A:

pyunicorn/timeseries/visibility_graph.py in self.__init__()

pyunicorn/core/network.py in self.set_adjacency()

ntfrgl commented 1 year ago

Thank you for the detailed analysis, @fkuehlein! You have identified two oversights in 3dab5bf89d2e224fc319ddd64aeeecc480f27fba.

The easy thing first: The original types for the arguments norm and retarded_clustering were the deprecated FLOATTYPE=np.float, which should have been translated as DFIELD=np.float64 instead of FIELD=np.float32 in the Cython function signature. I have fixed that now for all such cases in 2b489da9e504dfbdb621848bb82388f510c046ca.

For the argument A, the appropriate solution is to downcast to ADJ_t at the call site, using the helper core._ext.types.to_cy() that exists for this purpose. I will leave that to you, and kindly ask you to add a test that triggers all possible scalar types for the adjacency matrix, in a similar fashion to test_generic.simple_instances(). Please look out for other places where a similar pattern might occur.


There are several layers of causation to this bug, which I document here in hopes that it will inform the resolution and mitigation of other similar bugs:

fkuehlein commented 1 year ago

Thanks for your help and the detailed additional info, @ntfrgl!

The original issue should be resolved as of 1651a2b. For now, that commit remains on my fork and I will create a pull request as soon as I have cleared at least the rest of VisibilityGraph of similar possible occurrences.

I have two questions remaining in order to proceed on this:

  1. Appropriate solution for these cases

    As I have understood, the minimal solution (which you have pursued so far) would be to generally pass arguments arg of a cython module _ext.numerics._mod(arg) with help of _ext.numerics._mod(to_cy(arg, ARGTYPE)) to make sure they are cast to the expected type ARGTYPE (as is done throughout core.Network).

    All remaining variables that might be affected by this within VisibilityGraph, however, are already explicitly set to the required dtype on initialization. Namely:

    • The empty placeholder for the adjacency matrix A, which is correctly initialized to be of dtype=MASK. I guess I keep that as it is.

    • VisibilityGraph.time_series and VisibilityGraph.timings, which are initialized to be of dtype=np.float32. Keep that, or put dtype=FIELD there instead?

  2. Implementation of a respective test

    I'm not sure what you mean by "trigger all possible scalar types for the adjacency matrix". self.adjacency is a property of the Network class. So maybe the idea is to minimally initialize all classes that inherit from Network and check what scalar types they return for self.adjacency. Does that make sense? Or else, could you elaborate on that?

ntfrgl commented 1 year ago

Short-term answer

  1. Solution:
    • Apart from mistakes like the ones above, the original commit introducing semantic type aliases followed a manual, defensive approach. The goal was to use typed initialisation where it is apparently valid (i.e., the variable is initialised within the current scope and no type-altering methods are applied before the Cython call), and to type cast directly at the call site otherwise.
    • This means that most of the work should already be done, but it is expected that a few mistakes not triggered in the current test suite need to be fixed, and that some unnecessary copy/cast operations might be avoidable with a more precise type inspection.
    • It would be preferable to replace the remaining non-alias types with the semantic aliases where possible.
  2. Testing:
    • In the current bug's case, the problematic type mutation occurs depending on the node count, which means that in order to guard against future instances of similar bugs, the test for this method should be parametrised with several different network sizes to ensure that all possible cases of Cython calls are covered.
    • The reliability and reusability of the test suite would be improved if this was done using parametric test fixtures, which exist for such scenarios.
    • And btw, if in the process of looking for other similar situations you find it helpful to search for ADJ_t, then the semantic type aliases are fulfilling part of their purpose.

Long-term answer (worth discussing and creating a new issue for)

  1. Solution:
    • The above is really an incomplete substitute for a proper type analysis, which has traditionally not been available in the dynamically typed Python language, and is held together by non-mechanised reasoning and incomplete testing. Given recent developments in the Python language and its library ecosystem, a much more reliable and efficient approach would be to use the new NumPy API type annotations. These could be introduced incrementally throughout the codebase, and would allow static type checking of the numerical types. I expect that this will be straightforward for the most part with few exceptions, that it might expose some well hidden bugs, and that it would make the library a lot more sustainable.
  2. Testing:
    • In addition to the above extensions of the unit tests, Mypy would execute the static type checking in combination with the plugin for NumPy API types.
fkuehlein commented 1 year ago

Thanks a lot for your help and elaborations.

I fixed the above case accordingly in VisibilityGraph. I also looked through all of pyunicorn.timeseries and found no more conflicting types of the adjacency matrix, they're all safely cast to ADJ already.

I will therefore make a pull request now so the original issue can be fixed.

Then I will look into creating an appropriate test for these cases. Thanks for the hint to the existence of parametric test fixtures! Maybe we can briefly discuss your 'long-term anser' next week with @jdonges' help.

fkuehlein commented 1 year ago

Ok, actually found another dtype=np.uint8 in timeseries._ext.numerics._visibility(). The function in question is only used within test_timeseries.testVisibility() though, which is currently commented out.

Fixed both the test and the cython function with 91836d4 for a start, on a new branch. Still, the test only covers a specific testing method, so test coverage is lacking (let alone parametrisation). I will therefore further look into enhancements.

Putting this here just for documentation.

ntfrgl commented 1 year ago

Let me add to the critical observations in your latest commit message, since it doesn't have an associated issue yet:

ntfrgl commented 1 year ago

Resolved by #183.