jalammar / ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
https://ecco.readthedocs.io
BSD 3-Clause "New" or "Revised" License

IndexError when using analysis.pwcca #75

Closed demkejon001 closed 1 year ago

demkejon001 commented 2 years ago

OS Info

Operating System: Ubuntu 18.04.6 LTS
Kernel: Linux 5.4.0-100-generic
Architecture: x86-64

Environment Info

Python 3.6
ecco==0.1.2

Replicate

import numpy as np
import ecco.analysis as analysis

a = np.random.rand(4, 200)  # 4 rows, 200 columns
b = np.random.rand(8, 200)  # 8 rows, 200 columns
analysis.pwcca(a, b)  # This works
analysis.pwcca(b, a)  # It fails here

Error:

~/Documents/TOMMAS/venv/lib/python3.6/site-packages/ecco/svcca_lib/pwcca.py in compute_pwcca(acts1, acts2, epsilon)
     47     else:
     48         dirns = np.dot(sresults["coef_y"],
---> 49                     (acts1[sresults["y_idxs"]] - \
     50                      sresults["neuron_means2"][sresults["y_idxs"]])) + sresults["neuron_means2"][sresults["y_idxs"]]
     51         coefs = sresults["cca_coef2"]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 8 but corresponding boolean dimension is 4

I believe ecco/svcca_lib/pwcca.py line 49 is supposed to be acts2 instead of acts1.
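
The indexing failure in the traceback can be reproduced with NumPy alone, without calling ecco. This is a minimal sketch: the names `acts1`, `acts2`, and `y_idxs` mirror the traceback, but the all-True mask and the `try_index` helper are assumptions made for illustration, not values or functions taken from the library.

```python
import numpy as np

# Shapes follow the failing call pwcca(b, a): acts1 has 8 rows, acts2 has 4.
acts1 = np.random.rand(8, 200)
acts2 = np.random.rand(4, 200)

# Assumed stand-in for sresults["y_idxs"]: a boolean mask with one entry
# per row of acts2 (all True here, purely for illustration).
y_idxs = np.ones(acts2.shape[0], dtype=bool)


def try_index(acts, mask):
    """Hypothetical helper: return the indexed shape, or the error text."""
    try:
        return acts[mask].shape
    except IndexError as exc:
        return "IndexError: {}".format(exc)


# A length-4 mask cannot index an array with 8 rows.
mismatched = try_index(acts1, y_idxs)
# The same mask indexes acts2 without error.
matched = try_index(acts2, y_idxs)

print(mismatched)
print(matched)
```

Indexing `acts2` with the same mask succeeds, which is consistent with the suggestion that line 49 should use `acts2`. The sketch only shows the shape mismatch, though; it does not confirm what the library intends.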

jalammar commented 2 years ago

Agreed it should probably show a clearer error, but this appears to be the behavior of the upstream pwcca code.
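
One way to get a clearer error is to validate the inputs before calling `analysis.pwcca`. The sketch below is a hypothetical guard, not part of ecco or the upstream code. Based only on the report above, it assumes the failure occurs when the first array has more rows than the second, and that both arrays must have the same number of columns.

```python
import numpy as np


def check_pwcca_inputs(acts1, acts2):
    """Hypothetical pre-check; raises ValueError with a readable message."""
    acts1 = np.asarray(acts1)
    acts2 = np.asarray(acts2)
    if acts1.ndim != 2 or acts2.ndim != 2:
        raise ValueError("both inputs must be 2-D arrays")
    if acts1.shape[1] != acts2.shape[1]:
        raise ValueError(
            "inputs must have the same number of columns, got {} and {}".format(
                acts1.shape[1], acts2.shape[1]
            )
        )
    # Assumption taken from the issue report: the call fails when the
    # first input has more rows than the second.
    if acts1.shape[0] > acts2.shape[0]:
        raise ValueError(
            "first input has {} rows but second has {}; "
            "pass the array with fewer rows first".format(
                acts1.shape[0], acts2.shape[0]
            )
        )
    return acts1, acts2


a = np.random.rand(4, 200)
b = np.random.rand(8, 200)

check_pwcca_inputs(a, b)  # passes silently

try:
    check_pwcca_inputs(b, a)
    guard_message = None
except ValueError as exc:
    guard_message = str(exc)

print(guard_message)
```

Swapping the arguments to satisfy the guard may change the computed value, so the guard reports the problem rather than reordering the inputs itself.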