netneurolab / conn2res

A reservoir computing toolbox for neuroscientists
https://github.com/netneurolab/conn2res
BSD 3-Clause "New" or "Revised" License

Abnormal phenomenon of reservoir state and 'balanced_accuracy_score' in tutorial.py #44

Open YuZe-01 opened 8 months ago

YuZe-01 commented 8 months ago

Hi,

When I ran tutorial.py with the 'PerceptualDecisionMaking' task, the output reservoir state was quite strange: it seems the value of each node doesn't change.

Meanwhile, the performance measured with 'balanced_accuracy_score' also differed from the paper, with a drop of about 10% or more.

The only changes I made were setting 'adjusted' in 'balanced_accuracy_score' to True and changing the file paths of 'connectivity.npy', 'cortical.npy' and 'rsn_mapping.npy' to 'examples\data\human'; the files were downloaded following the guidance of the closed issue 'Where can I get the three files connectivity.npy, cortical.npy, rsn_mapping.npy?'
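
For context on that first change, here is a minimal sklearn sketch with made-up labels (not from the tutorial) of what the adjusted flag does: it rescales the score so that chance level maps to 0 instead of 0.5, so the same predictions give a lower number.

from sklearn.metrics import balanced_accuracy_score

# toy labels just to illustrate the effect of the 'adjusted' flag
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

print(balanced_accuracy_score(y_true, y_pred))                 # 0.75
print(balanced_accuracy_score(y_true, y_pred, adjusted=True))  # 0.50 (chance rescaled to 0)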

bachnguyenTE commented 8 months ago

I'll do a verification run to test, but on my current version of the code I haven't seen this issue. Can you let me know which version of the code you are running, and have you checked that the connectivity data is correctly formatted and passed to the right function?

YuZe-01 commented 8 months ago

I downloaded the code based on the README file last week, so I think it is the newest version.

git clone https://github.com/netneurolab/conn2res.git
cd conn2res
pip install .
cd ..
git clone -b v0.0.1 https://github.com/neurogym/neurogym.git
cd neurogym
pip install -e .

The connectivity data is a very sparse, symmetric matrix. I downloaded data.zip from https://zenodo.org/record/4776453#.Yd9AuS_72N8 and selected:

connectivity.npy ---> connectivity/individual/human_500.npy
cortical.npy ---> cortical/cortical_human_500.npy
rsn_mapping.npy ---> rsn_mapping/rsn_human_500.npy

from the data.zip file and put them under the path:

examples/data/human/connectivity.npy

Apart from that, I didn't change anything else in tutorial.py:

# load connectivity data of one subject
conn = Conn(subj_id=0)

# scale connectivity weights between [0,1] and normalize by its spectral
# radius
conn.scale_and_normalize()

# instantiate an Echo State Network object
esn = EchoStateNetwork(w=conn.w, activation_function=activation)
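
As I understand it from the comment above, the scaling/normalization step is roughly equivalent to the following standalone numpy sketch (a toy matrix stands in for connectivity.npy; this is not the conn2res implementation itself):

import numpy as np

# toy nonnegative, symmetric matrix standing in for connectivity.npy
rng = np.random.default_rng(0)
w = np.triu(rng.random((100, 100)), 1)
w = w + w.T

# scale weights to [0, 1]
w = (w - w.min()) / (w.max() - w.min())

# normalize by the spectral radius (largest absolute eigenvalue)
w = w / np.max(np.abs(np.linalg.eigvals(w)))

# multiplying by alpha now sets the effective spectral radius to alpha
alpha = 0.95
print(np.max(np.abs(np.linalg.eigvals(alpha * w))))  # ~0.95
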
bachnguyenTE commented 8 months ago

Thank you for clarifying. I ran the same simulation on my local machine and got the same results as you. I tried plotting with finer-grained alpha values, and it turns out that different alpha values give different reservoir dynamics: the dynamics you got are from alpha=1, which should supposedly sit at the edge of chaos, but slight floating-point imprecision may have pushed the effective value slightly above 1.0 and caused the breakdown in reservoir dynamics. For your reference, I attached the images for alpha=0.95 and alpha=1.0 so you can see the clear transition in dynamics:

[Attached: reservoir states for alpha=0.95]

[Attached: reservoir states for alpha=1.0]

For your information, you can refer to the ESN's "Echo-state Property" for an explanation of the reservoir's chaotic dynamics. Let me know if you have any questions!
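
If it helps, here is a small standalone illustration of that property (plain numpy, not the conn2res API): with the same input, two different initial states should converge when the effective spectral radius is below 1, and may fail to do so above 1 (the exact transition depends on the input and weights).

import numpy as np

rng = np.random.default_rng(42)
n = 100
w = rng.standard_normal((n, n))
w /= np.max(np.abs(np.linalg.eigvals(w)))   # spectral radius = 1
w_in = rng.standard_normal(n)
u = rng.standard_normal(500)                # arbitrary 1-D input signal

def run(alpha, x0):
    # plain tanh reservoir update, no leak term
    x, states = x0.copy(), []
    for t in range(len(u)):
        x = np.tanh(alpha * (w @ x) + w_in * u[t])
        states.append(x)
    return np.asarray(states)

for alpha in (0.95, 1.05):
    s1 = run(alpha, rng.standard_normal(n))
    s2 = run(alpha, rng.standard_normal(n))
    # with the echo-state property, the effect of the initial state washes out
    print(alpha, np.linalg.norm(s1[-1] - s2[-1]))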

YuZe-01 commented 8 months ago

Thanks for your guidance. I also replicated your result. There are some differences, but maybe that is because our tasks are different: your task is 'PerceptualDecisionMakingDelayResponse' and my task is 'PerceptualDecisionMaking'.

[Attached: reservoir states from my replication]

But I am still confused that the reservoir states and the final performance (balanced_accuracy_score, tanh: 75%, sigmoid: 65%) are so far from the paper (balanced_accuracy_score, tanh: 85%, sigmoid: 80%). I believe the final performance (F1 score or balanced_accuracy_score) shouldn't be influenced by alpha. But for this two-alternative forced choice task, 60% or 70% is relatively low and very close to 50%.

[Attached: performance scores for tanh and sigmoid]

Meanwhile, I also found that in tutorial.ipynb the reservoir states are drawn with alpha=1.

[Attached: reservoir states at alpha=1 from tutorial.ipynb]

Do you think this is caused by a difference in the source data (connectivity.npy)?

Deskt0r commented 8 months ago

Hello, I have recently cloned the repository too and took the same steps as @YuZe-01, except that I installed conn2res in develop mode with pip install -e ..

For me, the behavior initially described by @YuZe-01 also happens in tutorial.ipynb; I have not tested tutorial.py yet. For alpha=1 I get: [screenshot of reservoir states at alpha=1]

When I run it with alpha=0.95 I get: [screenshot of reservoir states at alpha=0.95]

In my case, the scores are similar to the ones in the repository, sometimes almost identical, but perhaps that is normal. For alpha=0.95, for example: [screenshot of scores at alpha=0.95]

@bachnguyenTE Have you used a seed? If so, could you tell us which?

bachnguyenTE commented 8 months ago

I would assume that what you have seen in the paper is only a sample run of the repository. I have experimented extensively with the models in the repository, and I can say that the model is highly unstable. Unless you have a lot of control over your RNGs or you have air-tight data sampling, it is hard to get reproducible results.

@YuZe-01, the nature of echo-state models, and reservoir models in general, is that they depend almost entirely on the reservoir. In particular, you have to maintain the "echo-state property": you may have seen that from alpha=1 onwards the dynamics of the reservoir collapse, meaning you won't be able to extract any meaningful information from those "flat-line" dynamics. It has been shown theoretically and experimentally that any alpha>1 is going to result in models with sub-par performance. If your performance is not affected, you might have a problem with the input or readout layer of the reservoir, so make sure to double-check the reservoir dynamics and those layers.
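
As a quick sanity check (the helper below is hypothetical, not part of conn2res), you can test for collapsed dynamics by looking at the per-node variance of the simulated states before fitting the readout:

import numpy as np

def looks_flat(states, tol=1e-6):
    # states: array of shape (timesteps, nodes) from the reservoir simulation;
    # if every node's variance over time is near zero, the dynamics have
    # collapsed and the readout has nothing to learn from
    return bool(np.all(np.var(states, axis=0) < tol))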

@Deskt0r, the graph that you got shows what I mentioned, so double-check your readout and reservoir dynamics as well. I didn't use any seed to reproduce, and my performance is also a bit different from what you see in the paper. They might have used a different reservoir to produce those results, or it might purely be a difference in hardware configuration or in the parameters of the readout ridge-regression layer. I attached an image of over 3200 runs to show how much variance you'd get. The variance of the ridge readout is generally huge, so if any of you have improvements in mind, I'd love to hear them!

[Attached: performance distribution over 3200 runs]

estefanysuarez commented 8 months ago

Hey guys,

Thanks @bachnguyenTE for your prompt reply to this issue!

I'm not sure whether the differences you are observing might be due to the data. The data that I used for the figures in the paper are specifically the ones in this repo on Zenodo: https://zenodo.org/records/10205004

Maybe try with this data, and see how it goes. Download the folder in the Zenodo repo, and place it inside the folder of the code repo.

In any case, as @bachnguyenTE mentioned, reservoir dynamics tend to vary a lot, especially around alpha~1.0. It is normal and expected that reservoir dynamics vary for different values of alpha. In reservoir computing it is always good practice to run simulations several times to get a distribution of values rather than a single value.
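
As a rough sketch of that practice (with random toy data standing in for the actual reservoir states and task labels, and a ridge classifier as the readout), you can collect a distribution of scores over repeated runs:

import numpy as np
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# toy stand-ins for reservoir states (time x nodes) and two-choice labels
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 100))
y = (X[:, :10].sum(axis=1) > 0).astype(int)

scores = []
for seed in range(100):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    clf = RidgeClassifier(alpha=1.0).fit(X_tr, y_tr)
    scores.append(balanced_accuracy_score(y_te, clf.predict(X_te)))

print(f"mean={np.mean(scores):.3f}, std={np.std(scores):.3f}")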

@YuZe-01 and @Deskt0r, I would suggest trying this new data and making sure that the arguments of all functions are exactly the same as in the tutorial!

Good luck :)

YuZe-01 commented 7 months ago

@estefanysuarez Sorry for replying so late. I have downloaded the new data from the link you gave.

I ran the 'PerceptualDecisionMaking' task 100 times, and the final performance is shown below.

[Attached: final performance over 100 runs]

One of the reservoir states is shown below with alpha=0.9. The reservoir state of each run is actually different, because the model is highly unstable, as @bachnguyenTE mentioned before.

[Attached: reservoir states for one run at alpha=0.9]

I also found something interesting. As far as I know, the connectivity matrix contains only positive numbers, and the 'conn.scale_and_normalize()' function won't turn them into negative numbers. In 'esn.simulate()', the update is simply

synap_input = np.dot(self._state[t-1, :], self.w) + np.dot(ext_input[t-1, :], w_in)

Since this task's input is always positive, the value passed into the activation function will never be negative, right? But 'tanh' maps negative inputs to negative values and positive inputs to positive values, so I think the reservoir state can never look like the picture shown in the paper, because there are negative numbers in that picture.

[Attached: reservoir state figure from the paper, showing negative values]
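
Here is a small standalone check of this reasoning with toy nonnegative weights and inputs (not the conn2res code itself):

import numpy as np

rng = np.random.default_rng(1)
n = 50
w = rng.random((n, n))          # nonnegative recurrent weights
w_in = rng.random(n)            # nonnegative input weights
u = rng.random(200)             # nonnegative input signal
x = np.zeros(n)

states = []
for t in range(len(u)):
    synap_input = x @ w + u[t] * w_in   # stays nonnegative
    x = np.tanh(synap_input)            # tanh of a nonnegative value is nonnegative
    states.append(x)

print(np.min(states))   # >= 0: the states never go negative
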
estefanysuarez commented 7 months ago

Hi @YuZe-01, I apologize for my late reply too! The reason there are negative values in the reservoir states of the figure in the paper is that I scaled the values between -1 and 1. I just noticed that the lines that scale the data are commented out in the code (see lines 376-378 in the plot_reservoir_states function of the plotting.py module). I'm not sure why they are commented out, but if you uncomment them, you'll see a plot similar to the one in the paper. This should be fixed by adding an argument to the function that lets the user choose whether to scale the data or not. Thanks for pointing this out!
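
For reference, the rescaling I mean is essentially a min-max mapping to [-1, 1] before plotting; a minimal sketch (not the exact lines from plotting.py) would be:

import numpy as np

def scale_states(states, feature_range=(-1, 1)):
    # linearly rescale reservoir states to the given range, for plotting only
    lo, hi = feature_range
    s_min, s_max = states.min(), states.max()
    return lo + (states - s_min) * (hi - lo) / (s_max - s_min)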