INM-6 / multi-area-model

A large-scale spiking model of the vision-related areas of macaque cortex.

Issue running the analysis using a separate script from that used for simulation #24

Closed golosio closed 3 years ago

golosio commented 3 years ago

When running the analysis using a separate script from the one used for the simulation (as I think is a common approach when the simulation is done on a cluster), I encounter a problem. The `__init__()` method of the `Simulation` class can take, as the `sim_spec` input argument, the label of a previous simulation. As I understand it, these lines:

        if isinstance(sim_spec, dict):
            check_custom_params(sim_spec, self.params)
            self.custom_params = sim_spec
        else:
            fn = os.path.join(data_path,
                              sim_spec,
                              '_'.join(('custom_params',
                                        sim_spec)))
            with open(fn, 'r') as f:
                self.custom_params = json.load(f)['sim_params']

        nested_update(self.params, self.custom_params)

        self.network = network
        self.label = dicthash.generate_hash_from_dict({'params': self.params,
                                                       'network_label': self.network.label})

check whether `sim_spec` is a dict; otherwise it is assumed to be a string, which is used to build the path where the simulation data are stored. However, the last line, `self.label = dicthash.generate_hash_from_dict(...)`, generates a new label regardless of whether one was already specified by `sim_spec`. The following code seems to work for me:

        if isinstance(sim_spec, dict):
            check_custom_params(sim_spec, self.params)
            self.custom_params = sim_spec
            self.label = None
        else:
            fn = os.path.join(data_path,
                              sim_spec,
                              '_'.join(('custom_params',
                                        sim_spec)))
            with open(fn, 'r') as f:
                self.custom_params = json.load(f)['sim_params']
            self.label = sim_spec

        nested_update(self.params, self.custom_params)

        self.network = network
        if self.label is None:
            self.label = dicthash.generate_hash_from_dict({'params': self.params,
                                                           'network_label': self.network.label})

AlexVanMeegen commented 3 years ago

Hey! Not exactly sure if I understand the issue correctly. sim_spec either takes the label (the hash) or a dictionary with parameters. If you pass exactly the same dictionary (both for network_spec and sim_spec) the same hash should be generated. The label (hash) takes all parameters into account, thus they all have to be identical.

golosio commented 3 years ago

I understand your point; however, I do not know why (I am not familiar with how the hash generation algorithm works), but when I run it on a different machine, a different hash is generated. For my needs, the computing nodes of the cluster are suitable for the simulation but not for the analysis, so I must do the latter on a different machine.

golosio commented 3 years ago

I have some problem with indentation when quoting the code, I hope it is clear anyway...

AlexVanMeegen commented 3 years ago

No worries, I think I got the gist. Essentially: Loading the dict from the simulation and generating a hash does not yield the same hash as in the simulation, right?

AlexVanMeegen commented 3 years ago

Just to be sure: do you specify network_spec and sim_spec both using their respective hash for the analysis?

jarsi commented 3 years ago

You can enclose blocks of code using 3 ``` instead of only 1 `.

`generate_hash_from_dict` is defined in the dicthash package, which states that it "can be safely used across different platforms." The function acts as a wrapper around the md5 function of Python's built-in hashlib package. As I understand it, the md5 algorithm always yields the same output given the same input.
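The determinism of md5 is easy to check directly. A minimal illustration using only the standard library (not the dicthash package itself):

```python
import hashlib

# md5 is a pure function of its input bytes: the same input always
# yields the same digest, on any platform and any Python version.
digest_1 = hashlib.md5(b'{"master_seed": 0}').hexdigest()
digest_2 = hashlib.md5(b'{"master_seed": 0}').hexdigest()
assert digest_1 == digest_2

# Different input bytes yield a different digest.
digest_3 = hashlib.md5(b'{"master_seed": 1}').hexdigest()
assert digest_1 != digest_3
```

So if the regenerated hash differs between machines, the input bytes fed to md5 must differ, not the hashing itself.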

golosio commented 3 years ago

This is my script:

import os
import sys
from multiarea_model import MultiAreaModel
from io import StringIO

"""
Load data from specified simulation, compute all measures
available in Analysis, save data.
"""

M = MultiAreaModel(network_spec='da4e0764b4a3d0c8a3d3687dfa9c5ae4',
                   simulation=True, sim_spec='3f3963181c5bc45a3f6e2a2143d53f16',
                   analysis=True)

M.analysis.create_pop_rates(t_min=1000.)
M.analysis.create_pop_rate_dists(t_min=1000.)
M.analysis.create_synchrony(t_min=1000.)
M.analysis.create_rate_time_series(t_min=1000.)
M.analysis.create_synaptic_input(t_min=1000.)
M.analysis.create_pop_cv_isi(t_min=1000.)
M.analysis.create_pop_LvR(t_min=1000.)

M.analysis.save()
golosio commented 3 years ago

and this is the folder with simulation data:

$ ls -1a ../data/3f3963181c5bc45a3f6e2a2143d53f16
.
..
Analysis
custom_Data_Model_da4e0764b4a3d0c8a3d3687dfa9c5ae4.json
custom_params_3f3963181c5bc45a3f6e2a2143d53f16
da4e0764b4a3d0c8a3d3687dfa9c5ae4_config
default_params.py
job_script_3f3963181c5bc45a3f6e2a2143d53f16.sh
Model.py
multiarea_model.py
recordings
run_simulation.py
simulation.py
VisualCortex_Data.py

As you can see, the labels seem to be correct. However, if I run the unmodified code, it generates a different hash and I get a file-not-found error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/golosio/multi-area-model/data/e1f8a13b629acf19a4a0109ec76c5c2e/recordings/network_gids.txt'

If I use the code modified as I wrote above, the script seems to work properly.

golosio commented 3 years ago

Thank you @jarsi, I edited the previous comment using 3 "```" for the code block

AlexVanMeegen commented 3 years ago

Thanks, your code looks good to me. Since this check passes, at least the network labels seem to agree. Can you confirm this?

golosio commented 3 years ago

> the network labels seem to agree. Can you confirm this?

Yes, I confirm.

golosio commented 3 years ago

In any case, I have to say that I really do not see the reason to regenerate a label (even if it should be the same) when it is provided as an argument.

golosio commented 3 years ago

Could it be that when the params are saved to and loaded from file there is some difference, e.g. due to numerical precision or to the order in which they are stored in a dictionary? Or maybe they are represented differently in different Python versions?
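Both effects could in principle change a naive hash. A small sketch of how key order would matter if the serialization were not canonicalized (dicthash may well guard against this internally, so this is only an illustration, not a claim about the library):

```python
import hashlib
import json

# Two dicts with identical content but different insertion order.
params_a = {'t_sim': 10000.0, 'master_seed': 0}
params_b = {'master_seed': 0, 't_sim': 10000.0}

# Without canonicalization, the serialized strings (and hashes) differ,
# because json.dumps preserves dict insertion order.
raw_a = json.dumps(params_a)
raw_b = json.dumps(params_b)
assert raw_a != raw_b

# Sorting the keys canonicalizes the representation and restores equality.
canon_a = json.dumps(params_a, sort_keys=True)
canon_b = json.dumps(params_b, sort_keys=True)
assert (hashlib.md5(canon_a.encode()).hexdigest()
        == hashlib.md5(canon_b.encode()).hexdigest())
```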

AlexVanMeegen commented 3 years ago

I think the reason why @mschmidt87 chose to regenerate the hash is to ensure that one uses consistent parameters. In my opinion, this is a sensible thing to do (and it was actually quite useful for me sometimes). If you disagree, I think your workaround is perfectly fine and does the job.

What bugs me is that the procedure works for the network label, which contains a lot of information (neuron params, connectivity, and a lot more), but not for the simulation label, which is actually quite simple (essentially just this, this, and this small dict). I see a lot more reasons why the network label could fail, i.e. many of the reasons that you listed, but if this works the simulation label should work as well. Since the network label is also regenerated, I currently tend to believe that there is not an issue with the hash generation library.

Nonetheless, if there is a bug it would be great to get rid of it - although I fully understand if you just opt for your workaround. Based on the assumption that the hash generation works fine, the only place I see where something could go wrong is in the update of the default params. Since the custom_params are loaded, they are the same and the only remaining candidate are the default params. Is it possible that there is a difference in default_params.py between the 'simulation machine' and the 'analysis machine'?

golosio commented 3 years ago

@AlexVanMeegen I agree with the motivation, i.e. ensuring that one uses consistent parameters, as you say. However, in the current implementation the hash is simply regenerated, with no consistency check. The error comes later and is a file-not-found error, which is not very helpful for letting the user understand that it is due to an inconsistent configuration. Rather than using my workaround, a better solution would be to check consistency between the label provided as an argument and the regenerated hash, and if the check fails, raise a helpful error message. For instance:

        self.label = dicthash.generate_hash_from_dict({'params': self.params,
                                                       'network_label': self.network.label})
        if not isinstance(sim_spec, dict) and sim_spec != self.label:
            raise ValueError('The configuration that produced the given '
                             'simulation label does not match the current '
                             'configuration')

You are right, the problem is that when I create an instance of the MultiAreaModel using labels, as in my script,

M = MultiAreaModel(network_spec='da4e0764b4a3d0c8a3d3687dfa9c5ae4', simulation=True, sim_spec='3f3963181c5bc45a3f6e2a2143d53f16', analysis=True)

the simulation parameters are initialized to their default values by default_params.py, rather than being loaded from some file stored in the simulation folder specified by the simulation label. Therefore, if I am not wrong, when the sim_spec argument is a label and not a dictionary, the regenerated hash is always the one corresponding to the default simulation parameters.

AlexVanMeegen commented 3 years ago

Fully agree, the file not found error is not helpful at all. With your approval, I'll bake your suggestion into a PR (I think there are multiple places where something like this applies) and I'll use this issue as a reminder.

I am not sure I understand the implications of your second point. In principle,

  1. the non-default sim_params get dumped during the sim and are loaded based on the hash for the analysis, then
  2. the loaded non-default sim_params are used to override the default sim_params, and finally
  3. the hash is (re-)generated.

You wrote that you have a custom_params_3f3963181c5bc45a3f6e2a2143d53f16 file in the data directory. Does it contain the (non-default) sim_params that you specified for the simulation? Since you are getting the hash corresponding to the default sim_params, it seems that the problem is related to these 'custom' sim_params. I could imagine that at some point something went wrong with the 'custom' parameter update and the default parameters were then dumped, overriding the initially used 'custom' sim_params. If this is correct, you could fix it by passing, just once, the dict with the actually used sim_params instead of the hash in the analysis. This should dump the custom params and correct the corrupted file. Afterwards, using the hash should work again.
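The three steps above can be sketched roughly as follows. The `nested_update` helper and the file layout are simplified here, and the hash computation is illustrative rather than the actual dicthash algorithm:

```python
import hashlib
import json

def nested_update(base, custom):
    """Recursively override entries of `base` with entries of `custom`."""
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            nested_update(base[key], value)
        else:
            base[key] = value
    return base

# 1. Non-default sim_params, as dumped during the simulation and
#    reloaded for the analysis (hypothetical values).
loaded_custom = {'master_seed': 42}

# 2. Override the defaults with the loaded custom params.
default_sim_params = {'master_seed': 0, 't_sim': 10000.0}
params = nested_update(dict(default_sim_params), loaded_custom)

# 3. Regenerate the label from the merged parameters.
label = hashlib.md5(json.dumps(params, sort_keys=True).encode()).hexdigest()
```

If steps 1-3 see exactly the same inputs on both machines, the regenerated label must match the original one.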

golosio commented 3 years ago

I just found where my problem is: it is the master_seed parameter. Indeed, as you say, the other simulation parameters are correctly loaded based on the hash. Sorry, maybe the problem is that the way I changed the seed for random number generation is not the proper one. To run a bunch of simulations with different seeds, I changed the master seed directly in the file default_params.py through a bash script. Unfortunately, this value is not stored with the other parameters in the simulation parameter file, and in the analysis it is always set to its default value (0).
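This would explain the mismatch: only parameters that deviate from the defaults are recorded as custom params, so editing default_params.py itself leaves no trace of the change. A hypothetical illustration (the values are made up):

```python
# Proper route: deviate from the defaults via custom sim_params.
defaults = {'master_seed': 0, 't_sim': 10000.0}
custom = {'master_seed': 123}
merged = {**defaults, **custom}
assert merged['master_seed'] == 123  # and 123 is dumped with the custom params

# Improper route: edit the default directly and pass no custom params.
edited_defaults = {'master_seed': 123, 't_sim': 10000.0}
custom = {}
# Nothing deviates from the (edited) defaults, so nothing is dumped.
# The analysis machine, with an unedited default_params.py, reconstructs
# master_seed = 0 and therefore a different hash.
```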

golosio commented 3 years ago

Is there a proper way to change the seed for random number generation?

AlexVanMeegen commented 3 years ago

Great, this sounds like a good candidate for causing the issue!

One way to set different seeds would be to replace the end of the fullscale example with

for seed in my_seeds:
    sim_params['master_seed'] = seed
    M = MultiAreaModel(network_params, simulation=True,
                       sim_spec=sim_params)
    start_job(M.simulation.label, submit_cmd, jobscript_template)

This would submit a job for each seed.

To fix your existing simulations: maybe manually including the used seed as a master_seed in custom_params_simlabel could work.
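That repair could be scripted along the following lines. The file name pattern and the `sim_params` JSON layout are assumptions based on the directory listing above; back up the file before modifying it:

```python
import json

def record_master_seed(fn, seed):
    """Add the actually-used master_seed to a custom_params file.

    `fn` is the custom_params_<label> file inside the simulation's data
    directory (path layout assumed from the listing above); `seed` is the
    seed that was actually used for that simulation.
    """
    with open(fn, 'r') as f:
        custom = json.load(f)
    custom['sim_params']['master_seed'] = seed
    with open(fn, 'w') as f:
        json.dump(custom, f)
    return custom
```

After this, regenerating the hash from the patched custom params should reproduce the original simulation label.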

jarsi commented 3 years ago

Glad you found the error. In general, the values in the default_params file should not be changed. Rather, as @AlexVanMeegen also suggested, they should be changed via the custom sim_params and network_params dictionaries. We could issue a warning if the user changes something in default_params.py. For example, we could calculate the hashes of the network_params and sim_params extracted from default_params.py and compare them against fixed baseline values. If they do not agree, we know that the user has changed the parameters in the wrong place, and we can fail with a clear error message.
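Such a guard could look roughly like this. The baseline digest and the hashing scheme are placeholders, so this is only a sketch of the idea, not project code:

```python
import hashlib
import json

# Baseline digests computed once from a pristine default_params.py
# (placeholder values, to be filled in with the real digests).
BASELINE = {
    'sim_params': '<md5-of-pristine-sim-defaults>',
    'network_params': '<md5-of-pristine-network-defaults>',
}

def check_defaults_unchanged(name, params):
    """Fail loudly if the shipped defaults were edited in place."""
    digest = hashlib.md5(
        json.dumps(params, sort_keys=True).encode()).hexdigest()
    if digest != BASELINE[name]:
        raise RuntimeError(
            f'{name} in default_params.py was modified; please change '
            'parameters via the custom dictionaries instead.')
```

Run at import time, this would have turned the silent master_seed edit into an immediate, explicit error.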

golosio commented 3 years ago

Thank you @AlexVanMeegen and @jarsi! I'll follow your suggestions about the seed.

> With your approval, I'll bake your suggestion into a PR

Sure

AlexVanMeegen commented 3 years ago

@golosio, does this solve the issue or is there anything remaining?

Regarding the safeguarding of the code: what about using git to ensure that there are no changes? The group from Graz built something like this into their SimManager. Then it would not only be based on the hashes but also on the code.

golosio commented 3 years ago

Sure, the root of the problem was the change that I made to the seed in the default params file. Concerning the regeneration of the hash, I still think you should implement the check in a PR, but this issue can be closed. I always use git and I often combine it with bash to check for changes in the code. The problem is that I thought the change to the seed in the default params file was safe; I did not expect that it could create problems. P.S.: you should have received an email from me.