Closed: aclarkData closed this issue 4 years ago
This probably doesn't fix the issue, but I don't believe this is running 0.4.21 (looking at your economyconfig.py, it still imports append_configs instead of Experiment).
@matttyb80 Were you on the cadCAD_upgrade branch? It looks like I'm using experiment? https://gitlab.com/grassrootseconomics/cic-modeling/-/blob/cadCAD_upgrade/Simulation/model/economyconfig.py
I see it now. A stray click switches you to the master branch in GitLab, I see.
I did run it in the new branch and am getting unique results.
That is so weird. I just ran `pip freeze | grep cadCAD` and got `cadCAD==0.4.21`.
And I'm still getting the same results. I wonder why it isn't working on my machine?
@aclarkData @matttyb80 Python versions?
Python 3.8.3
@aclarkData Linux @matttyb80 Windows
I'm seeing variability within Monte Carlo runs when using parameter sweeps. When I'm NOT using parameter sweeps, I'm not seeing the variability.
I'm having the same issue when I run the CIC notebook
MacOS 10.14.6
Python 3.7.5
cadCAD 0.4.21
This comment is probably related (notebook)
I recreated the issue here I think, with minimal code, and printed some extra debug details too: https://gist.github.com/BenSchZA/72b1b0a529703e97cd4e53f200e1516b
macOS 10.15.6 Python 3.8.5 cadCAD 0.4.21
@aclarkData @matttyb80 @markusbkoch @BenSchZA
Solved (inversely) on my end. @aclarkData I need creds to push the results; msg me.
Comment out the plotting functions in `simulation/model/parts/initialization.py`:

```python
# pos = nx.spring_layout(network, pos=nx.get_node_attributes(
#     network, 'pos'), fixed=nx.get_node_attributes(network, 'pos'), seed=10)
# nx.draw(network, node_color=color_map,
#         pos=pos, with_labels=True, alpha=0.7)
# plt.savefig('images/graph.png')
# plt.figure(figsize=(20, 20))
# plt.show()
```
After isolating the problem in the notebook, I initially commented this block out while working on it in `Simulation/issue_186.py`, without viewing plots or using the notebook. That's why I didn't see the erroneous results at first. Un-commenting the block brought the issue back, so that block of code is the culprit (confirmed with colleagues). I suspected a mutation and essentially guessed the solution to a problem I couldn't see.
These are the results before I un-commented:

```
Total execution time: 52.77s
Run one:
1308    1.995921
1309    1.995921
1310    1.995921
1311    1.995921
1312    0.663139
Name: VelocityOfMoney, dtype: float64
Run two:
1409    0.591588
1410    0.591588
1411    0.591588
1412    0.591588
1413    1.390478
Name: VelocityOfMoney, dtype: float64
Run three:
1510    1.084336
1511    1.084336
1512    1.084336
1513    1.084336
1514    3.002238
Name: VelocityOfMoney, dtype: float64
```
Waiting for creds so I can push.
@aclarkData Why is this causing a mutation?
> I recreated the issue here I think, with minimal code, and printed some extra debug details too: https://gist.github.com/BenSchZA/72b1b0a529703e97cd4e53f200e1516b
> macOS 10.15.6 Python 3.8.5 cadCAD 0.4.21
@JEJodesty this minimal example has the same issue (as far as I can see), and doesn't seem to have anything out of the ordinary that could be causing mutations.
@BenSchZA @aclarkData @matttyb80 @markusbkoch Second scenario solved.
@markusbkoch @BenSchZA It was multiprocessing. I removed it and kept multi-threading, and it worked; multi-threading is the minimal need.
In `cadCAD/engine/execution.py`:

```python
# pp = PPool()
# results = flatten(list(pp.map(lambda params: threaded_executor(params), new_params)))
results = flatten(list(map(lambda params: threaded_executor(params), new_params)))
# pp.close()
# pp.join()
# pp.clear()
# pp.restart()
```
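For context, here is a minimal sketch (not cadCAD's actual executor) of why a thread-based map avoids the duplicated results: all threads live in one process and share the parent's global NumPy RNG, so every draw advances a single stream instead of each worker replaying the same state.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

np.random.seed(42)  # one global RNG, shared by all threads

def run_simulation(_):
    # every thread draws from the same shared stream, so successive
    # draws differ (the legacy RandomState serializes access internally)
    return np.random.random()

with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(run_simulation, range(3)))

print(results)  # three distinct values
```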
```
Total execution time: 0.03s
Run one:
+----+----------+--------------+----------+-------+-----------+------------+
|    | a        | simulation   | subset   | run   | substep   | timestep   |
|----+----------+--------------+----------+-------+-----------+------------|
|  0 | 0        | 0            | 0        | 1     | 0         | 0          |
|  1 | 5.52521  | 0            | 0        | 1     | 1         | 1          |
|  2 | 5.68298  | 0            | 0        | 1     | 1         | 2          |
|  3 | 12.7897  | 0            | 0        | 1     | 1         | 3          |
+----+----------+--------------+----------+-------+-----------+------------+
Run two:
+----+----------+--------------+----------+-------+-----------+------------+
|    | a        | simulation   | subset   | run   | substep   | timestep   |
|----+----------+--------------+----------+-------+-----------+------------|
|  4 | 0        | 0            | 0        | 2     | 0         | 0          |
|  5 | 0.639853 | 0            | 0        | 2     | 1         | 1          |
|  6 | 1.80996  | 0            | 0        | 2     | 1         | 2          |
|  7 | 6.69687  | 0            | 0        | 2     | 1         | 3          |
+----+----------+--------------+----------+-------+-----------+------------+
Run three:
+----+----------+--------------+----------+-------+-----------+------------+
|    | a        | simulation   | subset   | run   | substep   | timestep   |
|----+----------+--------------+----------+-------+-----------+------------|
|  8 | 0        | 0            | 0        | 3     | 0         | 0          |
|  9 | 6.39755  | 0            | 0        | 3     | 1         | 1          |
| 10 | 12.9613  | 0            | 0        | 3     | 1         | 2          |
| 11 | 17.5493  | 0            | 0        | 3     | 1         | 3          |
+----+----------+--------------+----------+-------+-----------+------------+
```
@BenSchZA @aclarkData @markusbkoch @matttyb80 There will be a branch for beta tests in the future
@BenSchZA You're on the team.
Thanks @JEJodesty - can confirm that solves it: https://gist.github.com/BenSchZA/c8aef315c2c25e347e2d0cd6fe489eed
Yeah, I was gonna say randomness is tricky, all this under the hood, environment-dependent stuff leaves too much room for situations like this.
I'd argue that the only definitive way around this might be to explicitly seed the random number generator of each run with different seeds. See example implementation in this fork from Ben's gist: https://gist.github.com/markusbkoch/135bf69e6361d3a3f84e96b6be6df971
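A minimal sketch of that per-run seeding idea (hypothetical policy function, not the actual code in Markus's fork of the gist; it assumes the cadCAD state dict exposes the current `run` number, as cadCAD 0.4.x does):

```python
import numpy as np

def p_stochastic(params, substep, state_history, previous_state):
    # hypothetical policy: derive a deterministic, run-specific seed so
    # each Monte Carlo run gets its own reproducible random stream
    rng = np.random.RandomState(seed=previous_state['run'])
    return {'delta': rng.normal()}

# run 1 and run 2 see different streams; re-running run 1 reproduces it
d1 = p_stochastic({}, 1, [], {'run': 1})['delta']
d2 = p_stochastic({}, 1, [], {'run': 2})['delta']
d1_again = p_stochastic({}, 1, [], {'run': 1})['delta']
```

In practice the seed would also fold in the timestep or substep so a run doesn't draw the same value at every step, but the run-specific component is what restores cross-run variability.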
> Yeah, I was gonna say randomness is tricky, all this under the hood, environment-dependent stuff leaves too much room for situations like this.
> I'd argue that the only definitive way around this might be to explicitly seed the random number generator of each run with different seeds. See example implementation in this fork from Ben's gist: https://gist.github.com/markusbkoch/135bf69e6361d3a3f84e96b6be6df971
Spot on, similar threads for same issue: https://stackoverflow.com/questions/12915177/same-output-in-different-workers-in-multiprocessing
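The failure mode from that thread can be reproduced outside cadCAD. A minimal sketch (assuming a platform with the `fork` start method, e.g. Linux): forked workers inherit the parent's global NumPy RNG state, so their first draws are identical, while `RandomState()` reseeds from OS entropy inside each worker and restores independence.

```python
import multiprocessing as mp

import numpy as np

def draw_global(q):
    # a forked child inherits the parent's global RNG state, so each
    # worker's first draw replays the same value
    q.put(np.random.random())

def draw_fresh(q):
    # RandomState() seeds itself from OS entropy at construction,
    # giving each worker an independent stream
    q.put(np.random.RandomState().random())

ctx = mp.get_context("fork")  # assumes fork is available (not Windows)
np.random.seed(42)            # parent seeds the global RNG once

q = ctx.Queue()
workers = [ctx.Process(target=draw_global, args=(q,)) for _ in range(3)]
for w in workers: w.start()
for w in workers: w.join()
same = [q.get() for _ in range(3)]

q2 = ctx.Queue()
workers = [ctx.Process(target=draw_fresh, args=(q2,)) for _ in range(3)]
for w in workers: w.start()
for w in workers: w.join()
fresh = [q2.get() for _ in range(3)]

print(len(set(same)), len(set(fresh)))  # 1 3
```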
@BenSchZA @aclarkData @markusbkoch @matttyb80 hot-fixing single param scenario today and publishing 0.4.22
@aclarkData Your "plotting" issue is still a concern on the user side; I have to prioritize issues with cadCAD itself at the moment.
@aclarkData @matttyb80 It's strange that this worked on Windows. I will be getting Windows as well as Linux.
The solution is in this branch, pending merge to master by end of day: https://github.com/cadCAD-org/cadCAD/tree/issue_186
@JEJodesty Please let me know when it has been merged. Thanks
Hot Fixed https://pypi.org/project/cadCAD/0.4.22/
Note: generating random numbers using

```python
from numpy import random
random.RandomState().random()
```

is enough to mitigate the issue.
Ref: https://stackoverflow.com/questions/12915177/same-output-in-different-workers-in-multiprocessing
As can be seen in the simulation results in the notebook below, when performing Monte Carlo runs with cadCAD 0.4.21 and subsetting the results by the `run` column, there is no difference between the runs, even when stochastic processes are present. I believe this issue has been present since cadCAD 0.4.17. Please help.
https://gitlab.com/grassrootseconomics/cic-modeling/-/blob/cadCAD_upgrade/Simulation/CIC_Network_cadCAD_model.ipynb
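As a quick diagnostic, the symptom can be confirmed by grouping the result frame by `run` and counting distinct trajectories (a hypothetical toy DataFrame below; the column names mirror cadCAD's output):

```python
import pandas as pd

# hypothetical result frame reproducing the bug's symptom: every run
# carries the exact same VelocityOfMoney trajectory
df = pd.DataFrame({
    'run':             [1, 1, 2, 2, 3, 3],
    'timestep':        [1, 2, 1, 2, 1, 2],
    'VelocityOfMoney': [1.0, 1.2, 1.0, 1.2, 1.0, 1.2],
})

# one tuple per run; with healthy stochastic processes, each run's
# trajectory should (almost surely) be unique
trajectories = df.groupby('run')['VelocityOfMoney'].apply(tuple)
print(trajectories.nunique())  # 1 -> all runs identical (the bug)
```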