GeoscienceAustralia / eqrm

Automatically exported from code.google.com/p/eqrm

join_parallel_files_column throws exception when fatality run_type run in parallel #5

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Find a run_type="fatality" simulation (e.g. TS_fat01)
2. Run in parallel using mpirun
3. Error thrown in join_parallel_files_column

What is the expected output? What do you see instead?

Exception seen:

Traceback (most recent call last):
  File "implementation_tests/scenarios/TS_fat01.py", line 72, in <module>
    main(locals())
  File "/nas/gemd/georisk_models/earthquake/sandpits/ben/eqrm/trunk/eqrm_code/analysis.py", line 713, in main
    compress=eqrm_flags.compress_output)
  File "/nas/gemd/georisk_models/earthquake/sandpits/ben/eqrm/trunk/eqrm_code/output_manager.py", line 1223, in join_parallel_files_column
    f=my_open(base_name,'w')
TypeError: coercing to Unicode: need string or buffer, list found

Please use labels and text to provide additional information.

Caused by the logic in analysis.py at line 700:

        files = save_distances(eqrm_flags, sites=all_sites,
                               event_set=event_set,
                               compress=eqrm_flags.compress_output,
                               parallel_tag=parallel.file_tag)
        column_files_that_parallel_splits.append(files)

files is itself a list, so append() adds it as a single nested element instead of adding each file name, e.g.

column_files_that_parallel_splits == 
['./implementation_tests/current/TS_fat01/java_fatalities.txt', 
['./implementation_tests/current/TS_fat01/java_distance_rjb.txt', 
'./implementation_tests/current/TS_fat01/java_distance_rup.txt']]

If the list is extended instead, the problem goes away, e.g.

column_files_that_parallel_splits.extend(files)
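
The difference between the two calls can be checked in isolation. This is a minimal sketch, not EQRM code: the paths are shortened stand-ins for the real output files and all_strings is a hypothetical helper used only for the check.

```python
# Stand-in file names (shortened, hypothetical) for the real EQRM outputs.
fatalities_file = "TS_fat01/java_fatalities.txt"
distance_files = [
    "TS_fat01/java_distance_rjb.txt",
    "TS_fat01/java_distance_rup.txt",
]

# append() adds the whole list as one element, giving a nested list.
appended = [fatalities_file]
appended.append(distance_files)

# extend() adds each file name individually, keeping the list flat.
extended = [fatalities_file]
extended.extend(distance_files)


def all_strings(names):
    """Hypothetical check: True if every entry is a plain file name."""
    return all(isinstance(name, str) for name in names)


print(all_strings(appended))  # nested list entry present
print(all_strings(extended))  # flat list of file names
```

Any code that later opens each entry as a file name, as join_parallel_files_column does in the traceback above, only works on the flat list.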

Original issue reported on code.google.com by b...@girorosso.com on 13 Feb 2012 at 10:43

GoogleCodeExporter commented 9 years ago
It looks like there is also an issue with generating the fatalities file in
parallel. If I comment out the code that causes this particular issue
(the distance files), run TS_fat01.py on 4 nodes, and compare the fatalities
output file with the standard, the two differ.

Investigating.

Original comment by b...@girorosso.com on 17 Feb 2012 at 3:11

GoogleCodeExporter commented 9 years ago
The difference in fatalities output described above is due to the change in
random generation of events that running in parallel brings. If
atten_variability_method is set to something other than 2 (random sampling),
the results are the same whether run in parallel or not.
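
As a rough illustration of how splitting work can change randomly sampled results, here is a minimal sketch. It assumes each process seeds its own generator and handles a contiguous block of sites, which is not necessarily how EQRM seeds or partitions its work. SEED, N_SITES, N_RANKS and both functions are hypothetical.

```python
import random

SEED = 42     # hypothetical seed
N_SITES = 8   # hypothetical number of sites
N_RANKS = 4   # hypothetical number of parallel processes


def serial_draws(n_sites, seed):
    """One generator draws a sample for every site in order."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_sites)]


def parallel_draws(n_sites, n_ranks, seed):
    """Each simulated rank seeds its own generator and handles one block.

    Assumption for illustration only: every rank uses the same seed.
    """
    block = n_sites // n_ranks
    draws = []
    for _rank in range(n_ranks):
        rng = random.Random(seed)
        draws.extend(rng.random() for _ in range(block))
    return draws


serial = serial_draws(N_SITES, SEED)
parallel = parallel_draws(N_SITES, N_RANKS, SEED)

# The first block matches, but later sites receive different samples.
print(serial[:2] == parallel[:2])
print(serial == parallel)
```

With a deterministic method there are no draws to reorder, which is consistent with the results matching when random sampling is switched off.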

The fix as described in the issue description will be implemented.

Original comment by b...@girorosso.com on 17 Feb 2012 at 5:26

GoogleCodeExporter commented 9 years ago
Resolved in revision 946

Original comment by b...@girorosso.com on 17 Feb 2012 at 5:28