Closed: alaniwi closed this issue 1 year ago.
Matrix of combinations (tested via command line):
So the difference is in the horizontal position of the release rather than in its height coordinates.
(Although subject to an issue regarding segfaults, which I will comment on below.)
I note here that the first time I ran test2 above, it gave this segmentation fault:
```
[cwps@ceda-wps-staging no-plots]$ ./run_name.sh inp_test2
[ceda-wps-staging.ceda.ac.uk:16838] OPAL ERROR: Error in file pmix2x.c at line 326
[ceda-wps-staging.ceda.ac.uk:16838] OPAL ERROR: Error in file pmix2x.c at line 326
[ceda-wps-staging.ceda.ac.uk:16838] OPAL ERROR: Error in file pmix2x.c at line 326
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
nameiii_64bit_par  000000000086C2A1  tbk_trace_stack_i  Unknown     Unknown
nameiii_64bit_par  000000000086A3DB  tbk_string_stack_  Unknown     Unknown
nameiii_64bit_par  0000000000812844  Unknown            Unknown     Unknown
nameiii_64bit_par  0000000000812656  tbk_stack_trace    Unknown     Unknown
nameiii_64bit_par  00000000007A5709  for__issue_diagno  Unknown     Unknown
nameiii_64bit_par  00000000007ABAB6  for__signal_handl  Unknown     Unknown
libpthread-2.17.s  00002B5D9B2A9630  Unknown            Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FC473A7  pmix2x_value_unlo  Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FC4700F  pmix2x_event_hdlr  Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FC61198  pmix_invoke_local  Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FC66177  Unknown            Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FC659BA  Unknown            Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FCD36EC  pmix_ptl_base_pro  Unknown     Unknown
libopen-pal.so.40  00002B5D9DC07782  opal_libevent2022  Unknown     Unknown
mca_pmix_pmix2x.s  00002B5D9FCA5A22  Unknown            Unknown     Unknown
libpthread-2.17.s  00002B5D9B2A1EA5  Unknown            Unknown     Unknown
libc-2.17.so       00002B5D9B5B4B0D  clone              Unknown     Unknown
```
This could not be reproduced when re-running it a further 10 times (slightly more in fact, counting one or two interactive runs outside the retry loop).
The following comment, found by googling the error message, may describe an issue with a similar cause: https://github.com/open-mpi/ompi/issues/5336#issuecomment-400490216 . It refers to failure rates in the region of 1-3%, so I have probably not retried enough times to reproduce this intermittent failure. I will assume that it is independent of the problem this issue is about; it is something we ought to look at, but I will create a separate issue for it and otherwise ignore it here.
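For context on the retry numbers: if the failure rate really is in the 1-3% range, ten retries would usually miss it (at 2%, the chance of seeing no failure in 10 runs is 0.98^10, about 82%). Below is a minimal sketch of a retry loop that would give better statistics; the commented-out `./run_name.sh inp_test2` invocation is taken from the log above, while the function itself is generic.

```python
import subprocess
import sys

def count_failures(cmd, attempts):
    """Run cmd repeatedly and count the runs that exit non-zero."""
    failures = 0
    for _ in range(attempts):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failures += 1
    return failures

# For the NAME run itself, something like (needs ~100+ attempts to
# have a good chance of catching a 1-3% intermittent failure):
# failures = count_failures(["./run_name.sh", "inp_test2"], 100)
```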
This seems to be related to the longitude value. I tried a couple of start dates, and that made no difference: 45E is fine, but at 50E or 60E there is no output (50W was okay). Will have to ask Andrew.
It turns out that the hard-coded model domain is the issue here. We need the same fix as in https://github.com/cedadev/swallow/issues/58#issuecomment-1307157651 , and then to change the hard-coded values at https://github.com/cedadev/swallow/blob/d62202860204e25b4c9ce9b36db63deb46ae5e5c/swallow/processes/create_name_inputs/make_traj_input.py#L14-L17 to be global, while still passing them through from the Python. This should fix the issue while retaining the option of adding a user-specified computational domain in future (as is already implemented for the general forward / air-history runs).
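A minimal sketch of the shape of that fix. The names and structure here are illustrative assumptions, not the actual swallow code: the point is that the domain defaults become global but remain an explicit parameter, so a user-specified computational domain can be added later without changing the call path.

```python
# Global computational domain, replacing the restrictive hard-coded
# values (names are hypothetical, not the real swallow identifiers).
GLOBAL_DOMAIN = {
    "x_min": -180.0,  # westernmost longitude
    "x_max": 180.0,   # easternmost longitude
    "y_min": -90.0,   # southernmost latitude
    "y_max": 90.0,    # northernmost latitude
}

def make_traj_input(release_lon, release_lat, domain=None):
    """Build the trajectory input parameters.

    The domain is still passed through rather than baked in, so a
    user-specified computational domain can be supported in future.
    """
    params = dict(GLOBAL_DOMAIN if domain is None else domain)
    params.update({"lon": release_lon, "lat": release_lat})
    return params
```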
This was fixed. Here's a plot from the previously broken test case.
- A test case with a trajectory run at lat=34, lon=45, heights=55,70 produced a plot.
- With lat=40, lon=50, heights=200,300 it did not: it produced only the first time of output rather than a full time series, although it exited with apparent success.
- Both runs were forward trajectories initialised at 2022-01-01 00:00:00; the symptoms were the same for 12-hour and 48-hour runs.
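This pass/fail pattern is what one would expect if the release point fell outside a hard-coded computational domain. Purely as an illustration (the bounds below are assumptions chosen to match the observations, not the real values from the NAME input), a containment check consistent with 45E working and 50E/60E failing while 50W works:

```python
# Assumed bounds, chosen only to reproduce the observed pass/fail
# pattern; they are NOT the real hard-coded values.
DOMAIN = {"x_min": -60.0, "x_max": 48.0, "y_min": -90.0, "y_max": 90.0}

def release_in_domain(lon, lat, domain=DOMAIN):
    """Return True if the release point lies inside the computational domain."""
    return (domain["x_min"] <= lon <= domain["x_max"]
            and domain["y_min"] <= lat <= domain["y_max"])
```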
Investigate why there is a difference.