Closed bjsilver closed 3 years ago
I ran a successful test with the default WRFotron the other day.
I thought those messages in diurnal_emiss.out were a quirk of WRF_UoM_EMIT, because we didn't have sector information for those sectors (e.g., aircraft). If you run python plotwrfchemi.py, is the diurnal cycle applied to total emissions or not?
I also see a similar message in the rsl.error.0000 file, so I'm not sure how this is connected to the MPI_ABORT error you're getting. What's the path to this run folder, so I can take a look (you'll need to give me read and execute access all the way down, i.e. +rx)?
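For "read and execute access all the way down" to work, every directory between the filesystem root and the run folder has to be traversable, not just the run folder itself. A minimal sketch of how one might check that is below. The function name `dirs_missing_other_rx` is made up for illustration, and it only inspects the "other" permission bits, so it ignores group membership and any ACLs the cluster may use.

```python
import os
import stat


def dirs_missing_other_rx(path):
    """List directories from the root down to `path` that lack r or x for 'other'.

    Hypothetical helper: it looks at plain mode bits only (no ACLs, no groups).
    """
    path = os.path.abspath(path)
    needed = stat.S_IROTH | stat.S_IXOTH

    # Build the chain of ancestors: path, its parent, ..., the root.
    chain = []
    current = path
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    missing = []
    for directory in reversed(chain):  # report from the root downwards
        if not os.path.isdir(directory):
            continue
        mode = os.stat(directory).st_mode
        if mode & needed != needed:
            missing.append(directory)
    return missing
```

Calling it on the run folder returns an empty list when every level is readable and traversable by other users; any directory it returns would still need something like `chmod a+rx`.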
Hi Luke, thanks for getting back to me. I ran plotwrfchemi.py and it showed the diurnal emissions, so it looks like the diurnal emissions stage is fine (it's probably going to be some dumb mistake on my part, sorry).
Thanks very much for having a look, here is the path:
/nobackup/eebjs/simulation_WRFChem4.2_test/run/base/2015-10-11_18:00:00-2015-10-13_00:00:00
No worries. I think this might be a memory error. Try a test for main.bash with 2GB per core (i.e. #$ -l h_vmem=2G), and for this 24 hour test run you can probably decrease the wall clock time to shorten the wait in the queue (i.e. #$ -l h_rt=04:00:00). This has been happening occasionally, so I'll increase this memory in the default WRFotron. Let me know how it goes.
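If you end up trying several memory and wall-clock settings, the two scheduler lines can be changed with a small script rather than by hand. This is only a sketch: `set_sge_resource` is a hypothetical helper, and it assumes each resource sits on its own `#$ -l name=value` line, as in the directives quoted above (comma-separated resource lists are not handled).

```python
import re


def set_sge_resource(script_text, name, value):
    """Replace the value on a line of the form '#$ -l <name>=<value>'.

    Hypothetical helper; raises ValueError if no such directive is found.
    """
    pattern = re.compile(
        r"^(#\$[ \t]+-l[ \t]+" + re.escape(name) + r"=)\S+",
        flags=re.MULTILINE,
    )
    # A function replacement avoids backslash surprises in `value`.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, script_text)
    if count == 0:
        raise ValueError("no '#$ -l {}=' directive found".format(name))
    return new_text


# Example header with made-up starting values (not WRFotron's actual defaults).
example_header = "#!/bin/bash\n#$ -l h_rt=48:00:00\n#$ -l h_vmem=1G\n"
updated = set_sge_resource(example_header, "h_vmem", "2G")
updated = set_sge_resource(updated, "h_rt", "04:00:00")
```

After these two calls, `updated` carries `#$ -l h_vmem=2G` and `#$ -l h_rt=04:00:00`, the values suggested for the test run.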
Thanks Luke, will submit that now and get back to you. It would make sense if it is a memory issue: I didn't get this crash initially, but it started happening after I made some changes to the domain, and it was still happening when I went back to the clean version.
Okay, that makes sense, especially if timesteps (resolution to timestep ratios) changed.
Hi Luke, similar crash happened on hour 13 at h_vmem=2G. Will try again with 4G
At h_vmem=4G all the wrfouts are created, but I still get the MPI_ABORT error in main.bash.e, and the first wrfout file (at the beginning of the 6hr met spinup) is less than half the size of the others, so possibly something went wrong there.
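A quick way to spot an output file like that without eyeballing a directory listing is to compare each file's size with the median. The sketch below assumes the output files match the glob `wrfout*` in a single directory; `find_undersized_files` and the 0.5 threshold are illustrative choices, not part of WRFotron.

```python
import glob
import os
import statistics


def find_undersized_files(directory, pattern="wrfout*", ratio=0.5):
    """Return (path, size) pairs for files smaller than `ratio` times the median size."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    sizes = {path: os.path.getsize(path) for path in paths}
    if not sizes:
        return []
    typical = statistics.median(sizes.values())
    return [(path, size) for path, size in sizes.items() if size < ratio * typical]
```

A file that is flagged is not necessarily corrupt; the size check only narrows down which output is worth opening and inspecting.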
Okay. Could you make the simulation path readable again and I'll take a look?
Ok just done that (I think)
Did you update the executable access too, as I can't see it, i.e. chmod a+rx -R /nobackup/eebjs/simulation_WRFChem4.2_test?
thanks Luke, done it now
Well, I think the MPI_ABORT message might be associated with that smaller first hour of spin-up. The remainder of the spin-up looks fine, though, and the whole of the spin-up is discarded anyway, as its only purpose is to set reasonable initial conditions for the simulation. I'm not sure if this is related to hardware, as there is not much information to go on. The rest of the simulation looks okay.
Hello
I recently cloned a clean WRFotron repo (after the recent bug fixes) and tried to run the default domain/time. I had some errors which seem to lead to main crashing, and I think they may be related to the latest bugfix. I tried to work out what is going on, without success.
In pre.bash all log files are fine except diurnal_emiss.out, which has the error
I also noticed this message in the file:
(0) No diurnal cycle applied to the following emission variables, because of lack of sector information (was this intended?):
Here is the full file: diurnal_emiss.out
When main starts, it created the first wrfout file fine, then crashed on the second one. In the rsl files there is the error:
which I think is what causes this error to show up in main.bash.e*
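With one rsl.error file per MPI task, it can help to search them all at once to see which task reported a problem first. A rough sketch follows; `scan_rsl_logs` is a hypothetical helper, and the keyword list is only a guess at useful search terms rather than an exhaustive list of what WRF prints.

```python
import glob
import os


def scan_rsl_logs(run_dir, keywords=("MPI_ABORT", "FATAL", "SIGSEGV")):
    """Return (file name, line number, line) for rsl.error.* lines containing a keyword."""
    hits = []
    for path in sorted(glob.glob(os.path.join(run_dir, "rsl.error.*"))):
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with open(path, errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if any(keyword in line for keyword in keywords):
                    hits.append((os.path.basename(path), line_number, line.rstrip("\n")))
    return hits
```

Printing the hits for the run folder gives a short list of files and line numbers to open, which is usually quicker than reading rsl.error.0000 alone.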
Does anyone know what might be causing the diurnal_emiss error and whether this is causing the crash in main? Also, has anyone run the test case since the bug fix and does it work ok for you? Could be an issue at my end.
Cheers