MPAS-Dev / MPAS-Analysis

Provides analysis for the MPAS components of E3SM

Cannot use analysis package until first successful restart write #119

Closed vanroekel closed 6 years ago

vanroekel commented 7 years ago

In conducting short sensitivity experiments with ACME, it was not possible to use the analysis software as is, since it depends on a valid restart file. In ACME the file is not written until the end of the first queue submission. The dependence on a restart file is too strict. My hack-fix of this issue is to change the restart to an output file, but this is not sufficient. It seems like the best long-term solution is something similar to the variableMap capability: check for a restart; if none exists, try an output file; if none exists, try the IC mesh file for that simulation.

milenaveneziani commented 7 years ago

@vanroekel: since we are moving away from writing mesh information to output files, I am not sure it makes sense to add that option. It would work for now, but only for the instantaneous history files, not for the hist.am.timeSeriesStats files, for example.

xylar commented 7 years ago

@vanroekel, I think an output file will not always (or even by default) contain the information we will need for analysis. As @milenaveneziani says, these files may not even contain the mesh by default in the future. For now, they don't contain refBottomDepth, which is needed by OHC.

I think what we would need to do is check for a restart file and then, if none is found, check both the mesh and input files for the fields we need (i.e. mesh variables in mesh, depth-dependent variables like refBottomDepth in the input file). If none of those exists, we could always check for an output file but I think that's unlikely to be a robust option.
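The lookup order described above could be sketched roughly as follows. This is only an illustration, not code from MPAS-Analysis: the function name `locate_variables` and the `list_variables` callback are hypothetical, and the callback stands in for whatever NetCDF reader is used to list the variables in a file.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional, Sequence, Set


def locate_variables(
    needed: Iterable[str],
    candidates: Sequence[Optional[str]],
    list_variables: Callable[[str], Set[str]],
) -> Dict[str, str]:
    """Map each needed variable to the first candidate file that holds it.

    ``candidates`` is ordered by preference, e.g. restart, mesh, input and
    finally output file. Entries that are None or missing on disk are skipped.
    ``list_variables`` is a hypothetical callback returning the variable
    names stored in a file.
    """
    remaining = set(needed)
    found: Dict[str, str] = {}
    for path in candidates:
        if not remaining:
            break
        if path is None or not Path(path).is_file():
            continue
        available = list_variables(path)
        # Record every still-missing variable that this file provides.
        for name in sorted(remaining & available):
            found[name] = path
        remaining -= available
    if remaining:
        raise FileNotFoundError(
            "no candidate file provides: " + ", ".join(sorted(remaining))
        )
    return found
```

With xarray, the callback could be something like `lambda p: set(xarray.open_dataset(p).variables)`. Because the restart file comes first in the list, nothing changes when one exists; the mesh and input files are only consulted for whatever is still missing.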

vanroekel commented 7 years ago

@xylar and @milenaveneziani Thanks for the explanation. I certainly see my solution is not a good fix. In fact using the output file only worked for the horizontal climatology plots. If there is a standard mesh assigned for each case in the input-data directory, could this be utilized in place of the restart file? The one thing I think missing there is the simulationStartTime (also missing from output).

milenaveneziani commented 7 years ago

@vanroekel: yes, using the mesh file was our original solution to this, but since we often run the analysis on simulation data that has been transferred from the machine where the simulation was originally run, we started running into the problem that the mesh file was not readily available in those cases. But I think we can put it back, have it as a first option, and then fall back on the restart file if the mesh file is not available.

The fact that simulationStartTime is not available is a major problem though.

xylar commented 7 years ago

> The fact that simulationStartTime is not available is a major problem though.

Yes, I agree that's a potential problem. In certain circumstances, simulationStartTime is available in the namelist file. But that's fragile enough to be a backup option at best.

What it comes back to is that there is no one reliable place to get the information we need besides a restart file.

@vanroekel, is there something preventing you from writing out a restart file at the same time as your output file for the use case you're trying to work with?

vanroekel commented 7 years ago

@milenaveneziani, my recollection that the IC does not have simulationStartTime was not correct. I looked through the v3 files in the input data directory for the ocean and the variable is there.

@xylar I could certainly do that, but if I did I would have to write restarts for all output files correct? I'm not sure how else I could write one restart with the first output, but continue for more months (or even years). It seems this would increase IO unnecessarily.

I will say again that I have a hacked fix that works for me, and if I'm the only person with this use case, I'm happy to close the issue.

milenaveneziani commented 7 years ago

If I remember correctly, simulationStartTime is present if the IC is from a spinup case, but it is not there if we are starting from climatology.

vanroekel commented 7 years ago

are there any cases where we start from pure climatology? I think every IC I'm aware of has a stand-alone MPAS-spinup and the variable should be there.

xylar commented 7 years ago

> If I remember correctly, simulationStartTime is present if the IC is from a spinup case, but it is not there if we are starting from climatology.

Yes, and there's no guarantee that the simulationStartTime in the spinup is the same as the one in the main simulation.

> are there any cases where we start from pure climatology? I think every IC I'm aware of has a stand-alone MPAS-spinup and the variable should be there.

Eventually, we want to be able to use this for test cases that aren't global ACME runs. These will definitely not have simulationStartTime in their initial conditions.

xylar commented 7 years ago

> @xylar I could certainly do that, but if I did I would have to write restarts for all output files correct? I'm not sure how else I could write one restart with the first output, but continue for more months (or even years). It seems this would increase IO unnecessarily.

I don't think I quite understand your use case. It sounds like you're wanting to use the analysis to check your first set of results while the simulation is running, but well before you've written out your first restart file, is that right?

vanroekel commented 7 years ago

You are right. For the G-case test, I submit a 5-year run (from the initial condition, no restarts available) and want to check the simulation during that 5-year period, prior to the first restart. As you suggest, I could write restarts more frequently, but I don't want to go as far as writing one per month.

xylar commented 7 years ago

@vanroekel, okay, good to know. I think that's definitely a use case we want to support. The rest of us will undoubtedly also run into these kinds of cases, particularly when we start using idealized test cases.

xylar commented 7 years ago

@milenaveneziani and @vanroekel, I think it should be possible to fall back on reading mesh fields from the mesh file if no restart file exists, and reading the simulation start time from config_start_time, which should contain the same thing as simulationStartTime in this case. (config_start_time = 'file' if we're doing a restart, but then we would have a restart file.) We will still likely run into trouble in cases where results have been copied to a new location but no restart file was included. As long as the error message is clear when neither the restart file nor the mesh file is present, I think we'll be okay.
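A minimal sketch of the start-time fallback described above, assuming the namelist stores the value as a quoted string on a line such as `config_start_time = '0001-01-01_00:00:00'`. The function names are hypothetical and this is not how MPAS-Analysis necessarily implements it.

```python
import re
from typing import Optional

# Matches a line such as:  config_start_time = '0001-01-01_00:00:00'
_START_TIME = re.compile(
    r"^\s*config_start_time\s*=\s*['\"]([^'\"]*)['\"]", re.MULTILINE
)


def start_time_from_namelist(namelist_text: str) -> Optional[str]:
    """Return config_start_time, or None if it is absent or set to 'file'."""
    match = _START_TIME.search(namelist_text)
    if match is None:
        return None
    value = match.group(1).strip()
    # 'file' means the run was restarted, so the namelist holds no date.
    return None if value == "file" else value


def resolve_start_time(restart_value: Optional[str], namelist_text: str) -> str:
    """Prefer simulationStartTime from a restart file, else the namelist."""
    if restart_value is not None:
        return restart_value
    value = start_time_from_namelist(namelist_text)
    if value is None:
        raise ValueError(
            "simulation start time not found: no restart file was available "
            "and config_start_time is missing or set to 'file'"
        )
    return value
```

Here `restart_value` is whatever was read from simulationStartTime in a restart file, or `None` when no restart file exists. The explicit `ValueError` is there to give the clear error message asked for when neither source is usable.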

milenaveneziani commented 6 years ago

We should decide what to do with this issue.

xylar commented 6 years ago

At this point, I don't think the analysis can be run usefully with less than 1 year of data. Climatologies are not computed at all, for example. I am inclined to close this with "won't fix" because we don't have a clean solution and it doesn't seem like a high priority. Presumably @vanroekel's hack of linking to an output file until a restart file is available will do in a pinch.