E3SM-Project / ACME-ECP

E3SM MMF for DoE ECP project

No SP compsets #7

Closed brhillman closed 6 years ago

brhillman commented 6 years ago

We are currently lacking any SP-specific compsets, requiring SP configuration to be added by hand each time a new case is created. @whannah1 has a script that will do this, but ultimately we probably need to add additional compsets with SP to handle this. I think this is also necessary for adding SP tests to the test suite.

whannah1 commented 6 years ago

@brhillman, the SP2 configuration isn't going to change, so we can make a compset for that. The SP1 case is still up in the air until we figure out whether we can use RRTMG or not. I hope to settle this issue this week.

The only thing I'm unsure about for an SP compset is how to configure the CRM geometry. Do we want to make the default 64 columns? I'm tempted to go with this, but we might want to use fewer to speed up the testing. We might also want to use 2 km for crm_dx since this would allow a longer timestep. I'm actually trying to run 3 SP2 runs with 500, 1000, and 2000 m crm_dx to see how they compare, but they've been in the queue for a week, so I'm not sure we will have any useful information to answer this question.

The other thing is the height of the CRM. I like using 58 levels for science-y reasons, but we might be able to get away with fewer, which would also speed up the model. Maybe we should just have a specific compset for testing that strips down the CRM to a minimum, and another one with more reasonable parameters for production runs?
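To put rough numbers on the geometry trade-off above, here is a minimal sketch. It is not model code: the function names are hypothetical, and the timestep scaling assumes a purely horizontal advective CFL limit (dt <= C * dx / u) with the Courant number and wind speed held fixed, which may not be the binding constraint in SAM.

```python
# Illustrative arithmetic only. Function and variable names are hypothetical;
# they are not E3SM or SAM namelist names.


def crm_domain_width_km(num_columns: int, dx_m: float) -> float:
    """Horizontal extent of a CRM with num_columns columns spaced dx_m apart."""
    return num_columns * dx_m / 1000.0


def relative_cfl_timestep(dx_m: float, reference_dx_m: float) -> float:
    """Factor by which an advective CFL limit (dt <= C * dx / u) relaxes when
    the grid spacing changes from reference_dx_m to dx_m, assuming the Courant
    number C and wind speed u stay the same."""
    return dx_m / reference_dx_m


if __name__ == "__main__":
    # The three crm_dx values mentioned in the comment, with 64 columns.
    for dx_m in (500.0, 1000.0, 2000.0):
        width = crm_domain_width_km(64, dx_m)
        factor = relative_cfl_timestep(dx_m, 500.0)
        print(
            f"crm_dx={dx_m:.0f} m: 64 columns span {width:.0f} km, "
            f"CFL limit x{factor:.0f} relative to 500 m"
        )
```

Under these assumptions, 64 columns span 32, 64, and 128 km at 500, 1000, and 2000 m, and the horizontal CFL limit at 2 km is four times looser than at 500 m. Whether that translates into a proportionally longer CRM timestep depends on other stability limits not covered here.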

brhillman commented 6 years ago

@whannah1 I think it's reasonable to make specific "testing" compsets to go with the full compsets, where the CRM domain is reduced to that recommended in the regression testing page on Confluence. There doesn't seem to be an easy way to change the size of the CRM without digging pretty deep into CIME (and even then I'm not sure it's reasonable/possible in the CIME framework). I'm definitely in favor of making RRTMG work with SP1 and making that the default SP1 compset. I'm going to take a look at this as well this week. I think the fact that right now this is not quite working just highlights some things we need to fix in the SP/E3SM code to make things more flexible and testable (and consistent).

Question about SP2: are you running this with ECPP right now, or without? I've had trouble running ECPP: the first time step seems to hang forever and I cannot get it to finish within my queue wallclock limits, so I've just been turning ECPP off when running SP2.

Regarding the height, where does 58 levels get us in terms of altitude? I think @mt5555 mentioned before that you had trouble getting 70 levels to run. Is that correct? Maybe we can cut that down for the sake of testing.

If you want, I can try to run the SP resolution tests locally if you send me the configuration.

whannah1 commented 6 years ago

The SP1+RRTMG thing isn't a thing that needs to be fixed. It's a totally new capability that hasn't been considered before for good reasons.

About SP2: ECPP has never worked for me. The ECPP routine spits back NaN values that cause the model to crash. I spent some time looking into this a few months ago, but I couldn't trace it down, so I put it on the back burner. So far I've always left it off in my runs.

I've never run higher than 58 levels. I made a table of vertical level information for two columns here on Confluence. The justification for 58 levels was that this kept the CRM height around 24-27 km. Don Dazlich mentioned that the anelastic dycore of SAM would be problematic above 30 km, so this seemed like a good height. I haven't produced a long enough simulation to be able to see if there's any noticeable signature of the CRM height on the climate. Looking back at that table, I guess the most we could cut it down to would be 56 levels, but that wouldn't save us very much, so maybe we should leave it at 58.

I don't think you could run these resolution tests locally. I'm using ~10k tasks on Titan and it takes 8 hours of wallclock time to run 5 days, since I'm using the ne30 grid. I know these configurations will run. I'm trying to look at how the simulations compare after a month to a year of simulation time.
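The cost figures above can be extrapolated with simple arithmetic. This is a minimal sketch under stated assumptions: it scales the quoted 8 hours per 5 simulated days linearly, treats "~10k tasks" as exactly 10,000, takes a month as 30 days, and ignores queue wait, initialization, and I/O. The function names are hypothetical.

```python
# Back-of-the-envelope cost estimate from the figures quoted in the comment.
# Assumes perfectly linear scaling with simulated time; names are hypothetical.

REF_SIM_DAYS = 5.0      # simulated days in the reference run
REF_WALL_HOURS = 8.0    # wallclock hours the reference run took
REF_TASKS = 10_000      # "~10k tasks", taken here as exactly 10,000


def wallclock_hours(sim_days: float) -> float:
    """Estimated wallclock hours to simulate sim_days at the reference rate."""
    return sim_days * REF_WALL_HOURS / REF_SIM_DAYS


def task_hours(sim_days: float) -> float:
    """Estimated task-hours (tasks x wallclock hours) for sim_days."""
    return REF_TASKS * wallclock_hours(sim_days)


if __name__ == "__main__":
    for label, days in (("5 days", 5.0), ("1 month", 30.0), ("1 year", 365.0)):
        print(
            f"{label}: {wallclock_hours(days):.0f} wallclock hours, "
            f"{task_hours(days):,.0f} task-hours"
        )
```

With these assumptions a month of simulation needs about 48 wallclock hours and a year about 584, per resolution, which is consistent with the point that these runs are not practical on a local machine.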

brhillman commented 6 years ago

@whannah1 regarding SP1+RRTMG, I know this is a "new" configuration, but both single-moment microphysics and RRTMG existed separately before. My point was that the trouble with coupling these two components into a new configuration points to issues in how the separate components are implemented, which could probably be smoothed out. I didn't mean to imply that this was a configuration that was working before and needed to be fixed, but rather that the fact that we can't just hook these up easily points to some things we can improve in the code to make this kind of coupling easier. It makes sense to me to have a model framework where we can more easily plug in and test different schemes in the future without so much headache. Can you elaborate on the justification for not coupling SP1 with RRTMG? My understanding is that it's not a problem with the underlying RRTMG or SP1 codes, but just with the implementation and the fact that the RRTMG interface internally makes some assumptions about the microphysics that it shouldn't.

whannah1 commented 6 years ago

I'm still looking into it, but it seems we might need to find a new way to calculate the optical properties that doesn't rely on the explicit drop size distribution.

steveghan commented 6 years ago

Guangxing Lin has ECPP working on an older branch of ACME. Have you been in contact with him? Are you trying to update that branch?

What do you mean by SP2?

Steve

From: "Benjamin R. Hillman" notifications@github.com Reply-To: ACME-Climate/ACME-ECP reply@reply.github.com Date: Monday, October 9, 2017 at 11:19 AM To: ACME-Climate/ACME-ECP ACME-ECP@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [ACME-Climate/ACME-ECP] No SP compsets (#7)


whannah1 commented 6 years ago

I've been saying SP1 and SP2 as shorthand for "1-moment" and "2-moment" since they are very different configurations.

I haven't heard from Guangxing in a while. Mikhail said he was going to look into the problem in the ACME branch, but I haven't heard anything from him either.

At this point we don't really need ECPP to be working, so we can revisit it later if no one takes charge of it.

steveghan commented 6 years ago

It is important that ECPP work, because it provides the coupling between the aerosols and clouds. Guangxing, can you tell this group what you have done to get ECPP to work? They are updating Mark Branson’s branch, so it is important that your ECPP mods get incorporated. Have you checked in your code to Mark’s branch?

Steve


whannah1 commented 6 years ago

We're not working on Mark's branch anymore. We've created a fork based on the ACME master so we can track the ACME updates and bug fixes. If Guangxing has found bugs in ECPP, we should create a branch for it in the ECP repo and see if it fixes the NaN/segfault issue I was having on Titan. We should probably make a separate GitHub issue for ECPP.

steveghan commented 6 years ago

Hi all,

I believe ECPP is working within the old ACME version (ACME V0), and I have checked in this working version to Mark’s branch. I think ECPP fails only in the version based on the latest ACME, after Walter tried to update the host model (ACME) to a very recent version. Walter, correct me if I am wrong. I don’t have the MMF code with the new ACME, so I don’t know what is going on there. But I am more than happy to help if someone gives me the code in which ECPP doesn’t work. Thanks!

Best, Guangxing



whannah1 commented 6 years ago

To my knowledge, ECPP has not worked in the 72-layer model since the beginning. To get the code, just check out the current ECP master branch.

brhillman commented 6 years ago

We should really have a separate issue to continue the ECPP dialogue, if for no other reason than to document the bug and eventual fix. @whannah1 do you want to open a separate issue, since you are more familiar with the bug in this branch? For what it's worth, I've also had trouble with ECPP, even back on the old CESM branch with 30 levels. The problem I had was that the model would hang at the first time step at which the CRM/ECPP was called. I don't know whether this is related to the NaNs/segfaults you were seeing.