NOAA-GFDL / MOM6-examples

Example configurations for MOM6 and SIS2

combine the refineDiag scripts under tools/analysis into one #12

Open nikizadehgfdl opened 9 years ago

nikizadehgfdl commented 9 years ago

Due to a limitation of moab, it is safer to have at most one refineDiag script in the xml.

The limitation is as follows: FRE adds the pathnames of all refineDiag scripts to an environment variable (on top of some other variables). That ENV is then prepended to all the commands that appear in the gfdl platform section. The whole thing is then dispatched by moab (on gaea) to run on PAN (at gfdl) as the pp.starter job. However, moab has a hard limit on the length of that ENV, and anything longer is simply cut off (most likely in the middle of a command string). Hence the commands that we put in the gfdl platform section could (and did) become useless and lead to PP errors.
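To make the failure mode concrete, here is a minimal Python sketch of a length-capped string cutting a command in half. Everything in it is an assumption made for illustration: the cap of 200 characters, the variable name `refineDiag`, the `;` separator, the paths, and the commands are all hypothetical, since the real moab limit and the exact layout FRE uses are not stated in this thread.

```python
# Hypothetical illustration only: the real moab limit and the way FRE lays out
# the variable are not given in this thread.
ENV_LIMIT = 200  # assumed cap in characters, chosen only for this demo


def build_dispatch_string(refine_diag_paths, platform_commands):
    """Join the script paths into one variable and prepend it to the commands."""
    env = "refineDiag=" + ",".join(refine_diag_paths)
    return env + ";" + ";".join(platform_commands)


def truncate(text, limit=ENV_LIMIT):
    """Mimic a hard length cap that silently drops everything past the limit."""
    return text[:limit]


# Made-up paths and commands standing in for the real ones.
paths = [f"/path/to/tools/analysis/refine_script_{i}.csh" for i in range(4)]
commands = ["module load some_module", "setenv SOME_VAR some_value"]

# Four script paths push the string past the cap, so the commands get cut.
full = build_dispatch_string(paths, commands)
sent = truncate(full)
print(len(full), len(sent))
print("tail of what arrives:", sent[-25:])

# A single script path leaves enough room for the commands to survive intact.
single = build_dispatch_string(paths[:1], commands)
print(truncate(single) == single)
```

In this toy setup the commands sit at the end of the string, so they are the first thing to be lost when the script paths grow.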

We have run into this limitation in our CM4 runs.

Since the refineDiag script pathnames are part of this ENV, it is advisable to have at most one refineDiag script, to save something like 130 chars and avoid this situation as much as possible. A band-aid, I know, but should we rather wait for moab/FRE to fix the issue? That would take a few months at least.
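One way the "single refineDiag" band-aid could look is a small driver that calls the individual steps in order, so only the driver's path has to appear in the xml. The sketch below is an assumption-laden illustration, not the scripts under tools/analysis: the step names and arguments are hypothetical, and how FRE actually invokes refineDiag scripts (arguments versus environment variables) is not described in this thread.

```python
import subprocess
from types import SimpleNamespace


def run_steps(steps, args, runner=subprocess.run):
    """Run each step in order with the same arguments; stop at the first failure.

    Returns 0 if every step succeeded, otherwise the failing step's return code.
    """
    for step in steps:
        result = runner([step, *args])
        if result.returncode != 0:
            return result.returncode
    return 0


# Demo with a stand-in runner so that nothing is actually executed.
calls = []


def fake_runner(cmd):
    """Record the command instead of running it and pretend it succeeded."""
    calls.append(cmd)
    return SimpleNamespace(returncode=0)


# Hypothetical step names and arguments, for illustration only.
status = run_steps(
    ["refine_step_a.csh", "refine_step_b.csh"],
    ["input_dir", "output_dir"],
    runner=fake_runner,
)
print(status, calls)
```

Stopping at the first non-zero return code is a design choice for the sketch; a real driver might prefer to run the remaining steps and report all failures at the end.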

jkrasting commented 9 years ago

Not good news! I agree that waiting for a FRE/MOAB fix could take too long. That fix is still the preferred solution, so I don't think we should give up on it even if we find a band-aid ourselves.


Dr. John Krasting Physical Scientist (NOAA Federal) NOAA/Geophysical Fluid Dynamics Laboratory Biogeochemistry, Ecosystems, and Climate Group Princeton University Forrestal Campus 201 Forrestal Road Princeton, NJ 08540

P. (609) 452-5359 F. (609) 987-5063

In reply to Niki Zadeh (Tue, Dec 9, 2014 at 6:50 PM).

adcroft commented 9 years ago

How many refineDiag scripts did we have and how many will we have?

Dr Alistair Adcroft (Alistair.Adcroft@noaa.gov) Princeton University Tel: (609) 987-5073 NOAA/GFDL, 201 Forrestal Road, Princeton, NJ 08540

In reply to John Krasting (Wed, Dec 10, 2014 at 7:31 AM).

jwdGFDL commented 9 years ago

For your awareness, Seth is looking at new approaches for the pp.starter "which would greatly reduce the amount of variables passed via moab. This should also make the pp.starter more robust as certain characters will not cause the pp.starter step to fail."

We'll keep you all updated.

In reply to Alistair Adcroft (12/10/2014 08:34 AM).

Jeff Durachta Engineering Lead for Modeling Services NOAA Geophysical Fluid Dynamics Lab Forrestal Campus, Princeton University 201 Forrestal Road Princeton, NJ 08540 Office: +1-609-987-5054

adcroft commented 9 years ago

Certain characters? I probably don't want to know.

Is the fix for chaco or sooner?

In reply to mom6jwd (Wed, Dec 10, 2014 at 12:13 PM).

jwdGFDL commented 9 years ago

We are looking at this for the bronx infrastructure.

In reply to Alistair Adcroft (12/10/2014 06:58 PM).