Open nikizadehgfdl opened 9 years ago
Not good news! I agree - waiting for a FRE/MOAB fix could take too long. A proper fix is still the preferred solution, though, so I don't think we should give up on that option even if we find a band-aid ourselves.
Dr. John Krasting Physical Scientist (NOAA Federal) NOAA/Geophysical Fluid Dynamics Laboratory Biogeochemistry, Ecosystems, and Climate Group Princeton University Forrestal Campus 201 Forrestal Road Princeton, NJ 08540
P. (609) 452-5359 F. (609) 987-5063
On Tue, Dec 9, 2014 at 6:50 PM, Niki Zadeh notifications@github.com wrote:
Due to a limitation of moab, it is safer to have at most one refineDiag script in the XML.
The limitation is as follows: FRE adds the pathnames of all refineDiag scripts to an environment variable (on top of some other variables). That ENV then gets prepended to all the commands that appear in the gfdl platform section. The whole thing is then dispatched by moab (on gaea) to run on PAN (at gfdl) as the pp.starter job. But moab has a hard limit on the length of that ENV, and anything longer is simply cut off (most likely in the middle of a command string). Hence the commands that we put in the gfdl platform section could (and did) become useless and lead to PP errors.
We have run into this limitation in our CM4 runs.
Since the refineDiag script pathnames are part of this ENV, it is advisable to have at most one refineDiag script. That saves something like 130 characters and avoids this situation as much as possible. A band-aid, I know, but should we rather wait for moab/FRE to fix the issue? That would take a few months at least.
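To make the failure mode concrete, here is a rough Python sketch of the truncation described above. It is only a simplified model: the way the variables and commands are joined, the variable names, the paths, and the `limit` value are all hypothetical, since the thread does not state moab's actual limit or FRE's exact string format.

```python
def dispatch_string(variables, commands):
    """Join NAME=value pairs and append the platform commands.

    This mimics, in simplified form, an environment string being
    prepended to the commands. The real FRE format is not shown here.
    """
    env = " ".join(f"{name}={value}" for name, value in variables.items())
    return f"{env} {commands}"


def truncation_report(text, limit):
    """Split text into the part that fits within limit and the part cut off."""
    return text[:limit], text[limit:]


if __name__ == "__main__":
    # Hypothetical paths and commands, for illustration only.
    commands = "module load some_tool; setenv SOME_VAR some_value"
    two_scripts = {"refineDiag": "/path/to/first_refine.csh,/path/to/second_refine.csh"}
    one_script = {"refineDiag": "/path/to/first_refine.csh"}
    limit = 100  # placeholder, NOT moab's real limit

    for label, variables in (("two scripts", two_scripts), ("one script", one_script)):
        kept, lost = truncation_report(dispatch_string(variables, commands), limit)
        print(f"{label}: kept {len(kept)} chars, lost {len(lost)} chars: {lost!r}")
```

Comparing the two cases shows why dropping a script pathname helps: the shorter the environment part, the less of the trailing command string falls past the cut.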
— Reply to this email directly or view it on GitHub https://github.com/CommerceGov/NOAA-GFDL-MOM6-examples/issues/12.
How many refineDiag scripts did we have and how many will we have?
Dr Alistair Adcroft (Alistair.Adcroft@noaa.gov) Princeton University Tel: (609) 987-5073 NOAA/GFDL, 201 Forrestal Road, Princeton, NJ 08540
For your awareness, Seth is looking at new approaches for the pp.starter "which would greatly reduce the amount of variables passed via moab. This should also make the pp.starter more robust as certain characters will not cause the pp.starter step to fail."
We'll keep you all updated.
Jeff Durachta Engineering Lead for Modeling Services NOAA Geophysical Fluid Dynamics Lab Forrestal Campus, Princeton University 201 Forrestal Road Princeton, NJ 08540 Office: +1-609-987-5054
Certain characters? I probably don't want to know.
Is the fix for chaco or sooner?
Dr Alistair Adcroft (Alistair.Adcroft@noaa.gov) Princeton University Tel: (609) 987-5073 NOAA/GFDL, 201 Forrestal Road, Princeton, NJ 08540
We are looking at this for the bronx infrastructure.
Jeff Durachta Engineering Lead for Modeling Services NOAA Geophysical Fluid Dynamics Lab Forrestal Campus, Princeton University 201 Forrestal Road Princeton, NJ 08540 Office: +1-609-987-5054