Closed jimfrimel closed 5 years ago
This is a problem that should not be a problem. Reverted Jet from rocoto/1.3.0 to rocoto/1.3.0-RC5
Using Jim’s large XML file I have confirmed that the problem is indeed caused by <envar> tags not being used consecutively. - Chris Harrop
Basically, <envar> tags and &ENV_VARS; need to be grouped together.
For example, in tasks/ensda_pre.ent there is this:
<task name="ensda_pre" maxtries="&MAX_TRIES;">
<command>&PRE; &EXhwrf;/exhwrf_ensda_pre.py</command>
<jobname>hwrf_ensda_pre_&SID;_<cyclestr>@Y@m@d@H</cyclestr></jobname>
<account>&ACCOUNT;</account>
<queue>&SERIAL;</queue>
<cores>1</cores>
<envar>
<name>TOTAL_TASKS</name>
<value>1</value>
</envar>
<walltime>00:15:00</walltime>
<memory>1G</memory>
<join><cyclestr>&WORKhwrf;/hwrf_ensda_pre.log</cyclestr></join>
&ENV_VARS;
&RESERVATION;
&SERIAL_EXTRA;
&CORES_EXTRA;
The &ENV_VARS; entity contains a bunch of <envar> tags, so the standalone <envar> block needs to be moved next to it, like this:
<task name="ensda_pre" maxtries="&MAX_TRIES;">
<command>&PRE; &EXhwrf;/exhwrf_ensda_pre.py</command>
<jobname>hwrf_ensda_pre_&SID;_<cyclestr>@Y@m@d@H</cyclestr></jobname>
<account>&ACCOUNT;</account>
<queue>&SERIAL;</queue>
<cores>1</cores>
<walltime>00:15:00</walltime>
<memory>1G</memory>
<join><cyclestr>&WORKhwrf;/hwrf_ensda_pre.log</cyclestr></join>
<envar>
<name>TOTAL_TASKS</name>
<value>1</value>
</envar>
&ENV_VARS;
&RESERVATION;
&SERIAL_EXTRA;
&CORES_EXTRA;
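To check a whole workflow for this pattern, a minimal sketch in Python could flag every task whose <envar> children do not form one consecutive run. It assumes the XML has already had its entities expanded (for example, the output of xmllint --noent), since the parser used here rejects undefined entities such as &ENV_VARS;. The function name and the two sample documents are made up for illustration, and the OTHER_VAR envar only stands in for whatever &ENV_VARS; expands to.

```python
import xml.etree.ElementTree as ET


def tasks_with_split_envars(xml_text):
    """Return the names of <task> elements whose <envar> children are not contiguous."""
    root = ET.fromstring(xml_text)
    offenders = []
    for task in root.iter("task"):
        tags = [child.tag for child in task]
        positions = [i for i, tag in enumerate(tags) if tag == "envar"]
        # Contiguous means the envar positions form one unbroken run of indices.
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            offenders.append(task.get("name"))
    return offenders


# Hypothetical, entity-expanded samples; OTHER_VAR is a placeholder for &ENV_VARS; content.
BAD = """<workflow>
<task name="ensda_pre">
  <cores>1</cores>
  <envar><name>TOTAL_TASKS</name><value>1</value></envar>
  <walltime>00:15:00</walltime>
  <memory>1G</memory>
  <envar><name>OTHER_VAR</name><value>x</value></envar>
</task>
</workflow>"""

GOOD = """<workflow>
<task name="ensda_pre">
  <cores>1</cores>
  <walltime>00:15:00</walltime>
  <memory>1G</memory>
  <envar><name>TOTAL_TASKS</name><value>1</value></envar>
  <envar><name>OTHER_VAR</name><value>x</value></envar>
</task>
</workflow>"""

if __name__ == "__main__":
    print(tasks_with_split_envars(BAD))
    print(tasks_with_split_envars(GOOD))
```

This only mirrors the grouping rule described in this thread; it is not a substitute for Rocoto's own schema validation.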
From Chris Harrop
Since Theia is down right now, I can’t compare the versions of system libraries between Jet and Theia. But the Rocoto schema is the same regardless of where it is installed. The libraries that the RelaxNG library (the thing that does the validation) depends on could be different, though, depending on the OS version and the versions of various system packages.
The only issue that I’ve observed is that it doesn’t like having other tags interspersed between <envar> tags.
Related to rocoto 1.3.0 release ...
This behavior is unexpected and was not discovered until after the release. There was an update to the validation schema to allow the
I am working to fix the issue with the schema and the code that validates it now, but do not yet have an ETA for a fix. But the fix will be in version 1.3.1.
Some of my notes - troubleshooting rocoto
Errors are placed in the log file in the ~/.rocoto directory.
Do these steps to pin down the line number ... depending on the error ...
prompt> xmllint --noent
If you get an error ... then run rocotorun on your output.xml, or even "vi" your output.xml; syntax highlighting in vi may be helpful to pinpoint the error ...
prompt> vi output.xml
prompt> rocotorun -w output.xml
(These notes are incomplete; this section just indicates the schema used, not how to run any validation.)
The schema that is used can be found under the rocoto source ...
/apps/rocoto/1.2.2/lib/workflowmgr/schema_with_metatasks.rng
/apps/rocoto/1.2.2/lib/workflowmgr/schema_without_metatasks.rng
<grammar xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
I believe there is an option in the xmllint command to pass in the schema and run validation, but I haven't run, tested, or looked into the command.
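For what it's worth, xmllint does accept a RelaxNG schema through its --relaxng option, and --noent expands entities before validation. The sketch below wraps that command in Python. It is untested against Rocoto itself: whether validating a workflow directly against schema_with_metatasks.rng reproduces what Rocoto does internally is an assumption, and workflow.xml is a placeholder file name.

```python
import shutil
import subprocess


def xmllint_relaxng_cmd(schema_path, xml_path):
    """Build an xmllint command that expands entities and validates against a RelaxNG schema."""
    # --noout suppresses the echoed document so only validation messages remain.
    return ["xmllint", "--noout", "--noent", "--relaxng", schema_path, xml_path]


def validate(schema_path, xml_path):
    """Run xmllint; return (ok, stderr_text), or None if xmllint is not on PATH."""
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_relaxng_cmd(schema_path, xml_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


if __name__ == "__main__":
    # Schema path taken from the notes above; workflow.xml is a hypothetical input file.
    schema = "/apps/rocoto/1.2.2/lib/workflowmgr/schema_with_metatasks.rng"
    print(validate(schema, "workflow.xml"))
```

The equivalent one-liner at the prompt would be the list returned by xmllint_relaxng_cmd joined with spaces.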
WAITING ...
We have reverted trunk to use rocoto/1.3.0-RC5 for the Jet modules.
Everything is working now and we are WAITING on possible next steps.
Plan is to keep rocoto 1.3.0-RC5 on Jet until the issue that affects HWRF is fixed.
Having an issue running rocoto 1.3.0 on Jet (see error below). 1.3.0-RC5 works. No problem running rocoto/1.3.0 on Theia. I have submitted a help ticket to Jet help.
This is only an issue with rocoto 1.3.0 and the HWRF workflow XML on Jet.
The error is consistent and repeatable, and it occurs with active, new, 1.3.0-RC5, or completed rocoto database files; it just doesn't work for me at all on Jet.
Other users are reporting issues also, but some are not, which is odd. Since it has to do with validation (see below), maybe their XML is less complex, though if they are running HWRF, I'm not sure why the XML workflow would be different.
Rocoto Error Output
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Extra element walltime in interleave.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Element task failed to validate content at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:1.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Invalid sequence in interleave at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:153.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Element metatask failed to validate content at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:153.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Invalid sequence in interleave at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:144.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Element workflow failed to validate content at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:144.
06/10/19 21:15:28 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Element workflow failed to validate content at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:144.
06/10/19 21:15:34 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Extra element walltime in interleave.
06/10/19 21:15:34 UTC :: hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml :: Error: Element task failed to validate content at /mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml:1.
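Since the log in ~/.rocoto can run many entries together, a small Python sketch like the one below could split it into individual errors and pull out the reported line numbers. The entry layout (timestamp, "UTC ::", workflow file, ":: Error:", message) is inferred only from the output shown above, and the function and variable names are made up for illustration.

```python
import re

# Timestamp layout as it appears in the output above, e.g. "06/10/19 21:15:28".
TS = r"\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}"
ENTRY = re.compile(
    r"(" + TS + r") UTC :: (\S+) :: Error: (.*?)(?=\s*" + TS + r" UTC ::|\s*$)",
    re.S,
)
# Messages that carry a location end with " at <path>:<line>."
LOCATION = re.compile(r" at (\S+):(\d+)\.?$")


def parse_rocoto_errors(log_text):
    """Return (timestamp, workflow, message, line_or_None) for each error entry."""
    entries = []
    for match in ENTRY.finditer(log_text):
        timestamp, workflow = match.group(1), match.group(2)
        message = match.group(3).strip()
        location = LOCATION.search(message)
        line = int(location.group(2)) if location else None
        entries.append((timestamp, workflow, message, line))
    return entries


# Two entries copied from the output above, joined on one line as in the raw log.
WF = "hwrf-hwrf_slurm_totaltasks_fix-14L-2018100900.xml"
PATH = "/mnt/lfs3/projects/dtc-hurr/James.T.Frimel/hwrf_slurm_totaltasks_fix/rocoto/" + WF
SAMPLE = (
    "06/10/19 21:15:28 UTC :: " + WF + " :: Error: Extra element walltime in interleave. "
    "06/10/19 21:15:28 UTC :: " + WF + " :: Error: Invalid sequence in interleave at "
    + PATH + ":153."
)

if __name__ == "__main__":
    for timestamp, workflow, message, line in parse_rocoto_errors(SAMPLE):
        print(timestamp, line, message.split(" at ")[0])
```

The extracted line numbers point into the workflow XML named in each message, which is where the vi and xmllint steps in the notes above would be applied.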