PecanProject / pecan

The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox.
www.pecanproject.org

Possible error with do.conversions in multi-site runs #2286

Closed serbinsh closed 5 years ago

serbinsh commented 5 years ago

Bug Description

Various issues related to parsing and pasting met inputs into multi-site runs. For example, site IDs don't seem to automatically paste in lat/longs.

 Using sobol method for sampling
Loading required package: PEcAn.SIPNET
2019-01-22 12:06:13 INFO   [PEcAn.workflow::run.write.configs] :
   ----- Writing model run config files ----
2019-01-22 12:06:13 INFO   [write.config.SIPNET] :
   Writing SIPNET configs with input
Error in gsub("@SITE_LAT@", settings$run$site$lat, jobsh) :
  invalid 'replacement' argument
  <run>
    <multisettings>
        <multisettings>run</multisettings>
    </multisettings>
    <settings.1>
        <site>
            <id>676</id>
            <met.start>2000/01/01</met.start>
            <met.end>2006/12/31</met.end>
            <name>US-WCr</name>
            <lat>45.805925</lat>
            <lon>-90.07961</lon>
        </site>
        <inputs>
            <met>
                <id>2000000236</id>
                <path>/data/home/sserbin/.pecan/dbfiles/CRUNCEP_SIPNET_site_0-676/CRUNCEP.2000-01-01.2006-12-31.clim</path>
            </met>
        </inputs>
        <start.date>2002/01/01</start.date>
        <end.date>2005/12/31</end.date>
    </settings.1>
    <settings.2>
        <site>
            <id>622</id>
            <met.start>2002/01/01</met.start>
            <met.end>2005/12/31</met.end>
            <name>US-Syv</name>
            <lat>46.242017</lat>
            <lon>-89.347567</lon>
        </site>
        <inputs>
            <met>
                <source>AmerifluxLBL</source>
                <output>SIPNET</output>
                <username>sserbin</username>
            </met>
        </inputs>
        <start.date>2002/01/01</start.date>
        <end.date>2005/12/31</end.date>
    </settings.2>
  </run>

results in

   growth_resp_factor
2019-01-22 13:54:51 INFO   [PEcAn.uncertainty::get.parameter.samples] :
   using 5004 samples per trait
2019-01-22 13:54:51 INFO   [PEcAn.uncertainty::get.parameter.samples] :
   Selected Quantiles:
   '0.001','0.023','0.159','0.5','0.841','0.977','0.999'
2019-01-22 13:54:52 INFO   [get.ensemble.samples] :
   Using sobol method for sampling
Loading required package: PEcAn.SIPNET
2019-01-22 13:54:53 INFO   [PEcAn.workflow::run.write.configs] :
   ----- Writing model run config files ----
2019-01-22 13:54:53 INFO   [write.config.SIPNET] :
   Writing SIPNET configs with input
Error in gsub("@SITE_LAT@", settings$run$site$lat, jobsh) :
  invalid 'replacement' argument
> settings$run$site$lat
NULL
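
The error follows directly from the `NULL` shown above: `gsub()` does not accept a `NULL` replacement. Below is a minimal sketch that reproduces the message outside PEcAn. The `jobsh` line, the nested list, and the `fill_site_lat` helper are all stand-ins made up for illustration; the list assumes the site information sits under `run$settings.1` rather than `run$site`, as in the XML above.

```r
# Stand-in for one line of the job.sh template (the real template is not shown here).
jobsh <- "export SITE_LAT=@SITE_LAT@"

# Stand-in for settings where site info sits under run$settings.1, not run$site.
settings <- list(run = list(settings.1 = list(site = list(lat = 45.805925))))

lat <- settings$run$site$lat  # NULL, because run has no direct "site" element

# gsub() rejects a NULL replacement, which reproduces the message in the log.
msg <- tryCatch(gsub("@SITE_LAT@", lat, jobsh),
                error = function(e) conditionMessage(e))
print(msg)

# Hypothetical helper (not part of PEcAn): fail early with a clearer message.
fill_site_lat <- function(jobsh, lat) {
  if (is.null(lat)) {
    stop("site lat is NULL; the multi-site <run> block may not have been expanded")
  }
  gsub("@SITE_LAT@", lat, jobsh)
}

print(fill_site_lat(jobsh, settings$run$settings.1$site$lat))
```

If this reading is right, the message points at settings that were never split into per-site `run$site` entries rather than at the job template itself.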

Sitegroups also fail to parse in the met paths, etc., during do.conversions.

 <run>
  <multisettings>
   <multisettings>run</multisettings>
  </multisettings>
  <sitegroup>
   <id>2000000005</id>
  </sitegroup>
  <site>
   <id>2000000005</id>
   <name>Kougarok (NGEE-Arctic)</name>
   <lat>65.163451</lat>
   <lon>-164.816948</lon>
  </site>
 </run>

other errors I have seen

   Finished Model Specific Conversion
2019-01-22 14:22:55 WARN   [PEcAn.DB::db.close] :
   Connection created outside of PEcAn.DB package
2019-01-22 14:22:55 WARN   [PEcAn.settings::papply] :
   papply threw an error for element 2 of 2, but is continuing since
   stop.on.error=FALSE (there will be no results for this element,
   however). Message was: 'Error in model.id[[i]] : subscript out of bounds
   '
2019-01-22 14:22:55 WARN   [PEcAn.settings::papply] :
   papply encountered errors for 1 elements, but continued since
   stop.on.error=FALSE. Element 2: 'Error in model.id[[i]] : subscript out
   of bounds '
>

more on this thread in Slack: https://pecanproject.slack.com/archives/G8S9VG292/p1548188341203600?thread_ts=1548184137.200600&cid=G8S9VG292

@serbinsh Before this, I have always run the sites individually to make sure they make sense, and then put them together like the XML I sent you in DM. But trying this now, it seems `do_conversion` is not ready to handle multi-settings. It crashes even on a multi-settings file that duplicates a single site: it always finishes the first setting and throws an error (regarding the status file!) when it starts the second setting.
serbinsh commented 5 years ago

@Hamze yeah, what is happening to me now is that putting

      <multisettings>run</multisettings>

outside the run tags, as you have, causes pecan to give me a pecan.CHECKED that has <pecan> as the open/close tags AND only the second site listed is kept in the run tags???!!! So weird.

On the other hand, I get different errors if I put the multisettings tags INSIDE <run>. I can't seem to get it to work; there seem to be some new issues when parsing settings in a multi-settings run.
para2x commented 5 years ago

@serbinsh I'm back at this issue to see what's going on, and I realized I cannot replicate it. I'm going to share a very simple xml file with you. Please go through a typical workflow using this xml file and see if you have any problems.

para2x commented 5 years ago
<?xml version="1.0"?>
<pecan.multi>
 <info>
    <notes></notes>
    <userid>-1</userid>
    <username></username>
    <date>2017/12/06 21:19:33 +0000</date>
  </info>
 <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met</outdir>
 <database>
  <bety>
   <user>bety</user>
   <password>bety</password>
   <host>128.197.168.114</host>
   <dbname>bety</dbname>
   <driver>PostgreSQL</driver>
   <write>FALSE</write>
  </bety>
  <dbfiles>/fs/data1/pecan.data/dbfiles</dbfiles>
 </database>
 <pfts>
  <pft>
   <name>temperate.broadleaf.deciduous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/1000025731/pft/temperate.broadleaf.deciduous</outdir>
   <posteriorid>1000012409</posteriorid>
  </pft>
    <pft>
   <name>boreal.coniferous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/796/pft/boreal.coniferous</outdir>
   <posteriorid>1000012646</posteriorid>
  </pft>
 </pfts>
 <meta.analysis>
    <iter>3000</iter>
    <random.effects>FALSE</random.effects>
  </meta.analysis>
 <ensemble>
   <size>50</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>lhc</method>
   </parameters>
   <met>
    <method>sampling</method>
    </met>
   </samplingspace>
  </ensemble>

 <model>
  <id>1000000022</id>
  <type>SIPNET</type>
  <revision>r136</revision>
  <delete.raw>FALSE</delete.raw>
  <binary>/fs/data5/pecan.models/SIPNET/trunk/sipnet_ssr</binary>
 </model>
 <workflow>
    <id>1000008768</id>
  </workflow>
 <run>
  <settings.1000025731>
    <site>
      <id>1000025731</id>
      <met.start>1962-01-01 00:00:00</met.start>
      <met.end>2010-12-31 00:00:00</met.end>
      <name>US-SSH</name>
      <lat>40.6658</lat>
      <lon>-77.904</lon>
    </site>
    <inputs>
   <met>
    <source>CRUNCEP</source>
    <output>SIPNET</output>
   </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </settings.1000025731>
  <settings.1000000048>
    <site>
      <id>1000000048</id>
      <met.start>2004-01-01 00:00:00</met.start>
      <met.end>2004-12-31 00:00:00</met.end>
      <name>US-CZ3</name>
      <lat>37.0678</lat>
      <lon>-119.1944</lon>
    </site>
       <inputs>
   <met>
    <source>CRUNCEP</source>
    <output>SIPNET</output>
   </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </settings.1000000048>
  <settings.763>
  <site>
   <id>763</id>
   <met.start>2004/01/01</met.start>
   <met.end>2004/12/31</met.end>
   <name>US-Me2</name>
   <lat>44.4524</lat>
   <lon>-121.557</lon>
  </site>
  <inputs>
   <met>
    <source>CRUNCEP</source>
    <output>SIPNET</output>
   </met>
  </inputs>
  <start.date>1980/01/01</start.date>
  <end.date>2010/12/31</end.date>
  </settings.763>

 </run>
 <host>
  <name>localhost</name>
  <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met/run</rundir>
  <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met/out</outdir>
 </host>
 <settings.info>
  <deprecated.settings.fixed>TRUE</deprecated.settings.fixed>
  <settings.updated>TRUE</settings.updated>
  <checked>TRUE</checked>
 </settings.info>
 <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met/run</rundir>
 <modeloutdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met/out</modeloutdir>
 <multisettings>run</multisettings>
</pecan.multi>
para2x commented 5 years ago

This is a very simple ensemble run for three sites (not even SDA), and the met for each site is defined like this:

   <met>
    <source>CRUNCEP</source>
    <output>SIPNET</output>
   </met>

and the multisettings tag like this: <multisettings>run</multisettings>. Let me know if this works for you.

para2x commented 5 years ago

This xml file worked without throwing any errors from the first line of workflow to fully running the model for all the sites.

para2x commented 5 years ago

I tested the site-group as well with this xml template:

<?xml version="1.0"?>
<pecan.multi>
 <info>
    <notes></notes>
    <userid>-1</userid>
    <username></username>
    <date>2017/12/06 21:19:33 +0000</date>
  </info>
 <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup</outdir>
 <database>
  <bety>
   <user>bety</user>
   <password>bety</password>
   <host>128.197.168.114</host>
   <dbname>bety</dbname>
   <driver>PostgreSQL</driver>
   <write>FALSE</write>
  </bety>
  <dbfiles>/fs/data1/pecan.data/dbfiles</dbfiles>
 </database>
 <pfts>
  <pft>
   <name>temperate.broadleaf.deciduous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/1000025731/pft/temperate.broadleaf.deciduous</outdir>
   <posteriorid>1000012409</posteriorid>
  </pft>
    <pft>
   <name>boreal.coniferous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/796/pft/boreal.coniferous</outdir>
   <posteriorid>1000012646</posteriorid>
  </pft>
 </pfts>
 <meta.analysis>
    <iter>3000</iter>
    <random.effects>FALSE</random.effects>
  </meta.analysis>
 <ensemble>
   <size>10</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>lhc</method>
   </parameters>
   <met>
    <method>sampling</method>
    </met>
   </samplingspace>
  </ensemble>
<sitegroup>
   <id>1000000022</id>
</sitegroup>
 <model>
  <id>1000000022</id>
  <type>SIPNET</type>
  <revision>r136</revision>
  <delete.raw>FALSE</delete.raw>
  <binary>/fs/data5/pecan.models/SIPNET/trunk/sipnet_ssr</binary>
 </model>
 <workflow>
    <id>1000008768</id>
  </workflow>
 <run>
<inputs>
  <met>
  <source>CRUNCEP</source>
  <output>SIPNET</output>
  </met>
  </inputs>
  <start.date>1980/01/01</start.date>
  <end.date>2010/12/31</end.date>
 </run>
 <host>
  <name>localhost</name>
  <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup/run</rundir>
  <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup/out</outdir>
 </host>
 <settings.info>
  <deprecated.settings.fixed>TRUE</deprecated.settings.fixed>
  <settings.updated>TRUE</settings.updated>
  <checked>TRUE</checked>
 </settings.info>
 <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup/run</rundir>
 <modeloutdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup/out</modeloutdir>
</pecan.multi>
para2x commented 5 years ago

What's different about this example is that the run looks like this:

  <inputs>
  <met>
  <source>CRUNCEP</source>
  <output>SIPNET</output>
  </met>
  </inputs>
  <start.date>1980/01/01</start.date>
  <end.date>2010/12/31</end.date>

I don't have the multisettings tag anymore, and instead I have the <sitegroup> tag like this:

<sitegroup>
   <id>1000000022</id>
</sitegroup>
serbinsh commented 5 years ago

@bailsofhay Can you try to run a multi-site run based on this example from @para2x

bailsofhay commented 5 years ago

@para2x I'm testing this with 1 site in a site group (Bailey_CMS_SDA). Shawn said the site_pft.csv file is used to identify which pft is used for a site when there is more than 1 site in a site group. How do I incorporate this file into the XML?

bailsofhay commented 5 years ago

For instance: I tested my one-site site group with 2 pfts listed in the XML, and it ran for both pfts, which shouldn't happen. When I add more than 1 site to the group and list more than 1 pft hardcoded in the XML, it runs all the pfts for each site instead of only the pft that was supposed to be assigned to that site.

para2x commented 5 years ago

In general, this issue has nothing to do with linking sites with PFTs. We add all the PFTs that might be used in writing the configs to our xml file so that pecan generates samples for all of them. Then the lookup table inside site_pft.csv is used in write.config to write PFT-specific configs for different sites. However, this issue is more concerned with the do_conversions function and whether it works for multisettings files. @bailsofhay this is my PR introducing the link between sites and PFTs: #2144. You can give it a look to get a better sense of how that works.

para2x commented 5 years ago

@bailsofhay Ignore my comment above! The mechanics are right, but I need to test something. I'll get back to you in a few minutes.

para2x commented 5 years ago

Ok, so what I said above is correct. In addition, the prepare.settings function checks for this tag:

<inputs>
    <pft.site>
  <path>site_pft.csv</path>
  </pft.site> 
...

When that tag is present, PEcAn (in prepare.settings) automatically adds a tag to each site tag expanded from the site group, specifying which PFT should be used for that site.

para2x commented 5 years ago

This xml worked without a problem for me. It creates a multisettings xml file from a sitegroup and adds the PFT defined for each site in site_pft.csv to the site tag. Make sure you have your csv next to the xml file.

<?xml version="1.0"?>
<pecan.multi>
 <info>
    <notes></notes>
    <userid>-1</userid>
    <username></username>
    <date>2017/12/06 21:19:33 +0000</date>
  </info>
 <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup_sitePFT</outdir>
 <database>
  <bety>
   <user>bety</user>
   <password>bety</password>
   <host>128.197.168.114</host>
   <dbname>bety</dbname>
   <driver>PostgreSQL</driver>
   <write>FALSE</write>
  </bety>
  <dbfiles>/fs/data1/pecan.data/dbfiles</dbfiles>
 </database>
 <pfts>
  <pft>
   <name>temperate.broadleaf.deciduous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/1000025731/pft/temperate.broadleaf.deciduous</outdir>
   <posteriorid>1000012409</posteriorid>
  </pft>
    <pft>
   <name>boreal.coniferous</name>
   <constants>
    <num>1</num>
   </constants>
   <outdir>/fs/data3/hamzed/MultiSite_Project/SimpleRun/796/pft/boreal.coniferous</outdir>
   <posteriorid>1000012646</posteriorid>
  </pft>
 </pfts>
 <meta.analysis>
    <iter>3000</iter>
    <random.effects>FALSE</random.effects>
  </meta.analysis>
 <ensemble>
   <size>10</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>lhc</method>
   </parameters>
   <met>
    <method>sampling</method>
    </met>
   </samplingspace>
  </ensemble>
<sitegroup>
   <id>1000000022</id>
</sitegroup>
 <model>
  <id>1000000022</id>
  <type>SIPNET</type>
  <revision>r136</revision>
  <delete.raw>FALSE</delete.raw>
  <binary>/fs/data5/pecan.models/SIPNET/trunk/sipnet_ssr</binary>
 </model>
 <workflow>
    <id>1000008768</id>
  </workflow>
 <run>
<inputs>
    <pft.site>
  <path>site_pft.csv</path>
  </pft.site> 
  <met>
  <source>CRUNCEP</source>
  <output>SIPNET</output>
  </met>
  </inputs>
  <start.date>1980/01/01</start.date>
  <end.date>2010/12/31</end.date>
 </run>
 <host>
  <name>localhost</name>
  <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup_sitePFT/run</rundir>
  <outdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup_sitePFT/out</outdir>
 </host>
 <settings.info>
  <deprecated.settings.fixed>TRUE</deprecated.settings.fixed>
  <settings.updated>TRUE</settings.updated>
  <checked>TRUE</checked>
 </settings.info>
 <rundir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup_sitePFT/run</rundir>
 <modeloutdir>/fs/data3/hamzed/Projects/MultiSite_Sandbox/MultiSite_met_sitegroup_sitePFT/out</modeloutdir>
</pecan.multi>
bailsofhay commented 5 years ago

@para2x this is what is working for me (minus adding in the site_pft.csv file). I'm still waiting for the job to finish, but it looks like it's working for a site group. My XML is a little different since we are running on MODEX.


<pecan>
  <!-- Main output directory for entire PEcAn workflow -->
  <outdir>/data/bmorrison/sda/site_group/test</outdir>

  <!-- define database connection.  Generally use "bety" as user -->
  <database>
    <bety>
      <user>bety</user>
      <password>bety</password>
      <host>localhost</host>
      <port>5432</port>
      <dbname>bety</dbname>
      <driver>PostgreSQL</driver>
      <write>FALSE</write>
    </bety>
    <dbfiles>/data/pecan_dbfiles</dbfiles>
  </database>

  <!-- define a PFT(s) for specified model (e.g. SIPNET) using those available in the BETYdb -->
  <pfts>
    <pft>
      <name>temperate.needleleaf.evergreen</name>
      <constants>
        <num>1</num>
      </constants>
    </pft>
    <pft>
      <name>temperate.broadleaf.deciduous</name>
      <constants>
        <num>1</num>
      </constants>
    </pft>
  </pfts>

  <!-- setup trait meta-analysis.  "random.effects" can be T/F.  Generally F but T if you want to include site effects" -->
  <meta.analysis>
    <iter>3000</iter>
    <!--<iter>100000</iter>-->
    <random.effects>FALSE</random.effects>
    <!--<random.effects>TRUE</random.effects>-->
  </meta.analysis>

  <!-- setup ensemble runs. these represent runs where we sample the full parameter-space posteriors to define each new ensemble -->
  <ensemble>
   <size>100</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>sobol</method>
   </parameters>
   <met>
       <method>sampling</method>
   </met>
   </samplingspace>
  </ensemble>

  <!-- setup SA runs -->
  <sensitivity.analysis>
    <quantiles>
      <sigma>-3</sigma>
      <sigma>-2</sigma>
      <sigma>-1</sigma>
      <sigma>0</sigma>
      <sigma>1</sigma>
      <sigma>2</sigma>
      <sigma>3</sigma>
    </quantiles>
    <variable>NPP</variable>
  </sensitivity.analysis>

  <!-- model tags, and options -->
  <model>
    <id>1000000014</id>
  </model>
<!-- run tags and options -->
  <run>
    <sitegroup>
      <id>2000000006</id>
      <met.start>1980/01/01</met.start>
      <met.end>2010/12/31</met.end>
    </sitegroup>
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>

  <!-- host tags -->
  <host>
    <name>localhost</name>
    <scratchdir>/scratch</scratchdir>
    <prerun>module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1</prerun>
    <!--<qsub>qsub -l walltime=36:00:00,nodes=2:ppn=10 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>-->
    <qsub>qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
    <!--<modellauncher>
        <binary>/data/sserbin/Modeling/pecan/contrib/modellauncher/modellauncher</binary>
        <qsub.extra>-l ncpus=10</qsub.extra>
    </modellauncher>-->
  </host>

  <email>
    <to>bmorrison@bnl.gov</to>
  </email>

</pecan>
para2x commented 5 years ago

@bailsofhay That's great that it is working. Please let me know if you can successfully test site_pft as well.

serbinsh commented 5 years ago

@para2x I don't think it's working as expected.

Bailey is running site group 2000000006, which is supposed to be "Bailey_CMS_SDA", where she is trying to test a multi-site run. PEcAn happily chugs along until model2netCDF, which fails because the run dates and input dates don't match. A good catch, because although the met file generated for the run at the site is named "CRUNCEP.1980-01-01.2010-12-31.clim" (the years requested), the content of the file is 1990-2006????

pecan.METProcess.xml.txt

note the site group is incorrectly named in the XML?

It's confusing this https://modex.bnl.gov/bety/inputs/2000000295 with the correct input, and that file has 1990-2006 for some reason (I have no idea why), despite its name saying 1980-2010.

So that incorrect met file was generated during the multi-site run, which means met process did not work properly in that workflow.
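
A mismatch like this can be caught before the model runs by comparing the dates in the file name with the years inside the file. The sketch below is only an illustration: `check_clim_years` is a hypothetical helper, not a PEcAn function, and it assumes the `.clim` file is whitespace-delimited with no header and the year in the second column. The file it reads is a tiny synthetic one, not the real input.

```r
# Hypothetical helper (not part of PEcAn): compare the date range encoded in a
# met file name with the years actually present in the file.
# Assumption: the file is whitespace-delimited, has no header, and year is column 2.
check_clim_years <- function(path) {
  parts <- strsplit(basename(path), ".", fixed = TRUE)[[1]]
  name_years <- as.integer(substr(parts[2:3], 1, 4))  # start and end year from the name
  clim <- utils::read.table(path, header = FALSE)
  file_years <- as.integer(range(clim[[2]]))          # first and last year in the data
  list(name_years = name_years,
       file_years = file_years,
       match = identical(name_years, file_years))
}

# Tiny synthetic file that mimics the mismatch described above.
tmp <- file.path(tempdir(), "CRUNCEP.1980-01-01.2010-12-31.clim")
writeLines(c("0 1990 1 0 0.125 -5.0", "0 2006 365 21 0.125 -3.0"), tmp)
res <- check_clim_years(tmp)
print(res$match)
```

A check of this kind only compares the first and last year, so it would not notice gaps inside the range.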

serbinsh commented 5 years ago

Here is my basic example that has the same behavior

<?xml version="1.0"?>
<pecan>
  <!-- Main output directory for entire PEcAn workflow -->
  <outdir>/data/sserbin/Modeling/sipnet/multi_site/testrun</outdir>

  <!-- define database connection.  Generally use "bety" as user -->
  <database>
    <bety>
      <user>bety</user>
      <password>bety</password>
      <host>localhost</host>
      <port>5432</port>
      <dbname>bety</dbname>
      <driver>PostgreSQL</driver>
      <write>FALSE</write>
    </bety>
    <dbfiles>/data/pecan_dbfiles</dbfiles>
  </database>

  <!-- define a PFT(s) for specified model (e.g. SIPNET) using those available in the BETYdb -->
  <pfts>
    <pft>
      <name>temperate.needleleaf.evergreen</name> 
      <constants>
        <num>1</num>
      </constants>
    </pft>
  </pfts>

  <!-- setup trait meta-analysis.  "random.effects" can be T/F.  Generally F but T if you want to include site effects" -->
  <meta.analysis>
    <iter>3000</iter>
    <!--<iter>100000</iter>-->
    <random.effects>FALSE</random.effects>
    <!--<random.effects>TRUE</random.effects>-->
  </meta.analysis>

  <!-- setup ensemble runs. these represent runs where we sample the full parameter-space posteriors to define each new ensemble -->
  <ensemble>
   <size>100</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>sobol</method>
   </parameters>
   <met>
       <method>sampling</method>
   </met>
   </samplingspace>
  </ensemble>

  <!-- setup SA runs -->
  <sensitivity.analysis>
    <quantiles>
      <sigma>-3</sigma>
      <sigma>-2</sigma>
      <sigma>-1</sigma>
      <sigma>0</sigma>
      <sigma>1</sigma>
      <sigma>2</sigma>
      <sigma>3</sigma>
    </quantiles>
    <variable>NPP</variable>
  </sensitivity.analysis>

  <!-- model tags, and options -->
  <model>
    <id>1000000014</id>
  </model>

  <!-- run tags and options -->
  <run>
    <sitegroup>
      <id>2000000006</id>
    </sitegroup> 
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>

  <!-- host tags --> 
  <host>
    <name>localhost</name>
    <scratchdir>/scratch</scratchdir>
    <prerun>module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1</prerun>
    <!--<qsub>qsub -l walltime=36:00:00,nodes=2:ppn=10 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>-->
    <qsub>qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
    <!--<modellauncher>
        <binary>/data/sserbin/Modeling/pecan/contrib/modellauncher/modellauncher</binary>
        <qsub.extra>-l ncpus=10</qsub.extra>
    </modellauncher>-->
  </host>

  <email>
    <to>sserbin@@bnl.gov</to>
  </email>

</pecan>
serbinsh commented 5 years ago

According to @para2x we had sitegroup tags in the wrong place.

moved them out here

<?xml version="1.0"?>
<pecan>
  <!-- Main output directory for entire PEcAn workflow -->
  <outdir>/data/sserbin/Modeling/sipnet/multi_site/testrun</outdir>

  <!-- define database connection.  Generally use "bety" as user -->
  <database>
    <bety>
      <user>bety</user>
      <password>bety</password>
      <host>localhost</host>
      <port>5432</port>
      <dbname>bety</dbname>
      <driver>PostgreSQL</driver>
      <write>FALSE</write>
    </bety>
    <dbfiles>/data/pecan_dbfiles</dbfiles>
  </database>

  <!-- define a PFT(s) for specified model (e.g. SIPNET) using those available in the BETYdb -->
  <pfts>
    <pft>
      <name>temperate.needleleaf.evergreen</name> 
      <constants>
        <num>1</num>
      </constants>
    </pft>
  </pfts>

  <!-- setup trait meta-analysis.  "random.effects" can be T/F.  Generally F but T if you want to include site effects" -->
  <meta.analysis>
    <iter>3000</iter>
    <!--<iter>100000</iter>-->
    <random.effects>FALSE</random.effects>
    <!--<random.effects>TRUE</random.effects>-->
  </meta.analysis>

  <!-- setup ensemble runs. these represent runs where we sample the full parameter-space posteriors to define each new ensemble -->
  <ensemble>
   <size>100</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>sobol</method>
   </parameters>
   <met>
       <method>sampling</method>
   </met>
   </samplingspace>
  </ensemble>

  <!-- setup SA runs -->
  <sensitivity.analysis>
    <quantiles>
      <sigma>-3</sigma>
      <sigma>-2</sigma>
      <sigma>-1</sigma>
      <sigma>0</sigma>
      <sigma>1</sigma>
      <sigma>2</sigma>
      <sigma>3</sigma>
    </quantiles>
    <variable>NPP</variable>
  </sensitivity.analysis>

  <!-- model tags, and options -->
  <model>
    <id>1000000014</id>
  </model>

  <sitegroup>
      <id>2000000006</id>
  </sitegroup> 

  <!-- run tags and options -->
  <run>
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>

  <!-- host tags --> 
  <host>
    <name>localhost</name>
    <scratchdir>/scratch</scratchdir>
    <prerun>module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1</prerun>
    <!--<qsub>qsub -l walltime=36:00:00,nodes=2:ppn=10 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>-->
    <qsub>qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
    <!--<modellauncher>
        <binary>/data/sserbin/Modeling/pecan/contrib/modellauncher/modellauncher</binary>
        <qsub.extra>-l ncpus=10</qsub.extra>
    </modellauncher>-->
  </host>

  <email>
    <to>sserbin@@bnl.gov</to>
  </email>

</pecan>

seems to be generating met drivers now

> remotefunc <- function() {PEcAn.data.atmosphere::download.CRUNCEP(site_id=1000004891, lat.in=37.00583, lon.in=-119.00602, model=NULL, scenario=NULL, ensemble_member=NULL, method=NULL, overwrite=FALSE, outfolder='/data/pecan_dbfiles/CRUNCEP_site_1-4891/', start_date='1980/01/01', end_date='2010/12/31')}
> remoteout <- remotefunc()
trying URL 'https://thredds.daac.ornl.gov/thredds/ncss/ornldaac/1220/mstmip_driver_global_hd_landwatermask_v1.nc4?var=land_water_mask&disableLLSubset=on&disableProjSubset=on&horizStride=1&accept=netcdf'
downloaded 262 KB

2019-03-11 12:32:56 INFO   [PEcAn.data.atmosphere::download.CRUNCEP] :
   Downloading /data/pecan_dbfiles/CRUNCEP_site_1-4891//CRUNCEP.1980.nc
2019-03-11 12:32:56 INFO   [PEcAn.data.atmosphere::download.CRUNCEP] :
   Attempting to access file at:
   https://thredds.daac.ornl.gov/thredds/dodsC/ornldaac/1220/mstmip_driver_global_hd_climate_tair_1980_v1.nc4
2019-03-11 12:33:04 INFO   [PEcAn.data.atmosphere::download.CRUNCEP] :
   Attempting to access file at:
   https://thredds.daac.ornl.gov/thredds/dodsC/ornldaac/1220/mstmip_driver_global_hd_climate_lwdown_1980_v1.nc4
2019-03-11 12:33:13 INFO   [PEcAn.data.atmosphere::download.CRUNCEP] :
serbinsh commented 5 years ago

Based on what is written here (https://pecanproject.github.io/pecan-documentation/develop/pecanXML.html#xml-multi-settings) and the back and forth on Slack, I am still not able to fix this error:

2019-03-12 12:25:41 INFO   [check.model.settings] :
   Setting model binary to /data/software/SIPNET/sipnet_r136/sipnet.r136
2019-03-12 12:25:41 INFO   [fn] : path
2019-03-12 12:25:41 INFO   [fn] : path
2019-03-12 12:25:41 INFO   [fn] :
   Missing optional input : poolinitcond
2019-03-12 12:25:41 INFO   [check.workflow.settings] :
   output folder = /data/sserbin/Modeling/sipnet/multi_site/testrun.9
2019-03-12 12:25:41 INFO   [check.settings] :
   Storing pft temperate.needleleaf.evergreen in
   /data/sserbin/Modeling/sipnet/multi_site/testrun.9/pft/temperate.needleleaf.evergreen
Error in attr(result, "settingType") <- "global" :
  attempt to set an attribute on NULL

The XML is pretty much the same as @para2x's, except that I didn't include the tags that pecan generates.

<?xml version="1.0"?>
<pecan.multi>
  <!-- Main output directory for entire PEcAn workflow -->
  <outdir>/data/sserbin/Modeling/sipnet/multi_site/testrun.10</outdir>

  <!-- define database connection.  Generally use "bety" as user -->
  <database>
    <bety>
      <user>bety</user>
      <password>bety</password>
      <host>localhost</host>
      <port>5432</port>
      <dbname>bety</dbname>
      <driver>PostgreSQL</driver>
      <write>FALSE</write>
    </bety>
    <dbfiles>/data/pecan_dbfiles</dbfiles>
  </database>

  <!-- define a PFT(s) for specified model (e.g. SIPNET) using those available in the BETYdb -->
  <pfts>
    <pft>
      <name>temperate.needleleaf.evergreen</name> 
      <constants>
        <num>1</num>
      </constants>
    </pft>
  </pfts>

  <!-- setup trait meta-analysis.  "random.effects" can be T/F.  Generally F but T if you want to include site effects" -->
  <meta.analysis>
    <iter>3000</iter>
    <random.effects>FALSE</random.effects>
  </meta.analysis>

  <!-- setup ensemble runs. these represent runs where we sample the full parameter-space posteriors to define each new ensemble -->
  <ensemble>
   <size>10</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>lhc</method>
   </parameters>
   <met>
       <method>sampling</method>
   </met>
   </samplingspace>
  </ensemble>

  <!-- model tags, and options -->
  <model>
    <id>1000000014</id>
  </model>

  <sitegroup>
      <id>2000000006</id>
  </sitegroup> 

  <!-- run tags and options -->
  <run>
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>

  <!-- host tags --> 
  <host>
    <name>localhost</name>
    <scratchdir>/scratch</scratchdir>
    <prerun>module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1</prerun>
    <qsub>qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
  </host>

  <email>
    <to>sserbin@@bnl.gov</to>
  </email>

</pecan.multi>

Still trying to work through it

error stack

> traceback()
11: `[[.MultiSettings`(item, setting, setAttributes = T)
10: item[[setting, setAttributes = T]]
9: listToXml.MultiSettings(settings, "pecan")
8: listToXml(settings, "pecan")
7: saveXML(listToXml(settings, "pecan"), file = pecanfile)
6: PEcAn.settings::write.settings(settings, outputfile = "pecan.CHECKED.xml") at workflow.R#65
5: eval(ei, envir)
4: eval(ei, envir)
3: withVisible(eval(ei, envir))
2: source("workflow.R")
1: source("workflow.R")
>

in here: https://github.com/PecanProject/pecan/blob/08b5c8a239b4dd4a2f79fa48d497742fee7bac99/base/settings/R/MultiSettings.R#L167

failing here: https://github.com/PecanProject/pecan/blob/08b5c8a239b4dd4a2f79fa48d497742fee7bac99/base/settings/R/MultiSettings.R#L92

putting this info here to keep track

serbinsh commented 5 years ago

Tried updating based on docs:

  <!-- run tags and options -->
  <run>
    <multisettings>
      <multisettings>run</multisettings>
      <multisettings>ensemble</multisettings>
    </multisettings> 
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>

still crashes in the same spot

serbinsh commented 5 years ago

Modified line 92 of MultiSettings.R to

##' @export
"[[.MultiSettings" <- function(x, i, collapse = TRUE, setAttributes = FALSE) {
  if (is.character(i)) {
    result <- lapply(x, function(y) y[[i]])
    if (collapse && .allListElementsEqual(result)) {
       result <- result[[1]]
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "global"
    #  }
    #} else {
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "multi"
    #  }
    #}
      if (setAttributes) {
        attr(result, "settingType") <- "multi"
      }
    }
    return(result)
  } else {
    NextMethod()
  }
} # "[[.MultiSettings"

Same error, but this time it complains about setting the attribute of something that is NULL to "multi":

Error in attr(result, "settingType") <- "multi" :
  attempt to set an attribute on NULL
> traceback()
10: `[[.MultiSettings`(item, setting, setAttributes = T)
9: item[[setting, setAttributes = T]]
8: listToXml.MultiSettings(settings, "pecan")
7: listToXml(settings, "pecan")
6: saveXML(listToXml(settings, "pecan"), file = pecanfile)
5: PEcAn.settings::write.settings(settings, outputfile = "pecan.CHECKED.xml") at workflow.R#65
4: eval(ei, envir)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("workflow.R")

I don't know enough about this chain of functions and what they are doing, so I may be reaching my limit in sorting out why this is failing.
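For reference, the error message itself is plain base R behaviour and can be reproduced outside PEcAn. The sketch below is only an illustration: `set_setting_type` is a hypothetical helper, not a PEcAn function, and the guard it shows is an assumption about one way to avoid the crash, not the project's fix.

```r
# Base R refuses to attach an attribute to NULL; capture the message
# instead of stopping the script.
result <- NULL
msg <- tryCatch({
  attr(result, "settingType") <- "multi"
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical helper (not part of PEcAn): tag a value only when it exists.
set_setting_type <- function(value, type) {
  if (!is.null(value)) {
    attr(value, "settingType") <- type
  }
  value
}

tagged <- set_setting_type("/some/outdir", "multi")   # attribute is attached
skipped <- set_setting_type(NULL, "multi")            # stays NULL, no error
```

A guard like this would only hide the symptom; it does not explain why the looked-up setting is NULL in the first place.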

serbinsh commented 5 years ago

Here is what my STATUS looks like

2019-03-12 12:47:22 ERROR
2019-03-12 13:04:06 ERROR

serbinsh commented 5 years ago

OK @bailsofhay @para2x here is some more:

I updated the function as:

##' @export
"[[.MultiSettings" <- function(x, i, collapse = TRUE, setAttributes = FALSE) {
  if (is.character(i)) {
    result <- lapply(x, function(y) y[[i]])
    PEcAn.logger::logger.debug(result)
    if (collapse && .allListElementsEqual(result)) {
       result <- result[[1]]
       PEcAn.logger::logger.debug(result)
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "global"
    #  }
    #} else {
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "multi"
    #  }
    #}
      if (setAttributes) {
         PEcAn.logger::logger.debug(setAttributes)
         attr(result, "settingType") <- "multi"
      }
    }
    return(result)
  } else {
    NextMethod()
  }
} # "[[.MultiSettings"

to get more feedback and this is the result

2019-03-12 13:08:24 INFO   [check.model.settings] :
   Setting model binary to /data/software/SIPNET/sipnet_r136/sipnet.r136
2019-03-12 13:08:24 INFO   [fn] : path
2019-03-12 13:08:24 INFO   [fn] : path
2019-03-12 13:08:24 INFO   [fn] :
   Missing optional input : poolinitcond
2019-03-12 13:08:24 INFO   [check.workflow.settings] :
   output folder = /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:08:24 INFO   [check.settings] :
   Storing pft temperate.needleleaf.evergreen in
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11/pft/temperate.needleleaf.evergreen
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] : NULL NULL NULL NULL
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] : TRUE
Error in attr(result, "settingType") <- "multi" :
  attempt to set an attribute on NULL
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   list(name = "localhost", scratchdir = "/scratch", prerun = "module load
   gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland
   hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3
   proj/5.1.0 gdal/2.3.1", qsub = "qsub -l walltime=36:00:00 -V -N @NAME@
   -o @STDOUT@ -e @STDERR@ -S /bin/bash", qsub.jobid =
   "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat = "qstat @JOBID@ || echo
   DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out")
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   localhost /scratch module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25
   python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540
   libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1 qsub -l walltime=36:00:00
   -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash
   ([[:digit:]]+)\.modex\.bnl\.gov qstat @JOBID@ || echo DONE
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11/run
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11/out
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   list(name = "localhost", scratchdir = "/scratch", prerun = "module load
   gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland
   hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3
   proj/5.1.0 gdal/2.3.1", qsub = "qsub -l walltime=36:00:00 -V -N @NAME@
   -o @STDOUT@ -e @STDERR@ -S /bin/bash", qsub.jobid =
   "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat = "qstat @JOBID@ || echo
   DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.11/out")
2019-03-12 13:08:24 DEBUG  [`[[.MultiSettings`] :
   localhost /scratch module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25
   python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540
   libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1 qsub -l walltime=36:00:00
   -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash
   ([[:digit:]]+)\.modex\.bnl\.gov qstat @JOBID@ || echo DONE
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11/run
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11/out
>

serbinsh commented 5 years ago

Confirmed: result is NULL inside the if (setAttributes) {} step, even though it appears to be populated just before that if statement??

2019-03-12 13:13:28 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.11
2019-03-12 13:13:28 DEBUG  [`[[.MultiSettings`] : NULL NULL NULL NULL
2019-03-12 13:13:28 DEBUG  [`[[.MultiSettings`] :
2019-03-12 13:13:28 DEBUG  [`[[.MultiSettings`] : setAttributes: TRUE
2019-03-12 13:13:28 DEBUG  [`[[.MultiSettings`] :
   setAttributes result:
Error in attr(result, "settingType") <- "multi" :
  attempt to set an attribute on NULL

serbinsh commented 5 years ago

/data/sserbin/Modeling/sipnet/multi_site/testrun.11 == result[[1]]

within

if (collapse && .allListElementsEqual(result)) {}

but empty in

      if (setAttributes) {
         PEcAn.logger::logger.debug(paste0("setAttributes: ",setAttributes))
         PEcAn.logger::logger.debug(paste0("setAttributes result: ",result))
         attr(result, "settingType") <- "multi"
      }
    }

serbinsh commented 5 years ago

      if (setAttributes) {
         PEcAn.logger::logger.debug(paste0("setAttributes: ",setAttributes))
         PEcAn.logger::logger.debug(paste0("setAttributes result: ",result))
         str(result)
         attr(result, "settingType") <- "multi"
      }
2019-03-12 13:16:27 DEBUG  [`[[.MultiSettings`] :
   setAttributes result:
 NULL
Error in attr(result, "settingType") <- "multi" :
  attempt to set an attribute on NULL

So you can see that result is NULL inside that if block.
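One possible reading of the debug output above is that the populated value and the NULL come from two separate lookups: one for a setting every site has, and one for a setting no site has. The sketch below illustrates that reading with plain lists. `all_elements_equal` is a stand-in written for this example (its behaviour is an assumption about what PEcAn's internal .allListElementsEqual does), and `lookup` and the setting names are hypothetical.

```r
# Stand-in for the internal equality check (assumed behaviour: TRUE when
# every element is identical to the first one).
all_elements_equal <- function(x) {
  length(x) <= 1 ||
    all(vapply(x, function(el) identical(el, x[[1]]), logical(1)))
}

# Two per-site settings lists sharing the same outdir.
sites <- list(
  list(outdir = "/tmp/run"),
  list(outdir = "/tmp/run")
)

# Same pattern as lapply(x, function(y) y[[i]]) in the function above.
lookup <- function(settings, name) lapply(settings, function(s) s[[name]])

present <- lookup(sites, "outdir")         # list of identical strings
absent <- lookup(sites, "not.a.setting")   # list of NULLs

# Both lists count as "all equal", so both would be collapsed to their
# first element; for the absent setting that element is NULL.
collapsed_present <- if (all_elements_equal(present)) present[[1]] else present
collapsed_absent <- if (all_elements_equal(absent)) absent[[1]] else absent
str(collapsed_present)
str(collapsed_absent)
```

If this reading is right, the attribute assignment would fail only for names that are missing from every site's settings, which matches the "NULL NULL NULL NULL" line in the log.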

serbinsh commented 5 years ago

It's clearly trying to do this:

> result = '/data/sserbin/Modeling/sipnet/multi_site/testrun.11'
> attr(result, "settingType") <- "multi"
> result
[1] "/data/sserbin/Modeling/sipnet/multi_site/testrun.11"
attr(,"settingType")
[1] "multi"
>

serbinsh commented 5 years ago

> dump.log
$`source("workflow.R")`
<environment: 0x29af420>

$`withVisible(eval(ei, envir))`
<environment: 0xcc73ff8>

$`eval(ei, envir)`
<environment: 0xcc73e38>

$`eval(ei, envir)`
<environment: R_GlobalEnv>

$`workflow.R#65: PEcAn.settings::write.settings(settings, outputfile = "pecan`
<environment: 0xcc79258>

$`saveXML(listToXml(settings, "pecan"), file = pecanfile)`
<environment: 0xcdc5858>

$`listToXml(settings, "pecan")`
<environment: 0xcdca000>

$`listToXml.MultiSettings(settings, "pecan")`
<environment: 0xcde2c50>

$`item[[setting, setAttributes = T]]`
<environment: 0xce2fb88>

$``[[.MultiSettings`(item, setting, setAttributes = T)`
<environment: 0xce2f840>

$`workflow.R#22: PEcAn.remote::kill.tunnel(settings)`
<environment: 0xd235990>

$`settings$host`
<environment: 0xd969820>

$``$.MultiSettings`(settings, "host")`
<environment: 0xd969580>

$`x[[i]]`
<environment: 0xd969430>

$``[[.MultiSettings`(x, i)`
<environment: 0xd969120>

$`PEcAn.logger::logger.debug(k)`
<environment: 0xda1ce30>

$`logger.message("DEBUG", msg, ...)`
<environment: 0xda1cc38>

attr(,"error.message")
[1] "Error in attr(result, \"settingType\") <- \"multi\" : \n  attempt to set an attribute on NULL\n"
attr(,"class")
[1] "dump.frames"

serbinsh commented 5 years ago

##' @export
"[[.MultiSettings" <- function(x, i, collapse = TRUE, setAttributes = FALSE) {
  if (is.character(i)) {
    result <- lapply(x, function(y) y[[i]])
    PEcAn.logger::logger.debug(result)
    if (collapse && .allListElementsEqual(result)) {
       result <- result[[1]]
       PEcAn.logger::logger.debug(result)
       PEcAn.logger::logger.debug("str(result)")
       str(result)
       #result2 <- result
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "global"
    #  }
    #} else {
    #  if (setAttributes) {
    #    attr(result, "settingType") <- "multi"
    #  }
    #}
      #if (setAttributes) {
      #   PEcAn.logger::logger.debug(paste0("setAttributes: ",setAttributes))
      #   PEcAn.logger::logger.debug(paste0("setAttributes result: ",result))
      #   #result <- '/data/sserbin/Modeling/sipnet/multi_site/testrun.12'
      #   str(result)
      #   attr(result, "settingType") <- "multi"
      #}
      attr(result, "settingType") <- "multi"
    }
    return(result)
> settings$host
2019-03-12 14:49:14 DEBUG  [`[[.MultiSettings`] :
   list(name = "localhost", scratchdir = "/scratch", prerun = "module load
   gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland
   hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3
   proj/5.1.0 gdal/2.3.1", qsub = "qsub -l walltime=36:00:00 -V -N @NAME@
   -o @STDOUT@ -e @STDERR@ -S /bin/bash", qsub.jobid =
   "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat = "qstat @JOBID@ || echo
   DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out")
2019-03-12 14:49:14 DEBUG  [`[[.MultiSettings`] :
   localhost /scratch module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25
   python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540
   libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1 qsub -l walltime=36:00:00
   -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash
   ([[:digit:]]+)\.modex\.bnl\.gov qstat @JOBID@ || echo DONE
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/run
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/out
2019-03-12 14:49:14 DEBUG  [`[[.MultiSettings`] : str(result)
2019-03-12 14:49:14 DEBUG  [`[[.MultiSettings`] : 1
List of 8
 $ name      : chr "localhost"
 $ scratchdir: chr "/scratch"
 $ prerun    : chr "module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 "| __truncated__
 $ qsub      : chr "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash"
 $ qsub.jobid: chr "([[:digit:]]+)\\.modex\\.bnl\\.gov"
 $ qstat     : chr "qstat @JOBID@ || echo DONE"
 $ rundir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run"
 $ outdir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out"
$name
[1] "localhost"

$scratchdir
[1] "/scratch"

$prerun
[1] "module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1"

$qsub
[1] "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash"

$qsub.jobid
[1] "([[:digit:]]+)\\.modex\\.bnl\\.gov"

$qstat
[1] "qstat @JOBID@ || echo DONE"

$rundir
[1] "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run"

$outdir
[1] "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out"

attr(,"settingType")
[1] "multi"

The above change adds the attribute to host, but the same crash still occurs:

2019-03-12 14:53:40 INFO   [fn] :
   Missing optional input : poolinitcond
2019-03-12 14:53:40 INFO   [check.workflow.settings] :
   output folder = /data/sserbin/Modeling/sipnet/multi_site/testrun.12
2019-03-12 14:53:40 INFO   [check.settings] :
   Storing pft temperate.needleleaf.evergreen in
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/pft/temperate.needleleaf.evergreen
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : str(result)
 chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12"
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : NULL NULL NULL NULL
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : str(result)
 NULL
Error in attr(result, "settingType") <- "multi" :
  attempt to set an attribute on NULL
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : str(result)
 chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12"
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   list(name = "localhost", scratchdir = "/scratch", prerun = "module load
   gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland
   hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3
   proj/5.1.0 gdal/2.3.1", qsub = "qsub -l walltime=36:00:00 -V -N @NAME@
   -o @STDOUT@ -e @STDERR@ -S /bin/bash", qsub.jobid =
   "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat = "qstat @JOBID@ || echo
   DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out")
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   localhost /scratch module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25
   python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540
   libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1 qsub -l walltime=36:00:00
   -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash
   ([[:digit:]]+)\.modex\.bnl\.gov qstat @JOBID@ || echo DONE
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/run
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/out
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : str(result)
List of 8
 $ name      : chr "localhost"
 $ scratchdir: chr "/scratch"
 $ prerun    : chr "module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 "| __truncated__
 $ qsub      : chr "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash"
 $ qsub.jobid: chr "([[:digit:]]+)\\.modex\\.bnl\\.gov"
 $ qstat     : chr "qstat @JOBID@ || echo DONE"
 $ rundir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run"
 $ outdir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out"
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   list(name = "localhost", scratchdir = "/scratch", prerun = "module load
   gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland
   hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3
   proj/5.1.0 gdal/2.3.1", qsub = "qsub -l walltime=36:00:00 -V -N @NAME@
   -o @STDOUT@ -e @STDERR@ -S /bin/bash", qsub.jobid =
   "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat = "qstat @JOBID@ || echo
   DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out") list(name =
   "localhost", scratchdir = "/scratch", prerun = "module load gcc/5.4.0
   jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540
   netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1",
   qsub = "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@
   -S /bin/bash", qsub.jobid = "([[:digit:]]+)\\.modex\\.bnl\\.gov", qstat
   = "qstat @JOBID@ || echo DONE", rundir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run", outdir =
   "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out")
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] :
   localhost /scratch module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25
   python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540
   libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1 qsub -l walltime=36:00:00
   -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash
   ([[:digit:]]+)\.modex\.bnl\.gov qstat @JOBID@ || echo DONE
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/run
   /data/sserbin/Modeling/sipnet/multi_site/testrun.12/out
2019-03-12 14:53:40 DEBUG  [`[[.MultiSettings`] : str(result)
List of 8
 $ name      : chr "localhost"
 $ scratchdir: chr "/scratch"
 $ prerun    : chr "module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 "| __truncated__
 $ qsub      : chr "qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash"
 $ qsub.jobid: chr "([[:digit:]]+)\\.modex\\.bnl\\.gov"
 $ qstat     : chr "qstat @JOBID@ || echo DONE"
 $ rundir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/run"
 $ outdir    : chr "/data/sserbin/Modeling/sipnet/multi_site/testrun.12/out"

serbinsh commented 5 years ago

It's as if the function is being run repeatedly: it crashes the first time but then works, and the error hangs the workflow. Still trying to understand how these functions are supposed to work.

serbinsh commented 5 years ago

@femeunier @tonygardella @robkooper @ashiklom is it possible that the XML library version running on modex is the issue, i.e. that it behaves differently from what is installed at BU/VM/Docker? Grasping at straws.

serbinsh commented 5 years ago

After a long debugging session, and with the help of @femeunier, it seems the issue was likely caused by a text editor introducing strange line endings or an extra character into the XML file. This could be an issue in other cases as well, so it is something to beware of.
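A quick way to check a settings file for this kind of problem is to look at its raw bytes. The following is a minimal base R sketch; `scan_settings_bytes` is a hypothetical helper written for this comment, not a PEcAn function, and the temporary file only stands in for a real pecan.xml.

```r
# Hypothetical helper: report byte-level oddities that a text editor can
# introduce (byte-order mark, carriage returns, non-ASCII or control bytes).
scan_settings_bytes <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  codes <- as.integer(bytes)
  list(
    has_bom = length(bytes) >= 3 &&
      identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF))),
    n_carriage_returns = sum(codes == 13L),
    n_non_ascii = sum(codes > 127L),
    # Control characters other than tab, line feed, and carriage return.
    n_control = sum(codes < 32L & !(codes %in% c(9L, 10L, 13L)))
  )
}

# Usage sketch: a small file deliberately written with Windows-style
# (CRLF) line endings.
tmp <- tempfile(fileext = ".xml")
writeBin(charToRaw("<pecan>\r\n  <outdir>/tmp/out</outdir>\r\n</pecan>\r\n"), tmp)
report <- scan_settings_bytes(tmp)
print(report)
```

Nonzero counts do not prove the file is the cause of a failure, but they are a cheap first thing to rule out before digging into the settings code.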

Here is my working XML pecan.xml.txt

<?xml version="1.0"?>
<pecan>
  <outdir>/data/sserbin/Modeling/sipnet/multi_site/testrun.17</outdir>
  <database>
    <bety>
      <user>bety</user>
      <password>bety</password>
      <host>localhost</host>
      <port>5432</port>
      <dbname>bety</dbname>
      <driver>PostgreSQL</driver>
      <write>FALSE</write>
    </bety>
    <dbfiles>/data/pecan_dbfiles</dbfiles>
  </database>
  <pfts>
    <pft>
      <name>temperate.needleleaf.evergreen</name> 
      <constants>
        <num>1</num>
      </constants>
    </pft>
  </pfts>
   <meta.analysis>
    <iter>3000</iter>
    <random.effects>FALSE</random.effects>
  </meta.analysis>
  <ensemble>
   <size>10</size>
   <variable>NPP</variable>
   <samplingspace>
   <parameters>
    <method>lhc</method>
   </parameters>
   <met>
       <method>sampling</method>
   </met>
   </samplingspace>
  </ensemble>
  <model>
    <id>1000000014</id>
  </model>
  <sitegroup>
      <id>2000000006</id>
  </sitegroup> 
    <run>
    <multisettings>
      <multisettings>run</multisettings>
      <multisettings>ensemble</multisettings>
    </multisettings>
    <inputs>
      <met>
         <source>CRUNCEP</source>
         <output>SIPNET</output>
      </met>
    </inputs>
    <start.date>1980/01/01</start.date>
    <end.date>2010/12/31</end.date>
  </run>
  <host>
    <name>localhost</name>
    <scratchdir>/scratch</scratchdir>
    <prerun>module load gcc/5.4.0 jags/4.3.0 udunits/2.2.25 python/2.7.14 redland hdf5/1.8.19-gcc540 netcdf/4.4.1.1-gnu540 libtiff/4.0.8 geos/3.6.3 proj/5.1.0 gdal/2.3.1</prerun>
    <qsub>qsub -l walltime=36:00:00 -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub>
    <qsub.jobid>([[:digit:]]+)\.modex\.bnl\.gov</qsub.jobid>
    <qstat>qstat @JOBID@ || echo DONE</qstat>
  </host>
  <email>
    <to>sserbin@bnl.gov</to>
  </email>
</pecan>

So I think I will close this now.

serbinsh commented 5 years ago

Fixed - see https://github.com/PecanProject/pecan/issues/2319