JGCRI / pygcam

Python tools for automating GCAM workflows and managing experiment configuration

basex->csv #23

Open huanglin6385 opened 3 years ago

huanglin6385 commented 3 years ago

Hello. I want to export query results from the GCAM output database to CSV. The BaseX database folder is 'output/database_basexdb/' and the query file is 'input/queries.xml'.

The code is as follows:

from pygcam.query import *

r = runBatchQuery(scenario='111', queryName='queries', queryPath='input/', outputDir='tf_test/', xmldb='output/database_basexdb/') 

The exception is as follows:

WARNING pygcam.query: runModelInterface called with both batchFile and csvFile; the latter is ignored.
InterfaceMain: batchFile: queries-111.csv
Running headless? true
java.lang.NullPointerException
        at ModelInterface.ConfigurationEditor.utils.FileUtils.loadDocument(FileUtils.java:301)
        at ModelInterface.InterfaceMain.main(InterfaceMain.java:192)

As I am new to pygcam, I cannot find a tutorial on using 'queries.xml' to query a BaseX database. Looking forward to your reply. Best wishes!

rjplevin commented 3 years ago

Sorry -- I initially misdiagnosed that. There was an error in the code that I'm fixing now. I'll push a new version of pygcam when I've debugged it.

However, you have called the function incorrectly. The queryName argument should be the name (the title attribute) of the query in the file you want to run, and the file can contain multiple queries. The queryPath can contain a delimited set of directories in which to look for queryName.xml or the name of an XML file containing queryName.

Still, using the function:

runMultiQueryBatch(scenario, queries, xmldb='', queryPath=None, outputDir=None,
                       miLogFile=None, regions=None, regionMap=None, rewriteParser=None,
                       batchFileIn=None, batchFileOut=None, noRun=False, noDelete=False)

which is used internally by pygcam's gt command, will work. It supports multiple queries in one invocation of ModelInterface, so it's more efficient. The other function (obviously) hasn't been used for a while.
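
To see which names are valid values for queryName, it can help to list the title attributes in the query file first. The following is a minimal sketch using only the Python standard library, not part of pygcam; the helper name list_query_titles and the inline sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Small inline sample in the style of a GCAM query file.
# For a real file, use ET.parse(path).getroot() instead.
SAMPLE = """<queries>
    <queryGroup name="markets and prices">
        <marketQuery title="oil price"/>
        <marketQuery title="oil supply"/>
    </queryGroup>
</queries>
"""

def list_query_titles(root):
    """Return the title attribute of every element that has one, in document order."""
    return [elem.get("title") for elem in root.iter() if elem.get("title") is not None]

root = ET.fromstring(SAMPLE)
titles = list_query_titles(root)
print(titles)
```

Any title printed this way is a candidate for queryName, with queryPath pointing at the XML file that contains it.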

rjplevin commented 3 years ago

I've pushed the bug fix in version 10.0.1 to git and the pip server. Thanks for reporting the error!

Note that your .pygcam.cfg file will need to be correct, since some parameter values are taken from there. Contact me by email (rich@plevin.com) if you want to follow up.

rjplevin commented 3 years ago

FWIW, here's my test program. This script must be run from the exe directory since it uses a relative path to the output directory.


from pygcam.query import runBatchQuery

runBatchQuery(scenario='base',
              queryName='oil_supply',
              queryPath='/tmp/elasticity_queries.xml',
              outputDir='/tmp',
              xmldb='../output/database_basexdb')
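
If you would rather not depend on the current working directory, one option is to build absolute paths from the GCAM root. This is a sketch under stated assumptions: the helper gcam_paths is hypothetical, the exe/ and output/database_basexdb layout is taken from the paths in this thread, and it presumes xmldb accepts an absolute path.

```python
from pathlib import Path

def gcam_paths(gcam_root):
    """Build absolute paths under a GCAM root directory.

    The layout (exe/ and output/database_basexdb) follows the paths used in
    this thread; adjust it if your installation differs.
    """
    root = Path(gcam_root).resolve()
    return {
        "exe": root / "exe",
        "xmldb": root / "output" / "database_basexdb",
    }

# Hypothetical root directory; replace with your own GCAM location.
paths = gcam_paths("/home/gcam/gcam-v4.4")
print(paths["xmldb"])
```

The resulting str(paths["xmldb"]) could then be passed as the xmldb argument instead of '../output/database_basexdb'.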

Here are the contents of elasticity_queries.xml:

<?xml version="1.0" encoding="UTF-8"?>
<queries>
    <queryGroup name="markets and prices">
        <marketQuery title="oil price">
            <axis1 name="market">market</axis1>
            <axis2 name="Year">market</axis2>
            <xPath buildList="true" dataName="price" group="false" sumAll="false">Marketplace/market[true() and contains(@name,'crude oil')]/price/node()</xPath>
        </marketQuery>

        <marketQuery title="oil supply">
            <axis1 name="market">market</axis1>
            <axis2 name="Year">market</axis2>
            <xPath buildList="true" dataName="supply" group="false" sumAll="false">Marketplace/market[true() and contains(@name, 'crude oil')]/supply/node()</xPath>
        </marketQuery>

        <marketQuery title="USA_ethanol_price">
            <axis1 name="market">market</axis1>
            <axis2 name="Year">market</axis2>
            <xPath buildList="true" dataName="price" group="false" sumAll="false">Marketplace/market[true() and @name='USAethanol']/price/node()</xPath>
        </marketQuery>

        <marketQuery title="corn_price">
            <axis1 name="market">market</axis1>
            <axis2 name="Year">market</axis2>
            <xPath buildList="true" dataName="price" group="false" sumAll="false">Marketplace/market[true() and ends-with(@name,'regional corn')]/price/node()</xPath>
        </marketQuery>

        <marketQuery title="oilcrop_price">
            <axis1 name="market">market</axis1>
            <axis2 name="Year">market</axis2>
            <xPath buildList="true" dataName="price" group="false" sumAll="false">Marketplace/market[true() and ends-with(@name,'regional oilcrop')]/price/node()</xPath>
        </marketQuery>

    </queryGroup>
</queries>

huanglin6385 commented 3 years ago

>>> from pygcam.query import runBatchQuery
>>> runBatchQuery(scenario='base',
...               queryName='Primary Energy Consumption (Average Fossil Efficiency Conversion)',
...               queryPath='//home/gcam/gcam-v4.4/output/queries/Main_queries.xml',
...               outputDir='/home/gcam/gcam-v4.4/output',
...               xmldb='/home/gcam/gcam-v4.4/output/database_basexdb')
InterfaceMain: batchFile: /tmp/tmpQ_FEbp.batch.xml
Running headless? true
ERROR; java.lang.NoClassDefFoundError: org/basex/core/Proc
java.lang.NoClassDefFoundError: org/basex/core/Proc
        at ModelInterface.ModelGUI2.DbViewer.runBatch(DbViewer.java:1853)
        at ModelInterface.InterfaceMain.runBatch(InterfaceMain.java:650)
        at ModelInterface.InterfaceMain.main(InterfaceMain.java:206)
Caused by: java.lang.ClassNotFoundException: org.basex.core.Proc
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 3 more
ERROR pygcam.query: Batch file '/tmp/tmp4nuxun.query.xml' failed. Deleting '/home/gcam/gcam-v4.4/output/Primary_Energy_Consumption_(Average_Fossil_Efficiency_Conversion)-base.csv'

This does not work for me.

So I tried another approach: I started a BaseX Docker container from https://github.com/BaseXdb/ and mounted my output database into it.

Could you give me a tutorial on how to translate the queries in Query.xml into BaseX query statements? I would be deeply grateful!

rjplevin commented 3 years ago

> Can you give me a tutorial about how to translate the Query.xml to basex query sentence

No, sorry. I have no experience with using BaseX with docker.

> ERROR; java.lang.NoClassDefFoundError: org/basex/core/Proc

This suggests that your class path is incorrect, since the BaseX jar file hasn't been found.
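
One way to check this locally is to scan the jar files in your GCAM installation for the missing class. The sketch below uses only the Python standard library; the function name jars_containing and the directory in the usage comment are assumptions, not part of pygcam or GCAM.

```python
import zipfile
from pathlib import Path

def jars_containing(class_name, jar_dir):
    """Return the jar files under jar_dir that contain the given class.

    class_name is in dotted form, e.g. 'org.basex.core.Proc'.
    """
    # Inside a jar, a class is stored as a path ending in .class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(jar)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return matches

# Hypothetical usage; point the second argument at the directory holding your jars:
# print(jars_containing("org.basex.core.Proc", "/path/to/gcam/libs"))
```

An empty result means no jar in that directory provides the class. A non-empty result identifies the jar that should appear on the class path used to launch ModelInterface.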

Unfortunately, the configuration of GCAM and pygcam is complex enough that it is virtually impossible to debug remotely based on such limited information. As I suggested earlier, please contact me via email to follow up.