Closed bryangoodrich closed 9 years ago
Great thinking - these would fit nicely into the `eemeter.importers` module, which has `import_green_button_xml(filename)`, etc.
Thoughts on the data model? I'm a bit reluctant to require a particular data model unless it's already a widely accepted standard. Perhaps an alternative would be to provide an option for column mapping?
E.g.
`import_xlsx(filename="...", mapping={"end_date": "Bill Date", "estimated": "Read Type", "usage": "Therms"})`
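To make the column-mapping idea concrete, here is a minimal sketch using only the standard library's `csv` module. The function name `read_mapped_rows` and the sample rows are hypothetical (nothing like this exists in eemeter as far as this thread shows); the `mapping` argument uses the same direction as the example above, internal field name to source column header.

```python
import csv
import io


def read_mapped_rows(fileobj, mapping):
    """Yield one dict per row, keyed by internal field names.

    `mapping` maps internal field name -> column header in the source file.
    Hypothetical helper for illustration; not part of eemeter.
    """
    reader = csv.DictReader(fileobj)
    headers = reader.fieldnames or []
    missing = [col for col in mapping.values() if col not in headers]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    for row in reader:
        # Keep only the mapped columns, renamed to the internal names.
        yield {field: row[col] for field, col in mapping.items()}


# Made-up sample data standing in for a utility bill export.
sample = io.StringIO(
    "Bill Date,Read Type,Therms\n"
    "2014-02-10,actual,25.0\n"
    "2014-01-14,actual,34.6\n"
)
mapping = {"end_date": "Bill Date", "estimated": "Read Type", "usage": "Therms"}
rows = list(read_mapped_rows(sample, mapping))
```

Values come back as strings here; converting them to numbers, dates, and booleans would be a separate step.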
On second thought, this is important enough for usability that it seems reasonable at first to temporarily impose a particular data model (and it may as well be simply the model we use internally, until a better suggestion comes along).
usage | unit_name | fuel_type | start | end | estimated |
---|---|---|---|---|---|
25.0 | therms | natural_gas | 2014-01-14 | 2014-02-10 | false |
34.6 | therms | natural_gas | 2013-12-16 | 2014-01-14 | false |
14.1 | therms | natural_gas | 2013-11-21 | 2013-12-16 | false |
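As a rough sketch of what reading this model could look like, the snippet below parses a CSV with exactly these six columns into typed records. The `Record` type and `parse_row` function are assumptions made for illustration, not eemeter's internal classes, and the date format is assumed to be `YYYY-MM-DD` as in the table.

```python
import csv
import io
from collections import namedtuple
from datetime import datetime

# Hypothetical record type mirroring the six columns in the table above.
Record = namedtuple("Record", "usage unit_name fuel_type start end estimated")


def parse_row(row):
    """Convert one row of strings into a typed Record."""
    return Record(
        usage=float(row["usage"]),
        unit_name=row["unit_name"],
        fuel_type=row["fuel_type"],
        start=datetime.strptime(row["start"], "%Y-%m-%d").date(),
        end=datetime.strptime(row["end"], "%Y-%m-%d").date(),
        # Assumes the column holds the literal strings "true" / "false".
        estimated=row["estimated"].strip().lower() == "true",
    )


text = (
    "usage,unit_name,fuel_type,start,end,estimated\n"
    "25.0,therms,natural_gas,2014-01-14,2014-02-10,false\n"
    "34.6,therms,natural_gas,2013-12-16,2014-01-14,false\n"
    "14.1,therms,natural_gas,2013-11-21,2013-12-16,false\n"
)
records = [parse_row(r) for r in csv.DictReader(io.StringIO(text))]
```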
I saw that you are able to pull in Green Button data, so I assume you are looking for a minimum data standard for the usage readings, correct? The HPXML schema could work. How will OEEM handle bulk/delivered fuels?
That's a good idea. I think the ConsumptionInfoType is what you're referring to? (http://hpxmlwg.github.io/hpxml/schemadoc/hpxml-2.0.0/index.html) I'm a bit familiar with the schema from writing a basic import function for HPXML - perhaps we could flatten the structure by combining ConsumptionType elements with the ConsumptionDetail elements, but keep all of the node names. That would set us up with something looking more like this:
Consumption | UnitofMeasure | FuelType | StartDateTime | EndDateTime | ReadingType |
---|---|---|---|---|---|
25.0 | therms | natural gas | 2014-01-14T00:00:00 | 2014-02-10T00:00:00 | total |
34.6 | therms | natural gas | 2013-12-16T00:00:00 | 2014-01-14T00:00:00 | estimated |
14.1 | therms | natural gas | 2013-11-21T00:00:00 | 2013-12-16T00:00:00 | total |
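If the flattened HPXML-style names were adopted for import, translating a row back to the internal columns might look roughly like the sketch below. The function name `hpxml_row_to_internal` is hypothetical, and two conversions are assumptions on my part: `ReadingType == "estimated"` is mapped to the boolean `estimated` flag, and spaces in `FuelType` are replaced with underscores to match the `natural_gas` spelling used earlier.

```python
from datetime import datetime

# Timestamp format assumed from the example table above.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def hpxml_row_to_internal(row):
    """Translate one flattened HPXML-style row into the internal column names."""
    return {
        "usage": float(row["Consumption"]),
        "unit_name": row["UnitofMeasure"],
        # Assumption: internal fuel types use underscores, e.g. "natural_gas".
        "fuel_type": row["FuelType"].replace(" ", "_"),
        "start": datetime.strptime(row["StartDateTime"], TIMESTAMP_FORMAT),
        "end": datetime.strptime(row["EndDateTime"], TIMESTAMP_FORMAT),
        # Assumption: any reading type other than "estimated" is not estimated.
        "estimated": row["ReadingType"] == "estimated",
    }


row = {
    "Consumption": "34.6",
    "UnitofMeasure": "therms",
    "FuelType": "natural gas",
    "StartDateTime": "2013-12-16T00:00:00",
    "EndDateTime": "2014-01-14T00:00:00",
    "ReadingType": "estimated",
}
record = hpxml_row_to_internal(row)
```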
I'm going to move the delivered fuels topic to another issue since it's a pretty big topic. See #80.
a9ca272a3102fcc42d798553f7ecd8d04b4a3d10 creates a CSV importer. Issues #81 and #82 have been created to track DB table imports and Excel imports.
Loading consumption data is not a terrible task, as shown in the tutorial. However, many internal data sources pull from structured (tabular) systems, even if that's just a file. eemeter could impose a data model that fits a consumption object, so that one can import a file, spreadsheet, or database connection directly into a consumption history object. Maybe it should really be multiple importers: `import_file(filename)`, `import_table(db connection)`, `import_xlsx(filename)`, all requiring the same data model. Businesses could then use this package directly from their internal systems.
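One way the "multiple importers, one data model" idea could be structured is sketched below: each importer reads from a different source but runs the same column check. This is only an illustration under assumptions. The signatures differ from the proposal (the file importer takes an open file object, the table importer takes a `sqlite3` connection and a table name), the required column names are borrowed from the internal model discussed in the replies, and none of this is eemeter's actual API.

```python
import csv
import io
import sqlite3

# Assumed shared data model: column names taken from the discussion in this thread.
REQUIRED = ("usage", "unit_name", "fuel_type", "start", "end", "estimated")


def _validate(columns):
    """Raise if any required column is absent; shared by every importer."""
    missing = [c for c in REQUIRED if c not in columns]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))


def import_file(fileobj):
    """Read rows from a CSV file object."""
    reader = csv.DictReader(fileobj)
    _validate(reader.fieldnames or [])
    return [{c: row[c] for c in REQUIRED} for row in reader]


def import_table(connection, table="consumption"):
    """Read rows from a database table; `table` must be a trusted name."""
    cursor = connection.execute('SELECT * FROM "{}"'.format(table))
    columns = [d[0] for d in cursor.description]
    _validate(columns)
    return [
        {c: value for c, value in zip(columns, row) if c in REQUIRED}
        for row in cursor.fetchall()
    ]


# In-memory database standing in for an internal system.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE consumption '
    '(usage, unit_name, fuel_type, "start", "end", estimated)'
)
conn.execute(
    "INSERT INTO consumption VALUES (?, ?, ?, ?, ?, ?)",
    (25.0, "therms", "natural_gas", "2014-01-14", "2014-02-10", 0),
)
from_db = import_table(conn)

from_csv = import_file(io.StringIO(
    "usage,unit_name,fuel_type,start,end,estimated\n"
    "25.0,therms,natural_gas,2014-01-14,2014-02-10,false\n"
))
```

Both calls return lists of dicts with the same keys, so downstream code that builds a consumption history would not need to know which source the rows came from.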