I often put metadata in a pandas Series rather than a dict because this gives me order. I also like the dot-style access, e.g. "meta.latitude".
As @bmu once mentioned, naming our new module io is probably asking for trouble because it conflicts with the built-in Python io module. I'm open to dataio, datareader, inputoutput, or other suggestions.
So far, we've only discussed creating a new module. @mikofski already complained in #235 that our modules are too long, and adding a bunch of new code to a unified io module might make it worse. I wonder if we might be better off in the long run if we instead create a subpackage comprised of one module per data type and pull the functions into the package-level namespace. This is similar to the way that the pandas io package is structured. So, we'd have something like
pvlib/
    dataio/
        __init__.py
        surfrad.py
        tmy.py
        ...
### pvlib/dataio/__init__.py ###
from pvlib.dataio.surfrad import read_surfrad
from pvlib.dataio.tmy import read_tmy2, read_tmy3
### usage ###
import pvlib
data, metadata = pvlib.dataio.read_surfrad(filename)
data2, metadata2 = pvlib.dataio.read_tmy2(filename2)
This subpackage structure might make for an easy to use API and a set of easy to read/maintain modules.
Probably a good idea to stick with the (data, metadata) pattern for new reader functions.
I cannot think of a downside to returning metadata as a Series, but my instinct is to leave it as a dict (possibly an OrderedDict) and let users convert it to a Series if they want to. I seem to remember that the pandas developers recommend against using Series as a dictionary, though I can't find the reference and could be making that up. All that being said, I'm not opposed to the change.
It’s a plan, although I prefer iotools to dataio.
Function pattern to be the same as pandas, i.e., read_<format>() and to_<format>() as needed.
I defer to people with more experience on the code organization, but I do suspect there will be a few potentially shared elements. What I could imagine--and would really like to have--is a kind of enhanced read_csv that can read extra information in the header and/or footer in addition to the regular columnar data. Many file formats could build on that just by setting parameters and translating column or key names. An option could be to return the header and/or footer as a text block to be parsed later.
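For illustration, a rough sketch of what such a layered reader could look like; the function name, parameters, and header handling are all illustrative assumptions here, not an agreed design:

import pandas as pd

def read_with_header(filename, n_header_lines, parse_header=None, **kwargs):
    """Sketch: split a file into a free-form header block and columnar data.

    The header is returned as raw text, or passed through a caller-supplied
    parser; the columnar part is handed to pandas.read_csv.
    """
    with open(filename, 'r') as f:
        header = ''.join(f.readline() for _ in range(n_header_lines))
        data = pd.read_csv(f, **kwargs)  # read_csv continues from the current file position
    metadata = parse_header(header) if parse_header else header
    return data, metadata

A format-specific reader could then be little more than a call to such a helper with a fixed header length and a column-name translation.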
@adriesse do you have an example file & code?
How would you implement this for the file irradiation-0e2a19f2-abe7-11e5-a880-5254002dbd9b.csv?
In most cases, the input will require adapting the date parser to the specific data format. That usually gets you going. Or would the scope also include quality checks or reformatting of the data? I am still not clear on what the desired output is.
In the latter case, what do we do with additional variables, e.g. in spectral datasets or in BSRN files?
Hi all, sorry I haven't chimed in. I already have tmy2 and tmy3 readers. I'll send them when I get in.
As for the dataio module and namespace question: you could move irradiance.py to a folder called irradiance and put an __init__.py in it, then import the symbols that were originally imported directly from irradiance.py but are now in modules like irradiance/spa.py. pvlib.data.io can't be confused with the built-in io because of the long name that precedes it.
Mark, I am confused about your tmy readers. Do they do the same thing as pvlib's existing tmy readers?
Let's take things one step at a time and have Cliff just make us a dataio/iotools subpackage for the IO capabilities. I have been thinking about larger reorganizations, though, and will eventually comment more in #235.
If necessary, the subpackage could have a core.py or its own tools.py module for code that is reused throughout the package.
I want to keep the bar for contributing to pvlib as low as possible, so it's fine with me if different modules do different amounts of processing to their respective data formats. Some might be 1000-line monsters that do QA and return data that is in "pvlib conventions," while others might not do much more than call pd.read_csv with a couple of arguments that are unique to that data format.
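To make that concrete, a minimal sketch of such a thin wrapper; the format name, separator, and column names are invented for illustration:

import pandas as pd

def read_someformat(filename):
    """Hypothetical minimal reader: the format-specific part is only a few
    read_csv arguments plus a rename to pvlib variable names."""
    data = pd.read_csv(filename, skiprows=2, sep=';',
                       index_col=0, parse_dates=True)
    data = data.rename(columns={'G_Gh': 'ghi', 'G_Bn': 'dni', 'G_Dh': 'dhi'})
    metadata = {}  # this hypothetical format carries no per-file metadata
    return data, metadata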
@wholmgren, sorry I misunderstood the intention of the proposed dataio, and I also wasn't aware of the readtmy2 and readtmy3 functions in tmy.py. So please disregard (2) in my comment above.
I agree with you that the bar for contributions should be low. Your ideas for core.py and tools.py sound good, although if I understand correctly (?) that approach may break some code if it shifts things from, say, pvlib.irradiance to pvlib.core.irradiance, so that's something to consider. IMHO changing the top-level modules to subpackages would preserve the current API but allow more flexibility. See my comment on #235.
Cliff, some ideas for formats that could be read are:
There is also the pvsystem.retrieve_sam method; similar to the TMY methods, would you relocate it to dataio?
@cwhanse / @mikofski many items from the last comments are already covered in https://github.com/pvlib/pvlib-python/issues/29. Are they complementary? Should the other issue be closed?
I wonder if there are any observations regarding the comments in https://github.com/pvlib/pvlib-python/issues/261#issuecomment-260628285
Otherwise, who will provide a prototype, or should each of us open a PR with the data set reader they have implemented?
@cwhanse please have a look at the PR
The data is here:
cd MY-GITHUB-CLONES
git clone https://github.com/dacoex/pvlib_data
@cwhanse & @adriesse my refactored PRs are up:
- maccrad #279 https://github.com/pvlib/pvlib-python/pull/279
- pvsyst #280 https://github.com/pvlib/pvlib-python/pull/280
@wholmgren seems to lean towards keeping the iotools simple, as a customised wrapper for pandas.read_csv.
Shall iotools/util also provide functions to:
- read metadata, usually in the lines before the column header: coordinates, time reference?
- localise the UTC-based timeseries as suggested in https://github.com/pvlib/pvlib-python/pull/270#issuecomment-264102908?
So what shall the outputs be?
- a dataframe with the raw data?
- a dataframe with renamed columns to match pvlib convention?
- a metadata dictionary?
- a Location with timezone, which usually involves retrieving this info either from geonames or using an additional package?
- a dataframe with renamed columns to match pvlib convention & a localised index?
I personally prefer to have the library do as much as possible:
- read data
- reformat data
- prepare a Location
- localise data
I am looking forward to your feedback & will then modify my PRs according to what seems to be the consensus.
I understand why you would like to handle the metadata with this library.
I wonder if the problem is that not all file types include all metadata? For example the PVsyst output file doesn't include any timezone or lat/long information, because it just references a .SIT file that includes that information. Not sure how to handle this other than to have the user define a Location separately.
I wonder if the problem is that not all file types include all metadata?
Yes. But most scientific providers at least include information on the time reference, i.e. UTC. So with this and the coordinates, we could derive the timezone and build the Location for the typical input meteo files. The timezone is derived either by a web query to geonames or by local libraries.
For example the PVsyst output file doesn't include any timezone or lat/long information, because it just references a .SIT file that includes that information. Not sure how to handle this other than to have the user define a Location separately.
Correct. Actually, in the case of PVsyst, I would assume that a Location already exists, because PVsyst hourly output usually does not include GHI. I would assume that a pvlib user uses this data to compare the results of both modelling environments.
So my current proposal for addressing this in iotools:
The standard output of a pvlib.iotools.FORMAT reader would be a tuple with:
- a dataframe with the raw data (for comparison & debugging)
- a dataframe with renamed columns to match pvlib convention
- a metadata dictionary, as suggested by @wholmgren in https://github.com/pvlib/pvlib-python/pull/279#discussion_r90881672
pvlib.iotools.util to include some tools:
- an optional tool to retrieve the timezone
- an optional tool to localise a dataframe
- further functions, e.g. a checker that flags a timezone issue if the radiation starts before sunrise
So depending on the dataset, the user could employ the functions in util for more convenience. The capabilities of each format reader could be documented in its docstring.
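A minimal sketch of what that proposed reader contract could look like; every name and parameter below is an assumption for illustration, not an agreed pvlib API:

import pandas as pd

def read_format(filename):
    """Hypothetical pvlib.iotools reader returning the proposed tuple."""
    raw = pd.read_csv(filename, skiprows=10)  # raw columns, for comparison & debugging
    data = raw.rename(columns={'GHI': 'ghi', 'DNI': 'dni', 'DHI': 'dhi'})
    metadata = {'latitude': None, 'longitude': None, 'tz': None}  # parsed from the file header
    return raw, data, metadata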
Yes, having tools available for the user to apply sounds like the right approach.
Try tzwhere to determine the timezone from coordinates:
from tzwhere import tzwhere
from datetime import datetime
import pytz

# timezone lookup, force nearest tz for coords outside of polygons
WHERETZ = tzwhere.tzwhere(shapely=True, forceTZ=True)
# daylight savings time (DST) in northern hemisphere starts in March and ends
# in November and the opposite in southern hemisphere
JAN1 = datetime(2016, 1, 1)  # date with standard time in northern hemisphere
JUN1 = datetime(2016, 6, 1)  # date with standard time in southern hemisphere

# notes and links on Python datetime tzinfo dst:
# http://stackoverflow.com/questions/17173298/is-a-specific-timezone-using-dst-right-now
# http://stackoverflow.com/questions/19774709/use-python-to-find-out-if-a-timezone-currently-in-daylight-savings-time
# methods for determining DST vs. STD:
# tz.localize(JAN1).timetuple().tm_isdst -> boolean, True if DST, False if STD
# tz.localize(JAN1).dst() -> timedelta, 0 if STD
# tz.dst(JAN1) -> timedelta, 0 if STD

def tz_latlon(lat, lon):
    """
    Timezone from latitude and longitude.

    :param lat: latitude [deg]
    :type lat: float
    :param lon: longitude [deg]
    :type lon: float
    :return: UTC offset in hours and, if available, the tzinfo object
    :rtype: tuple
    """
    # get name of time zone using tzwhere, force to nearest tz
    tz_name = WHERETZ.tzNameAt(lat, lon, forceTZ=True)
    # check if coordinates are over international waters
    if not tz_name or tz_name in ('uninhabited', 'unknown'):
        # coordinates over international waters only depend on longitude;
        # no tzinfo object is available, so return None in its place
        return lon // 15.0, None
    tz_info = pytz.timezone(tz_name)  # get tzinfo
    # pick a date in standard (non-DST) time for this hemisphere
    tz_date = JAN1  # standard time in northern hemisphere
    if tz_info.dst(tz_date):
        # if the DST timedelta is not zero, it must be the southern hemisphere
        tz_date = JUN1  # standard time in southern hemisphere
    tz_str = tz_info.localize(tz_date).strftime('%z')  # offset as +HHMM/-HHMM
    # convert the offset string to float hours, handling the sign so that
    # half-hour offsets like -0330 become -3.5, not -2.5
    sign = -1.0 if tz_str.startswith('-') else 1.0
    return sign * (float(tz_str[1:3]) + float(tz_str[3:]) / 60.0), tz_info

if __name__ == "__main__":
    # test tz_latlon at San Francisco (GMT-8.0)
    gmt, tz_info = tz_latlon(37.7, -122.4)
    assert gmt == -8.0
    assert tz_info.zone == 'America/Los_Angeles'
    assert tz_info.utcoffset(JAN1).total_seconds()/3600.0 == -8.0
    assert tz_info.utcoffset(JUN1).total_seconds()/3600.0 == -7.0
    # New Delhi, India (GMT+5.5)
    gmt, tz_info = tz_latlon(28.6, 77.1)
    assert gmt == 5.5
    assert tz_info.zone == 'Asia/Kolkata'
    assert tz_info.utcoffset(JAN1).total_seconds()/3600.0 == 5.5
    assert tz_info.utcoffset(JUN1).total_seconds()/3600.0 == 5.5
    # also works with integers (e.g. Santa Cruz, CA)
    gmt, tz_info = tz_latlon(37, -122)
    assert gmt == -8.0
    assert tz_info.zone == 'America/Los_Angeles'
You can call this like in the script:
gmt, tz_info = tz_latlon(37, -122)
# gmt: -8.0
# tz_info: <DstTzInfo 'America/Los_Angeles' LMT-1 day, 16:07:00 STD>
Also see: Difference between timezones America/Los_Angeles and US/Pacific and PST8PDT?
Warning: please do not use any of the timezones listed there under "other timezones" (besides UTC); they only exist for backward-compatibility reasons and may expose erroneous behavior.
Note: using Shapely adds some overhead and loading is slightly slower, but it is more accurate, especially near shorelines.
For the reverse lookup, geopy with ESRI ArcGIS works great; registration for an ESRI developer or ArcGIS public account is free. Mapquest also has free developer accounts and can be used with geopy. There are several other geocoding services as well, such as Google's.
@jforbess so I conclude you agree with the structure described above in issuecomment-265110051?
@mikofski Thanks for the code. I used a simpler library in PR #279 (iotools: reader for maccrad).
But @wholmgren did not like the addition of an external dependency: issuecomment-264486119
This is why I propose above to make it optional.
Well, if there are no further ideas or suggestions, I will revise the PR according to the discussion.
@dacoex, yes, and I see that @wholmgren didn't like an iotools/util function to localize tz, instead recommending the pandas function, which I understand. But part of me thinks that having a function in the API will help users be consistent in their usage. This may be a philosophical argument?
@dacoex I like timezonefinder; please use the PyPI reference, not GitHub.
I disagree with @wholmgren in #274; IMO it's fine to have dependencies as long as they are mature, well documented, and widely used, which timezonefinder is: it's on PyPI, v1.5.7 has over 2000 downloads, it's based on tzwhere (the previous best choice), and it has been recently updated with a steady record of releases.
IMHO using open source is one of the reasons to use Python. PyPI is one of the reasons Python is so powerful. Adding dependencies instead of rolling your own makes your code stronger.
Perhaps you can selectively let users use (or not) the import by putting it in a try/except block.
Although, if possible, perhaps it's good to try to make the dependency arbitrary, so you could switch it later. For example:

def get_latlon(lat, lon, method=None, **kw):
    """Look up the timezone for a coordinate with a pluggable backend."""
    if method is None:
        try:
            from timezonefinder import TimezoneFinder
        except ImportError:
            # crude fallback: nominal offset from the longitude alone
            return lon // 15.0
        else:
            # timezonefinder returns an IANA timezone name string
            return TimezoneFinder().timezone_at(lat=lat, lng=lon, **kw)
    else:
        return method(lat, lon, **kw)
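Usage would then look like this, assuming timezonefinder is installed (otherwise the crude longitude fallback kicks in):

tz_name = get_latlon(37.7, -122.4)  # -> 'America/Los_Angeles'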
Thanks to all for the feedback. I will propose an improved version and also add the Solargis format. Just give me some time, because I just got loaded with a bit of work.
Here is another thought:
There are actually two kinds of metadata in your maccrad file: per-file metadata and per-column metadata (in this case descriptions and units). I have seen this in some other file formats as well, and it could be worth reading and returning them separately.
So what shall be the outputs?
a dataframe with the raw data?
I don't know what you mean by raw data in this context. The columns are unchanged?
a dataframe with renamed columns to match pvlib convention?
Probably, though would be great if this was an option with a rename=True default. Not sure if we should also change the existing tmy readers to be more consistent.
a metadata dictionary?
readtmy2 and readtmy3 return a (data, metadata) tuple. Seems reasonable to me. Has anyone been unhappy with this in the past or wished that those readers did more?
a location...
I strongly oppose this. I think it is essential that pvlib's class layer lives above pvlib's functional layer, though I will take all the blame if the distinction or the reasons for it are unclear. Returning a Location object would ruin the distinction. As discussed elsewhere, you could instead add a Location.from_maccrad(metadata) method that does the job while retaining a clean separation among modules and between the class/functional layers. See Location.from_tmy for inspiration. Location.from_maccrad(file) could even be possible.
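A rough sketch of that pattern, written as a plain function for brevity; in pvlib it would be a classmethod on Location, and the metadata keys used here are assumptions about the maccrad format:

from pvlib.location import Location

def from_maccrad(metadata):
    """Hypothetical constructor mirroring Location.from_tmy."""
    return Location(metadata['latitude'], metadata['longitude'],
                    tz=metadata.get('tz', 'UTC'),
                    altitude=metadata.get('altitude', 0),
                    name=metadata.get('name'))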
a dataframe with renamed columns to match pvlib convention & localised index?
I think that an IO reader should return a dataframe that is localized to the format of the timestamps of the file and/or the timezone specific metadata of the file. That is: if the timestamps say MST or -0700 then the dataframe should be localized accordingly, or, if the timestamps are ambiguous but the metadata says e.g. tz=MST then the dataframe should be localized. pvlib should not guess at the localization of the data.
Fine with me if you want to add a Location.guess_tz or similar method to Location. Within that method, you can import and run the optional library of your choosing or make a request to the Google API.
pvlib.iotools.util to include some tools: optional tool to retrieve the timezone
I can see an argument for Location.guess_tz() with no arguments, since it would look up its own self.latitude and self.longitude. Otherwise, I think it's better for users to make their own call to a library that gets the tz with one simple line of code.
optional tool to localise dataframe
I'm confused about the arguments surrounding creating our own localize/convert functions. pandas has a tz_localize method and a tz_convert method. They do exactly what their names suggest. I don't know how we can improve upon that.
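For reference, a self-contained example of those two pandas calls (the timezone names are arbitrary):

import pandas as pd

index = pd.date_range('2016-01-01 00:00', periods=3, freq='h')
df = pd.DataFrame({'ghi': [0.0, 0.0, 10.0]}, index=index)
df_mst = df.tz_localize('Etc/GMT+7')  # declare the naive stamps to be UTC-7 (MST)
df_utc = df_mst.tz_convert('UTC')     # same instants, relabeled in UTC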
further functions, e.g. checker if the radiation starts before sunrise to inform that there is a timezone issue
Sounds complicated to do reliably and probably unnecessary.
See Location.from_tmy for inspiration. Location.from_maccrad(file) could even be possible.
OK.
I don't know what you mean by raw data in this context. The columns are unchanged?
Yes, just the result of read_csv.
I can see an argument for Location.guess_tz() with no arguments since it would look up its own self.latitude and self.longitude.
Where would that be placed? If put in location.py, it would add more overhead to that module. If kept in iotools/util, it could be an optional module, i.e. not loaded by default by iotools/api. I still prefer to include a few shortcut functions in iotools/util, at least those which make sense.
Otherwise, I think it's better for users to make their own call to a library that gets the tz with one simple line of code.
Even the reader could be written by the user, but isn't this library also here to make life simpler? If we are guessing the tz for most of the files, it could be generalised into a common function to reduce the amount of copy & paste code. I would prefer to let the software do whatever can be done automatically; the result has to be verified anyway.
pandas has a tz_localize method and a tz_convert method. They do exactly what their names suggest. I don't know how we can improve upon that.
Well, I took mine from the pandas docs, but maybe they've changed that API again?
@adriesse I suggest adding the column metadata to the docstring, as in https://github.com/pvlib/pvlib-python/blob/io/pvlib/iotools/tmy.py#L73, since it is not used for further calculations.
Summarising, the next revision will be structured more closely like the existing tmy readers. I totally overlooked those in my initial code; sorry if that spurred unnecessary discussion, but it seems to have helped us reach a common understanding about this module.
As a lesson learned, we could add a specification and some instructions on how to add a new reader to the docs.
further functions, e.g. checker if the radiation starts before sunrise to inform that there is a timezone issue
Sounds complicated to do reliably and probably unnecessary.
This may be unnecessary for standard files, but I have been wanting it for all of the data that I get from SCADA systems. Just found a system that didn't apply Daylight Savings on the actual day, but a week early. But only in the spring. And only the first two years. Otherwise, it had the right alignment with Daylight Savings.
I spent a lot of time wrapping my head around the right way to handle timezones because of daylight savings and the fact that my data sometimes comes with timestamps from a timezone that it is not located in. (A client pulls data in US/Eastern for a plant that is in California. The SCADA thinks it is doing the right thing, maybe, because it is relative to where the data is being queried, but it is not the right thing at all.)
But I admit, this shouldn't be an issue for any standard file that gets a standard reader. But it is critical if anyone tries to generalize iotools for a somewhat standard csv.
Yes, @jforbess is right. This also applies to most datalogger files...
This issue seems a bit outdated, but anyway, I have just written a short function to read EPW weather files from EnergyPlus. Note that the output is not harmonized with the output of the TMY functions. Feel free to re-use it or integrate it into the library:
import pandas as pd
import pytz

def readepw(filename=None):
    '''
    Reads an EPW file into a pandas dataframe.

    Function tested with EnergyPlus weather data files:
    https://energyplus.net/weather

    Parameters
    ----------
    filename : None or string
        If None, attempts to use a Tkinter file browser. A string can be
        a relative file path, absolute file path, or url.

    Returns
    -------
    Tuple of the form (data, metadata).

    data : DataFrame
        A pandas dataframe with the columns described in the table below.
    metadata : dict
        The site metadata available in the file.

    Notes
    -----
    The returned structures have the following fields.

    =======================================================================
    Data field
    =======================================================================
    Datetime data
    Dry bulb temperature in Celsius at indicated time
    Dew point temperature in Celsius at indicated time
    Relative humidity in percent at indicated time
    Atmospheric station pressure in Pa at indicated time
    Extraterrestrial horizontal radiation in Wh/m2
    Extraterrestrial direct normal radiation in Wh/m2
    Horizontal infrared radiation intensity in Wh/m2
    Global horizontal radiation in Wh/m2
    Direct normal radiation in Wh/m2
    Diffuse horizontal radiation in Wh/m2
    Averaged global horizontal illuminance in lux during minutes preceding the indicated time
    Direct normal illuminance in lux during minutes preceding the indicated time
    Diffuse horizontal illuminance in lux during minutes preceding the indicated time
    Zenith luminance in Cd/m2 during minutes preceding the indicated time
    Wind direction at indicated time. N=0, E=90, S=180, W=270
    Wind speed in m/s at indicated time
    Total sky cover at indicated time
    Opaque sky cover at indicated time
    Visibility in km at indicated time
    Ceiling height in m
    Present weather observation
    Present weather codes
    Precipitable water in mm
    Aerosol optical depth
    Snow depth in cm
    Days since last snowfall
    Albedo
    Liquid precipitation depth in mm at indicated time
    Liquid precipitation quantity
    =======================================================================

    =============== ====== ===================
    key             format description
    =============== ====== ===================
    altitude        Float  site elevation
    latitude        Float  site latitude
    longitude       Float  site longitude
    Name            String site name
    State           String state
    TZ              Float  UTC offset
    USAF            Int    USAF identifier
    =============== ====== ===================

    S. Quoilin, October 2017
    '''
    def _interactive_load():
        import tkinter
        from tkinter.filedialog import askopenfilename
        tkinter.Tk().withdraw()  # start interactive file input
        return askopenfilename()

    if filename is None:
        try:
            filename = _interactive_load()
        except Exception:
            raise Exception('Interactive load failed. Tkinter not supported '
                            'on this system. Try installing X-Quartz and '
                            'reloading.')

    head = ['dummy0', 'Name', 'dummy1', 'State', 'dummy2', 'USAF',
            'latitude', 'longitude', 'TZ', 'altitude']

    # read the site metadata from the first line, then close the file
    with open(filename, 'r') as csvdata:
        temp = dict(zip(head, csvdata.readline().rstrip('\n').split(",")))

    # convert metadata strings to numeric types
    meta = {}
    meta['Name'] = temp['Name']
    meta['State'] = temp['State']
    meta['altitude'] = float(temp['altitude'])
    meta['latitude'] = float(temp['latitude'])
    meta['longitude'] = float(temp['longitude'])
    meta['TZ'] = float(temp['TZ'])
    meta['USAF'] = int(temp['USAF'])

    headers = ['year', 'month', 'day', 'hour', 'min',
               'Dry bulb temperature in C',
               'Dew point temperature in C',
               'Relative humidity in percent',
               'Atmospheric pressure in Pa',
               'Extraterrestrial horizontal radiation in Wh/m2',
               'Extraterrestrial direct normal radiation in Wh/m2',
               'Horizontal infrared radiation intensity in Wh/m2',
               'Global horizontal radiation in Wh/m2',
               'Direct normal radiation in Wh/m2',
               'Diffuse horizontal radiation in Wh/m2',
               'Averaged global horizontal illuminance in lux during minutes preceding the indicated time',
               'Direct normal illuminance in lux during minutes preceding the indicated time',
               'Diffuse horizontal illuminance in lux during minutes preceding the indicated time',
               'Zenith luminance in Cd/m2 during minutes preceding the indicated time',
               'Wind direction. N=0, E=90, S=180, W=270',
               'Wind speed in m/s',
               'Total sky cover',
               'Opaque sky cover',
               'Visibility in km',
               'Ceiling height in m',
               'Present weather observation',
               'Present weather codes',
               'Precipitable water in mm',
               'Aerosol optical depth',
               'Snow depth in cm',
               'Days since last snowfall',
               'Albedo',
               'Liquid precipitation depth in mm',
               'Liquid precipitation quantity']

    Data = pd.read_csv(filename, skiprows=8, header=None)
    del Data[5]  # drop the data-source flags column
    Data.columns = headers
    # EPW hours run 1-24; shift to 0-23 so pandas can assemble the timestamps
    Data['hour'] = Data['hour'] - 1
    Data.index = pd.to_datetime(Data[["year", "month", "day", "hour"]])
    # localize to the fixed UTC offset given in the header (TZ is in hours)
    Data = Data.tz_localize(pytz.FixedOffset(int(meta['TZ'] * 60)))
    return Data, meta
@squoilin - Sylvain, this EPW reader function is quite useful. I'm incorporating it into my development of bifacialvf and bifacial_radiance at github.com/nrel/bifacialvf and github.com/nrel/bifacial_radiance. If/when this gets pulled into the pvlib distribution, I'll switch over to the 'official' version. Thanks!
I'm assembling code to create pvlib.io.
What functionality is desired? Specifically, from which formats do we want to read, and to which formats do we want to write?
I'm going to focus first on reading a few data formats. I plan to incorporate the current pvlib.tmy which reads tmy2 and tmy3 files, returning a tuple (data, metadata) where data is a pandas DataFrame and metadata is a dict. Any comments on this design? Personally I like it.
I have code to read surfrad files so I'll include that.
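For reference, a sketch of how the existing tmy readers are called today (the filename is illustrative):

import pvlib

# readtmy3 already returns the (data, metadata) pattern proposed above
data, metadata = pvlib.tmy.readtmy3('723650TY.csv')
print(metadata['latitude'], metadata['longitude'], metadata['TZ'])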