pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

add methods to doc #1355

Closed timmie closed 12 years ago

timmie commented 12 years ago

Please add a list of sampling methods to the docs at:

http://pandas.pydata.org/pandas-docs/dev/timeseries.html#up-and-downsampling

changhiskhan commented 12 years ago

There are no separate upsample or downsample methods. Everything is done through "resample" with various frequency aliases specified in the documentation.

On Wed, May 30, 2012 at 5:18 PM, timmie wrote:

Please add a list of sampling methods to the docs at:

http://pandas.pydata.org/pandas-docs/dev/timeseries.html#up-and-downsampling


Chang She Lambda Foundry http://www.lambdafoundry.com

timmie commented 12 years ago

OK, but how do I find out what I can use as the how argument, as in ts.resample('D', how='mean')?

changhiskhan commented 12 years ago

sum, mean, std, max, min, median, first, last, ohlc

Note that you don't have to use strings at all if you supply your own function.

I'll add a note to the timeseries docs about this. Thanks
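For readers arriving later, here is a minimal sketch of the aggregations listed above. It assumes a more recent pandas release, where the how keyword is no longer accepted and the aggregation is called on the resampler instead. The series ts and the function value_range are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series covering two full days (values 0..47)
idx = pd.date_range("2012-05-30", periods=48, freq=pd.Timedelta(hours=1))
ts = pd.Series(np.arange(48, dtype=float), index=idx)

# Built-in aggregations: the names from the list above are resampler methods
daily_mean = ts.resample("D").mean()
daily_sum = ts.resample("D").sum()

# ohlc returns a DataFrame with open, high, low and close columns
daily_ohlc = ts.resample("D").ohlc()


def value_range(values):
    """Illustrative custom aggregation: spread between max and min of one bin."""
    return values.max() - values.min()


# A callable can be passed instead of a string
daily_range = ts.resample("D").agg(value_range)
```

In the 2012 syntax discussed in this thread, the same calls would have been written as ts.resample('D', how='mean') or ts.resample('D', how=value_range).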

timmie commented 12 years ago

I also wondered where how='ohlc' comes from...

are you not using sphinx autodoc together with the docstrings?

wesm commented 12 years ago

OHLC is a pretty standard way to aggregate financial data (http://en.wikipedia.org/wiki/Open-high-low-close_chart)

timmie commented 12 years ago

Yes, but I wanted to suggest pointing new users more clearly to the possibilities that exist: which methods are available (standard, extended via numpy, etc.).

wesm commented 12 years ago

The docstring could use work. I'm hopeful that users (ahem) will help in this regard

timmie commented 12 years ago

Maybe such low-hanging fruit could be tagged in the issues? Then users would be able to find easy fixes.

changhiskhan commented 12 years ago

Most "DOC" issues are similar low hanging fruit so I don't think it's necessary to have a distinct tag for this. We'd really welcome a pull request on docstrings :)

timmie commented 12 years ago

The docstring could use work. I'm hopeful that users (ahem) will help in this regard

So my questions provoked a post on Twitter.

I can give you hope: users just have to find an entry point, meaning applicability to their own work, enough functionality, and sufficient documentation to know how to make use of it.

When pandas was first published (more or less together with the larry package), I did not see much applicability for my coding. Then two things came together: the discontinuation of scikits.timeseries and your ambition to develop the best Python time series library with a bridge to statistics. You cannot even imagine the number of applications and use cases you would gain once this is promoted more in science and engineering.

Last but not least, the nature of open source shows that interest and participation ideally also follow a learning curve: right after publishing, you'll get mostly bug reports and feature requests. Once the user base is there, people's contributions get more colourful: some write docs, others add new features, others just report bugs...

And with pandas being a base library, many people will start building their libraries on top of it. Together with statsmodels it's such a solid foundation for data analysis that other coders will then extend its capabilities for specific domains (bioinformatics, earth observation, geoscience, etc.).

For my own libs (not yet published), I became quite good at docstrings and Sphinx. But first you have to understand how it works.

Also, I advise adding a link to the NumPy documentation standards on the contributors page.
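As a rough illustration of what a docstring in the NumPy style could look like for the question raised in this issue, here is a sketch. The function daily_aggregate is hypothetical and not part of pandas; it only wraps the method-chaining resample call of more recent pandas releases so that the accepted values of how are spelled out in one place.

```python
import pandas as pd


def daily_aggregate(ts, how="mean"):
    """Downsample a time series to daily frequency.

    Hypothetical helper, shown only to illustrate the docstring layout.

    Parameters
    ----------
    ts : pandas.Series
        Series indexed by a DatetimeIndex.
    how : str or callable, default "mean"
        Aggregation applied to each daily bin. A string names a built-in
        method such as "sum", "mean", "std", "max", "min", "median",
        "first" or "last"; a callable receives the values of one bin.

    Returns
    -------
    pandas.Series
        One aggregated value per day.
    """
    return ts.resample("D").agg(how)
```

Sphinx can render the Parameters and Returns sections of such a docstring, which is one way the list of available methods could end up in the generated docs.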

Uff, this got rather long. But I ain't got a blog or such...

wesm commented 12 years ago

I completely agree on all points. And don't worry, you aren't the first person to make tons of requests / suggestions but offer little help ;) Keep in mind how much coding work has gone into pandas over the last year and that I'm about 300 pages into a 400 page book on data analysis in Python. Plate rather full at the moment. Docs, etc will improve over time, and much faster if people who are not me or Chang get involved in the process-- we have to really be focused on shipping bug-free code and new features which is a much less accessible area of work for people to get involved (since grokking the pandas codebase, while not that complicated, is not a brief affair)

timmie commented 12 years ago

Most "DOC" issues are similar low hanging fruit so I don't think it's necessary to have a distinct tag for this. We'd really welcome a pull request on docstrings :)

I have also had the experience in the past that contributions were not added to the codebase due to

Consequently, I think #1370 could avoid such disappointment.