ZhongSiming commented 4 years ago

I have hourly data for 3 years using a datetime as the index. Pandas allows me to load/save with the following code (only one month with 2 variables shown):

Write data to .csv:

`jan90.to_csv('PEC fine course 1 grid 199001.csv', index=True)`

Load data from .csv:

`jan90 = pd.read_csv('PEC fine course 1 grid 199001.csv', index_col=0, parse_dates=True)`

Using .csv works, but it is slow once I get to the full dataset of 26k+ rows and 21.6k+ columns (and more columns may be coming if I have to add lags to my data). So a more efficient load/save routine is very desirable. I was excited when I found feather, but the lost index is a no-go for my use.
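Until an index is supported directly, one possible workaround is to move the datetime index into an ordinary column before writing and restore it after reading. The sketch below assumes pandas with pyarrow installed and string column names. The helper names and the `timestamp` column name are made up for illustration and are not part of feather or pandas.

```python
import pandas as pd

# Hypothetical name for the column that temporarily holds the index.
INDEX_COL = "timestamp"


def index_to_column(df, name=INDEX_COL):
    """Return a copy of df with its index moved into a regular column."""
    # rename_axis names the index so reset_index produces a predictable column.
    return df.rename_axis(name).reset_index()


def column_to_index(df, name=INDEX_COL):
    """Return a copy of df with the named column restored as the index."""
    return df.set_index(name)


def save_feather(df, path):
    """Write df to a Feather file, keeping the index as a column (needs pyarrow)."""
    index_to_column(df).to_feather(path)


def load_feather(path):
    """Read a Feather file written by save_feather and rebuild the index."""
    return column_to_index(pd.read_feather(path))
```

With these helpers, `save_feather(jan90, 'jan90.feather')` followed by `jan90 = load_feather('jan90.feather')` should give back a frame with a datetime index. The file name is only an example, and the restored index carries the name `timestamp` rather than the original index name.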

Thanks for your consideration.

wesm commented 4 years ago

Could you open a JIRA issue in Apache Arrow (https://github.com/apache/arrow/blob/master/CONTRIBUTING.md)? That's where we're doing Feather development now.

ZhongSiming commented 4 years ago

Hi Wes, I opened a JIRA issue here: https://issues.apache.org/jira/browse/ARROW-7914?filter=-2

Please let me know if I need to do anything else.

Cheers, Sam

wesm commented 4 years ago

Thanks

wesm / feather

[Feature request] allow datetime index #386
