fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License

Import from xlsx #429

Open alexpantyukhin opened 5 years ago

alexpantyukhin commented 5 years ago

pandas has read_excel for loading a DataFrame. I think it would be good to have something similar in Deedle.

zyzhu commented 5 years ago

Reading excel is a great productivity feature. I think we need to have more discussion about how to move forward on this issue.

pandas has an optional dependency on xlrd to read Excel files. I used to use OpenXml to read Excel files. It is already on netstandard. But I'm not sure whether Deedle itself should depend on it.

Deedle.Excel currently depends on NetOffice and uses COM to interop with Excel. It's quite cool to have Excel open and see a Deedle frame show up directly on the sheet. But COM interop will not work cross-platform.

alexpantyukhin commented 5 years ago

I believe that having a separate package that provides import from Excel (and, in the future, export to it) would be good. Consumers of Deedle who want to work with Excel could then install it separately.

Actually, I hadn't looked at Deedle.Excel before (I didn't know it existed :) ). I couldn't find its sources.

zyzhu commented 5 years ago

@alexpantyukhin It's a very old pull request, #399. I just quickly merged it with the new fsproj format. I've never used it before either, but I tried the sample script and it's quite cool to use. https://github.com/fslaborg/Deedle/blob/master/src/Deedle.Excel/Deedle.Excel.Sample.fsx

DGuidi commented 4 years ago

Deedle.Excel currently depends on NetOffice and uses COM to interop with Excel. It's quite cool to have Excel open and Deedle frame directly shows up on the sheet. But COM interop will not work cross platform.

This could probably be resolved using ExcelDataReader or EPPlus. Both are cross-platform, but the former can only read Excel files (xls and xlsx), while the latter can read and write xlsx files (and there's an upcoming license change for the next EPPlus version). Both libraries are simple but powerful when used just to read (and write) data.
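
As a rough illustration of why reading xlsx does not need COM: an xlsx file is a ZIP archive of XML parts, so any platform with a ZIP and XML parser can read it. The sketch below uses only the Python standard library and is not code from Deedle, ExcelDataReader or EPPlus. The names `read_first_sheet` and `make_sample` are hypothetical, the sample archive contains only the two parts the parser touches (a real workbook has more), and the sheet path `xl/worksheets/sheet1.xml` is assumed rather than resolved through the workbook relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the SpreadsheetML parts inside an xlsx archive.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}


def read_first_sheet(xlsx_bytes):
    """Return the rows of xl/worksheets/sheet1.xml as lists of cell values.

    Simplifications: the sheet path is assumed, empty cells that are absent
    from the XML are not padded, and only shared strings and plain numbers
    are handled (no dates, booleans or inline strings).
    """
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            sst = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            for item in sst.findall("m:si", NS):
                # A shared string may be split over several <t> runs.
                texts = item.iter("{%s}t" % MAIN_NS)
                shared.append("".join(t.text or "" for t in texts))
        sheet = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))

    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            value = cell.find("m:v", NS)
            if value is None:
                values.append(None)
            elif cell.get("t") == "s":
                # t="s" means the value is an index into the shared strings.
                values.append(shared[int(value.text)])
            else:
                values.append(float(value.text))
        rows.append(values)
    return rows


def make_sample():
    """Build a stripped-down in-memory archive with a 2x2 sheet."""
    shared_strings = (
        '<sst xmlns="%s">'
        "<si><t>name</t></si><si><t>price</t></si><si><t>apple</t></si>"
        "</sst>" % MAIN_NS
    )
    sheet = (
        '<worksheet xmlns="%s"><sheetData>'
        '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1" t="s"><v>1</v></c></row>'
        '<row r="2"><c r="A2" t="s"><v>2</v></c><c r="B2"><v>1.5</v></c></row>'
        "</sheetData></worksheet>" % MAIN_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/sharedStrings.xml", shared_strings)
        archive.writestr("xl/worksheets/sheet1.xml", sheet)
    return buffer.getvalue()


if __name__ == "__main__":
    for row in read_first_sheet(make_sample()):
        print(row)
```

A dedicated library additionally handles the legacy binary xls format, dates, styles and sparse rows, which is why depending on one is still preferable to hand-rolled parsing.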