Open SteveBronder opened 6 years ago
Hi, thanks for the suggestion. If you're thinking of something like readxl's range argument, then I agree, that would be good.
Presumably you've already tried reading one sheet at a time with the sheets argument?
Yes I have. My problem is that the workbook is 67 MB in size (yes, yikes!). Calling an individual sheet still causes xlsx_cells() to crash.
Actually, I've found that the person who made these Excel files dragged the formatting down to the last possible row over a bunch of columns. So my actual issue is that, for each sheet, xlsx_cells() is trying to parse a ton of rows that only have formatting. File size is not really the issue, but having n_cols and n_rows arguments would be rad in solving this.
In my particular case I only need the first three rows or so.
I think a first step is to optionally omit blank cells. When readxl implemented range import it was complicated (https://github.com/tidyverse/readxl/pull/314), and I want to take care to do it as similarly as possible.
That would be a nice solution for my problem!
@SteveBronder blank cells can now be excluded on the master branch.
xlsx_cells(x, include_blank_cells = FALSE)
I'll keep this issue open for the range feature.
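For anyone landing here with the same problem, here is a rough sketch of how the "first three rows" case might be handled once blank cells are excluded. It assumes the row and col columns that xlsx_cells() returns; keep_top_left() is a hypothetical helper, not part of tidyxl, and the file path is a placeholder. Note that the filter runs after parsing, so unlike a real range argument it does not reduce the parsing work itself.

```r
# Hypothetical helper (not part of tidyxl): keep only the top-left corner
# of a cell table, using the `row` and `col` columns from xlsx_cells().
keep_top_left <- function(cells, n_rows = Inf, n_cols = Inf) {
  cells[cells$row <= n_rows & cells$col <= n_cols, , drop = FALSE]
}

path <- "big-workbook.xlsx"  # placeholder path, an assumption

# Only runs if tidyxl is installed and the placeholder file exists.
if (requireNamespace("tidyxl", quietly = TRUE) && file.exists(path)) {
  # Skip formatting-only cells, then keep the first three rows of sheet 1.
  cells <- tidyxl::xlsx_cells(path, sheets = 1, include_blank_cells = FALSE)
  header <- keep_top_left(cells, n_rows = 3)
  print(header[, c("address", "row", "col", "data_type")])
}
```

Passing n_cols as well would trim columns in the same way, which roughly mimics the n_rows and n_cols arguments requested above.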
Ty so much!!
I have a large Excel file that causes xlsx_cells to crash. It would be nice to say, "Only get this many rows and this many columns" when calling xlsx_cells. Could something be put in xlsxsheet::parseSheetData?