read.xlsx() does not return the correct number of rows

This could be either a bug or a feature request, depending on your view.

Suppose I have an Excel sheet with the following data:

Row	Value
1	a
2	b
3
4
5
6	c

If I read rows 2-4, I expect to get a data frame with 3 rows. But read.xlsx(file, sheet, rows = 2:4, cols = 1, skipEmptyRows = FALSE) returns only the single row "b", because the next two rows are empty. The same thing happens with rows = 2:5, whereas rows = 2:6 correctly returns all rows.
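A minimal reproduction sketch of the case above, assuming the openxlsx package is installed. The temporary file, the sheet name "Sheet1", and the choice of colNames = FALSE (so that no row is consumed as a header) are my assumptions for illustration; the values are assumed to sit in column A, with "Row" in the table above being the Excel row number.

```r
library(openxlsx)

# Hypothetical temporary workbook used only for this reproduction.
path <- tempfile(fileext = ".xlsx")

wb <- createWorkbook()
addWorksheet(wb, "Sheet1")

# Column A: "a" and "b" in rows 1-2, rows 3-5 left empty, "c" in row 6.
writeData(wb, "Sheet1", x = data.frame(Value = c("a", "b")),
          startRow = 1, colNames = FALSE)
writeData(wb, "Sheet1", x = data.frame(Value = "c"),
          startRow = 6, colNames = FALSE)
saveWorkbook(wb, path, overwrite = TRUE)

# Compare the number of rows requested with the number of rows returned.
for (rows in list(2:4, 2:5, 2:6)) {
  res <- read.xlsx(path, sheet = "Sheet1", rows = rows, cols = 1,
                   colNames = FALSE, skipEmptyRows = FALSE)
  cat(sprintf("requested %d rows, got %d\n", length(rows), NROW(res)))
}
```

If the behaviour is as described, the first two requests should report fewer rows than were requested.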

I consider this a bug because I don't see this behaviour documented for read.xlsx, although perhaps it's documented elsewhere.

Ideally, I think the behaviour should change so that you can always know the exact dimensions of the returned data frame. IMO requesting 3 rows should always return 3 rows. But to avoid a breaking change, a new parameter could be added to specify whether or not to ignore trailing missing values.
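Until something like that exists, one possible workaround is to pad the result on the caller's side. The sketch below uses a hypothetical helper, pad_rows(), which is not part of openxlsx; it assumes the missing rows are always trailing ones, as in the example above.

```r
# Hypothetical helper (not part of openxlsx): append all-NA rows to a data
# frame until it has n_rows rows. Assumes only trailing rows were dropped.
pad_rows <- function(df, n_rows) {
  missing <- n_rows - nrow(df)
  if (missing <= 0) {
    return(df)  # already has at least the requested number of rows
  }
  # Indexing a data frame with NA row indices yields all-NA rows.
  padded <- df[c(seq_len(nrow(df)), rep(NA_integer_, missing)), , drop = FALSE]
  rownames(padded) <- NULL
  padded
}

# Usage with a stand-in for the one-row result described above.
res <- data.frame(X1 = "b", stringsAsFactors = FALSE)
padded <- pad_rows(res, length(2:4))
print(padded)
```

This keeps the number of returned rows equal to the number requested, at the cost of not distinguishing truly empty cells from rows that were never read.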

ycphs/openxlsx issue #304