Open Phrogz opened 3 days ago
The first row takes ~0.4ms. The last row takes ~31ms. The slowdown is linear across the number of rows.
Using `iter_rows()` is significantly faster:
print(sheet.max_row) #=> 1015
print(WorkSheetParser.parse_calls) #=> 0
t0 = time.perf_counter()
[row[0] for row in sheet.iter_rows()]
print(time.perf_counter() - t0) #=> 0.0368
print(WorkSheetParser.parse_calls) #=> 14535
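One way to read these counts, offered as an assumption rather than a confirmed diagnosis: if every cell lookup in read-only mode restarted parsing from the top of the sheet, then reaching row r would re-parse rows 1 through r. The sketch below applies that toy model to the two figures reported here (1015 rows, 14535 `parse_cell()` calls for one full pass) and assumes the cells are spread evenly across rows. The function name is hypothetical and nothing in it calls openpyxl.

```python
from fractions import Fraction


def parsed_when_restarting(total_cells, rows):
    """Cells parsed if a lookup in row r re-parses rows 1..r from the top.

    Assumes cells are spread evenly over the rows (a simplification).
    """
    per_row = Fraction(total_cells, rows)  # exact average cells per row
    return sum(per_row * r for r in range(1, rows + 1))


ROWS = 1015        # sheet.max_row in the report
ONE_PASS = 14535   # parse_calls after a single iter_rows() pass

restarted = parsed_when_restarting(ONE_PASS, ROWS)
print(int(restarted))             # total parse calls under the restart model
print(int(restarted / ONE_PASS))  # how many full passes that equals
```

Under this model the total comes to about 7.4 million calls, the same order of magnitude as the "over 7 million" reported for the cell-by-cell loop, and it would also explain per-row time growing linearly with the row number. That agreement is suggestive, not proof of where the time goes.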
Interestingly:
- With `read_only=False`, this test gets 6x faster.
- With `read_only=True`, `Worksheet.iter_cols()` does not exist.
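For anyone who wants to try the single-pass approach without the original file, here is a minimal sketch that builds a small stand-in workbook in memory, reopens it with `read_only=True`, and collects the first cell of each row via `iter_rows()`. The data and variable names are made up for illustration. Restricting the iteration with `max_col=1` is just a convenience for getting one value per row; whether it reduces the number of `parse_cell()` calls was not measured here.

```python
from io import BytesIO

import openpyxl

# Build a small stand-in workbook in memory (made-up data, not the reporter's file).
wb = openpyxl.Workbook()
ws = wb.active
for r in range(1, 6):
    ws.append([f"r{r}c{c}" for c in range(1, 4)])

buf = BytesIO()
wb.save(buf)
buf.seek(0)

# Reopen in read-only mode and stream the rows once, keeping only column A.
ro_wb = openpyxl.load_workbook(buf, read_only=True)
ro_ws = ro_wb.active
first_cells = [
    row[0]
    for row in ro_ws.iter_rows(min_col=1, max_col=1, values_only=True)
]
ro_wb.close()  # read-only workbooks keep the file handle open until closed

print(first_cells)
```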
Looping through 1000 rows and finding the first cell in each row in a read-only workbook takes 13s and calls `openpyxl.worksheet._reader.WorkSheetParser.parse_cell()` over 7 million times.

I have a ~large workbook with some ~large sheets. The .xlsx is 583kB, and there are 9 worksheets:
The only change I made to the library was this change in `_reader.py`: