kz26 / PyExcelerate

Accelerated Excel XLSX Writing Library for Python 2/3
https://pypi.org/project/PyExcelerate/
BSD 2-Clause "Simplified" License

MemoryError while generating large Excel file #56

Closed tarungarg546 closed 7 years ago

tarungarg546 commented 7 years ago

Hey, first of all, thanks for this awesome project. It's people like you that keep this wave going.

I am trying to profile PyExcelerate by creating an Excel file with 500k rows and 50 columns, but it threw a MemoryError.

My code:

import time

import pyexcelerate

def evaluate_pyexcelerate(target):
    # Pre-build the full 500,000-row x 50-column grid of strings,
    # then hand it to PyExcelerate in one call.
    two_d_list = []
    headers = []
    for i in range(50):
        headers.append("Col {0}".format(i))
    two_d_list.append(headers)
    wb = pyexcelerate.Workbook()
    for row in range(500000):
        values = []
        for index, _ in enumerate(headers):
            values.append("Col + Row {0}".format(index + row + 2))
        two_d_list.append(values)
    ws = wb.new_sheet("sheet 1", data=two_d_list)
    wb.save(target)

start_time_pyexcelerate = time.time()
evaluate_pyexcelerate("pyexcelerate_a1_500k.xlsx")
print("%s seconds on PyExcelerate" % (time.time() - start_time_pyexcelerate))

Error traceback:

Traceback (most recent call last):
  File "c:/Users/Tarun_19/Desktop/script.py", line 21, in <module>
    evaluate_pyexcelerate("pyexcelerate_a1_500k.xlsx")
  File "c:/Users/Tarun_19/Desktop/script.py", line 17, in evaluate_pyexcelerate
    ws = wb.new_sheet("sheet 1", data=two_d_list)
  File "c:\python27\lib\site-packages\pyexcelerate\Workbook.py", line 24, in new_sheet
    worksheet = Worksheet.Worksheet(sheet_name, self, data, force_name)
  File "c:\python27\lib\site-packages\pyexcelerate\Worksheet.py", line 33, in __init__
    self._cells[x][y] = cell
MemoryError

Also, it is worth mentioning that I did the same thing some time back and it worked perfectly.

kevmo314 commented 7 years ago

Thanks for the report. Just curious, how much memory does your machine have? I was able to run this without issue; however, it did use quite a bit of memory. That being said, this did give me an idea for an optimization for dense data sets like this one, which I can investigate later this week when I have some time.
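For context, the traceback above fails on the per-cell assignment self._cells[x][y] = cell, i.e. one nested mapping entry per cell. As a rough illustration of why a denser layout can help (a hypothetical comparison, not PyExcelerate's actual internals), compare the container overhead of a dict-of-dicts against a plain list-of-lists for the same grid:

import sys

ROWS, COLS = 1000, 50  # scaled-down grid; the report uses 500,000 x 50

# One dict entry per cell, similar in spirit to _cells[x][y] in the traceback.
sparse = {x: {y: 0 for y in range(COLS)} for x in range(ROWS)}

# Dense storage: plain nested lists, no per-cell hashing overhead.
dense = [[0] * COLS for _ in range(ROWS)]

def container_bytes(grid):
    # Count only the containers; the cell values are shared in both cases.
    rows = grid.values() if isinstance(grid, dict) else grid
    return sys.getsizeof(grid) + sum(sys.getsizeof(r) for r in rows)

print("dict-of-dicts: %d bytes" % container_bytes(sparse))
print("list-of-lists: %d bytes" % container_bytes(dense))

On CPython the dict rows come out several times larger than the list rows, and that per-cell overhead is multiplied across 25 million cells in the original report.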

kevmo314 commented 7 years ago

I've addressed this in commit eeb97b4368f4b46f5c442d2f8a2ae15852931694, which should reduce memory usage by about 30% for this case.
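For anyone who wants to verify the saving on their own machine, one possible approach (a sketch, not from this thread; it uses Python 3's standard-library tracemalloc and scales the repro down) is to measure peak allocation around sheet construction before and after upgrading:

import tracemalloc

import pyexcelerate

tracemalloc.start()

# Scaled-down version of the original repro: 50,000 rows x 50 columns.
data = [["Col + Row {0}".format(c + r + 2) for c in range(50)]
        for r in range(50000)]
wb = pyexcelerate.Workbook()
wb.new_sheet("sheet 1", data=data)

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print("peak allocation: %.1f MB" % (peak / 1024.0 / 1024.0))

Running this once against the old version and once against a build containing the commit should make the reported ~30% reduction in peak usage visible.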