kz26 / PyExcelerate

Accelerated Excel XLSX Writing Library for Python 2/3
https://pypi.org/project/PyExcelerate/
BSD 2-Clause "Simplified" License
529 stars, 60 forks

Question about optimizing cell merge #180

Open winterfell2021 opened 2 years ago

winterfell2021 commented 2 years ago

Hi there, thanks for your great work! I have a large dataframe (30*10000, for example) with about 100 groups of cells to merge in each column, like this:

for i in range(30):
    column = n2a(i + 1)
    for j in range(100000 // 100 - 1):
        worksheet.range(f"{column}{j*100 + 1}", f"{column}{(j+1)*100}").merge()

How can I speed up the merge function?
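The snippet above calls `n2a` without defining it. It presumably maps a 1-based column number to an Excel column letter (1 to "A", 27 to "AA"). A minimal sketch under that assumption follows; the body is an illustration, not the poster's actual helper.

```python
def n2a(n: int) -> str:
    """Convert a 1-based column number to Excel-style letters.

    Assumed behaviour of the undefined helper in the snippet above:
    1 -> "A", 26 -> "Z", 27 -> "AA", and so on (bijective base 26).
    """
    if n < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while n > 0:
        # Subtract 1 first because there is no zero digit in this numbering.
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters
```

With this definition, the 30 columns in the loop run from "A" through "AD".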

kevmo314 commented 2 years ago

I suspect the validation step is slowing your code down: https://github.com/kz26/PyExcelerate/blob/dev/pyexcelerate/Worksheet.py#L166-L168

You can get around it by doing something like:

worksheet._merges.append(worksheet.range(...))

The proper way to do this would probably be to add a validate=False argument; we'd welcome a pull request for that.
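To illustrate why validation could dominate the runtime, here is a toy model, assuming validation compares each new merge against every merge already recorded. `ToySheet`, `add_merge`, `intersects`, and `build` are hypothetical names for this sketch and are not PyExcelerate's API; the `validate` flag only mirrors the argument proposed above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (first_row, first_col, last_row, last_col), all 1-based and inclusive
Rect = Tuple[int, int, int, int]


def intersects(a: Rect, b: Rect) -> bool:
    """Return True if two inclusive rectangles share at least one cell."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class ToySheet:
    """Hypothetical stand-in for a worksheet; not PyExcelerate's class."""
    merges: List[Rect] = field(default_factory=list)
    checks: int = 0  # number of overlap comparisons performed so far

    def add_merge(self, rect: Rect, validate: bool = True) -> None:
        if validate:
            # Linear scan per merge, so n merges cost about n^2 / 2 checks.
            for existing in self.merges:
                self.checks += 1
                if intersects(rect, existing):
                    raise ValueError(f"{rect} overlaps {existing}")
        self.merges.append(rect)


def build(validate: bool, columns: int = 3, groups: int = 50,
          size: int = 100) -> ToySheet:
    """Merge `groups` vertical blocks of `size` rows in each column."""
    sheet = ToySheet()
    for col in range(1, columns + 1):
        for g in range(groups):
            rect = (g * size + 1, col, (g + 1) * size, col)
            sheet.add_merge(rect, validate=validate)
    return sheet
```

In this model, `build(True)` performs 11175 comparisons for 150 merges, while `build(False)` performs none. Skipping validation is only safe when the caller already knows the ranges do not overlap, as with the fixed 100-row blocks in the question.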

winterfell2021 commented 2 years ago

Thanks for your kind, quick reply. I got 44.35159114399994 seconds without validation versus 712.614892569 seconds with it. That's insane! I'll open a PR afterwards.