IAMconsortium / pyam

Analysis & visualization of energy & climate scenarios
https://pyam-iamc.readthedocs.io/
Apache License 2.0
226 stars 118 forks source link

write multiple sheets with to_excel when df exceeds excel limit #522

Open byersiiasa opened 3 years ago

byersiiasa commented 3 years ago

Excel row limit is 1,048,576

gives error ValueError: This sheet is too large! Your sheet size is: i, j Max sheet size is: 1048576, 16384

Are there any good strategies for splitting up the rows? Most simply this could be just filling the sheets, but an idea (in general) could be to have an argument e.g. by= which could take model or scenario. Not sure what for useful, but some people might like it.

I also like print statement in between each sheet so you know it's in progress and can go get a coffee, but I think it's not everyone's cup of tea.....

danielhuppmann commented 3 years ago

One option to implement this would be the following

xl = pd.ExcelWriter(<file>)
for m in df.model:
    df.filter(model=m).to_excel(xl, sheet_name=f"data_{m}", include_meta=False)
df.export_meta(xl)
xl.close()

Problem is that this would still fail if one model has more than 1e6 rows, or if a model has a name containing \ / ? * [ ].

byersiiasa commented 3 years ago

probably good for 95% of cases, but maybe unneccessary