aVadim483 / fast-excel-writer

Lightweight and very fast XLSX Excel Spreadsheet Writer in PHP
MIT License

Changing fwrite to file_put_contents #92

Closed (brainfoolong closed this 4 weeks ago)

brainfoolong commented 4 weeks ago

Hi.

I recently started using this library. It's awesome and very fast compared to PhpSpreadsheet.

However, file handling is somewhat buggy and leaks open file handles.

I locally changed all uses of fwrite in FileWriter to file_put_contents, as this is shorthand for fopen/fwrite/fclose and does not leave file handles open.

You already use file_put_contents in Writer.php. Should I open a PR to change FileWriter to use this function as well?

brainfoolong commented 4 weeks ago

OK, never mind. I decided to change more in the library and remove deprecated code that we don't need. I'll maintain my own private fork from now on.

I will privately clean up the code, raise the requirement to PHP 8.3 and fix a lot of __destruct issues, as these destructors write files when that is no longer needed.

All those changes will break backward compatibility, so I won't send you a PR for this, unless you want all those huge refactoring changes, perhaps for a future V7... Let me know.

aVadim483 commented 4 weeks ago

The fopen/fwrite functions are used because the main data-writing mode is streaming: the file is opened once, then fwrite is called many times, then the file is closed.

To use file_put_contents for large files, you would either have to accumulate all the data in memory and write the entire text in one call (which means high memory consumption), or call the function with the FILE_APPEND flag. But then, for each line written, the file is opened, the pointer is moved to the end of the file, the data is written, and the file is closed.

I haven't measured it, but I think it would be slower.
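To make the two options concrete, here is a minimal sketch of both write patterns. The function names writeStreaming and writeAppending are hypothetical helpers for illustration only; they are not part of fast-excel-writer.

```php
<?php
// Hypothetical helpers for illustration; not part of fast-excel-writer.

/** Streaming: open the file once, call fwrite per row, close once. */
function writeStreaming(string $path, iterable $rows): void
{
    $fh = fopen($path, 'wb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        foreach ($rows as $row) {
            fwrite($fh, $row);
        }
    } finally {
        // Always release the handle, even if a write throws.
        fclose($fh);
    }
}

/** Append mode: every call opens the file, writes at the end, and closes it. */
function writeAppending(string $path, iterable $rows): void
{
    file_put_contents($path, ''); // start from an empty file
    foreach ($rows as $row) {
        file_put_contents($path, $row, FILE_APPEND);
    }
}
```

Both functions produce the same file; the difference is one open/close pair versus one per row. Timing them on your own data would settle the speed question.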

aVadim483 commented 4 weeks ago

But everything I wrote above applies to writing sheetN.xml; in XLSX files with a lot of data, these are the largest files. For the other files that are packed into the XLSX, using file_put_contents may well be good practice.

brainfoolong commented 4 weeks ago

Thx for the feedback. I've already seen the benefit of the single fopen with multiple writes; it is indeed faster. The 8k buffer is also a good way to reduce the number of fwrite calls without filling up memory too much. Very cool mechanism you used there.
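For readers following along, the buffering idea can be sketched roughly like this. BufferedWriter is a hypothetical class written for this comment, assuming an 8192-byte limit; the library's actual FileWriter may be implemented differently.

```php
<?php
// Hypothetical class for illustration; the library's FileWriter may differ.
final class BufferedWriter
{
    /** @var resource|null */
    private $handle;
    private string $buffer = '';
    private int $limit;

    public function __construct(string $path, int $limit = 8192)
    {
        $handle = fopen($path, 'wb');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }
        $this->handle = $handle;
        $this->limit = $limit;
    }

    /** Collect data in memory; hit the disk only when the buffer is full. */
    public function write(string $data): void
    {
        $this->buffer .= $data;
        if (strlen($this->buffer) >= $this->limit) {
            $this->flush();
        }
    }

    /** Write any buffered data to the file and empty the buffer. */
    public function flush(): void
    {
        if ($this->buffer !== '' && is_resource($this->handle)) {
            fwrite($this->handle, $this->buffer);
            $this->buffer = '';
        }
    }

    /** Flush the remainder and release the file handle explicitly. */
    public function close(): void
    {
        if (is_resource($this->handle)) {
            $this->flush();
            fclose($this->handle);
        }
        $this->handle = null;
    }
}
```

Memory use stays bounded by roughly the buffer limit plus one row, while the number of fwrite calls drops from one per row to about one per 8 KB of output.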

In my case, the problem arose with the several __destruct() methods: when PHP shuts down and destroys the objects, they try to write previously collected buffers to temporary files (for example, when an Excel object was created but never saved). By that time I have already removed my custom temp folders, so errors are thrown in the shutdown handler because the temp folder doesn't exist anymore.

Thx anyway for all your work. I'm finally able to create Excel files with 200k rows in PHP, which is just impossible with PhpSpreadsheet.
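One possible way around this, sketched under my own assumptions, is to write only on an explicit save and let the destructor discard unsaved data. TempFileWriter is a hypothetical class for illustration, not the library's code.

```php
<?php
// Hypothetical class for illustration; not the library's actual code.
final class TempFileWriter
{
    private string $path;
    private string $buffer = '';
    private bool $saved = false;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    /** Collect data in memory only; nothing touches the disk yet. */
    public function write(string $data): void
    {
        $this->buffer .= $data;
    }

    /** Persist the buffered data; the only place where a file is written. */
    public function save(): void
    {
        if ($this->saved) {
            throw new LogicException('Already saved');
        }
        if (file_put_contents($this->path, $this->buffer) === false) {
            throw new RuntimeException("Cannot write {$this->path}");
        }
        $this->buffer = '';
        $this->saved = true;
    }

    public function __destruct()
    {
        // Discard unsaved data instead of writing it during shutdown,
        // when the temp folder may already have been removed.
        $this->buffer = '';
    }
}
```

With this shape, an object that is created but never saved leaves no files behind and raises no errors at shutdown, even if its target folder is already gone.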

aVadim483 commented 4 weeks ago

By the way, the history of this library began when I needed to write 150K lines to Excel )))

But your remark about __destruct() is useful. Perhaps it requires refactoring