tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+

How to continue, as the project seems to be unmaintained? #748

Open m7913d opened 3 months ago

m7913d commented 3 months ago

Sadly, this project seems to be unmaintained by now. The last commit is from 2022, and the number of open issues and PRs continues to grow.

@tfussell has done a great job creating and maintaining this repo for many years. I hope he is doing well.

To keep the project healthy, it might be a good idea to fork the repo into a new GitHub organisation, so that the community can maintain this library.

Who is interested in joining this effort?

If others are interested in joining this effort or you have any other ideas on how to continue this great library, please let us know.

doomlaur commented 3 months ago

First of all, a huge thank you @tfussell for creating and maintaining this library for a long time. He has done an amazing job creating this library, which was a huge help for the C++ community.

@m7913d I would love to join this effort, yes. However, I'm not very comfortable joining as the main maintainer, because I'm using XLNT in a research project that is going to end in a few months - probably at the end of this year. I can continue reviewing pull requests and testing certain use cases, but being the main maintainer is a responsibility I cannot take on if I'm not going to use the library anymore. In other words, while I can take the time to work on XLNT over the next couple of months, I cannot promise to keep the same pace in the future if the software I'm working on is discontinued. I can always test simple use cases by writing a few lines of code independently of the project I'm working on, but complex formats such as XLSX require a large amount of testing that goes well beyond such simple use cases.

I also want to mention that I talked privately with @flaviu22 about the maintenance issue. He contacted me because I have one of the most active forks. He also mentioned that we could work together on some improvements and new features. He was also interested in implementing support for the older binary XLS format, which was already discussed in issues #731 and #227. Maybe he would also like to join the effort of maintaining XLNT.

By the way, in addition to issue #644, which discusses the next steps necessary to release XLNT 1.6, I also have a few ideas about what can be improved in the future. I talked with @flaviu22 privately about this, but if you are interested in my opinion on the future of XLNT, please let me know.

m7913d commented 2 months ago

@doomlaur Thank you for your interest in joining the effort. I'm definitely interested in hearing your opinion on the future of this project, but my main focus would be on bug fixes (and performance optimizations) rather than new features.

doomlaur commented 2 months ago

@m7913d My main focus would be the same :smile: The following list is by no means exhaustive, but here are some points that could be improved in the future:

  1. Better conformance to the ECMA-376 specification. There are multiple examples where XLNT does not support all features of the Office Open XML specification (which is understandable - I'm not criticizing). However, in such cases XLNT (or its underlying XML parser, libstudxml) throws an exception instead of ignoring the data - for example, issues #685 and #735 come to mind.
  2. Performance should be improved. I'm working on software that handles large amounts of data, so we need to use XLNT's streaming_workbook_reader. However, when loading an XLSX file containing 95 MB of data, XLNT is roughly 10 times slower than Excel, as the performance profiler also shows. Take this number with a grain of salt, though - I have not benchmarked it outside of our software using a simple example (not yet, at least), but either way there's a large performance difference. The older issue #648 discusses some points that can be improved.
  3. There are major issues with memory consumption when loading large files. Before our software used the streaming_workbook_reader, we used the simpler xlnt::workbook - however, I remember that loading files that contained over 15 MB of data used over 20 GB (!) of memory allocated by XLNT. There are also multiple issues about this, like #370, #403 and #522.
  4. XLNT cannot load files that have been saved in the Strict Open XML Spreadsheet format - a format that is supported by Office 2013 and newer versions. See issue #515.
  5. While XLNT can load encrypted files that have been protected by a password, using it with the streaming_workbook_reader is currently not possible (which is made even worse by the memory consumption issues I explained above) - see issue #180. In other words, users of this library must currently choose between good memory consumption and the ability to open password-protected documents, but cannot use both at the same time.
  6. XLNT currently cannot open XLS files (the old binary Excel format), which was the standard before Office 2007 and is still sometimes used nowadays. See issue #227, where @tfussell mentions that it probably shouldn't be a huge amount of work due to the way XLNT already works. @flaviu22 wrote to me that he would be interested in implementing this.
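
To make point 1 more concrete, here is a minimal sketch of the difference between throwing on unsupported data and skipping it. Everything in it is hypothetical and only for illustration: `Element`, `UnknownPolicy`, `ParseResult`, `parse_elements` and the element names are made-up stand-ins, not types or functions from XLNT or libstudxml, and the real parser works on an XML stream rather than a vector.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a parsed XML element.
// This is NOT an XLNT or libstudxml type.
struct Element {
    std::string name;
    std::string value;
};

// How the reader reacts to an element it does not support.
enum class UnknownPolicy { Throw, Skip };

// Recognized values, plus the names of elements that were skipped.
struct ParseResult {
    std::unordered_map<std::string, std::string> values;
    std::vector<std::string> skipped;
};

ParseResult parse_elements(const std::vector<Element>& elements,
                           UnknownPolicy policy) {
    // Element names this sketch pretends to support (made up for the example).
    static const std::unordered_set<std::string> supported = {
        "sheetName", "cellValue"};

    ParseResult result;
    for (const auto& element : elements) {
        if (supported.count(element.name) != 0) {
            result.values[element.name] = element.value;
        } else if (policy == UnknownPolicy::Throw) {
            // Strict behavior: one unsupported element aborts the whole load.
            throw std::runtime_error("unsupported element: " + element.name);
        } else {
            // Lenient behavior: remember what was ignored and keep going.
            result.skipped.push_back(element.name);
        }
    }
    return result;
}
```

With `UnknownPolicy::Skip`, an input containing one unsupported element still yields all supported values, and the caller can inspect `skipped` to log what was ignored. With `UnknownPolicy::Throw`, the same input fails as a whole.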

m7913d commented 2 months ago

@doomlaur Many good ideas to improve XLNT. However, I don't have experience with the streaming interface. My main XLNT use case is exporting data to XLSX. So, I will probably not take the lead in improving the streaming interface or XLS support, but I'm happy to support you, @flaviu22 or others where I can.

To get started, I think we should first set up the repo we will use for the development:

Setting up an active xlnt repo would at least avoid further fragmentation of the xlnt development.

doomlaur commented 2 months ago

@m7913d Since I have exactly the opposite use case (importing data from XLSX, but never exporting), I think we could complement each other very well :wink: By the way, not sure if this is an important use case for you, but since you mention exporting, according to the feature list, writing/exporting files encrypted by a password is not supported at all, as also explained in #151, #231 and #373.

I think a GitHub organization is a good idea, yes - especially because maintainers might become more or less active over time, so having multiple maintainers decreases the probability that the project will die.

Either forking this repository or mine would work. In principle, the only difference nowadays between my fork and this repository is that I merged some important fixes from #686, #688 and #736. All my earlier fixes were already merged by @tfussell a few years ago. I would probably recommend forking this repository and merging the pull requests again, as this repository contains many other issues and pull requests - in other words, just for documentation purposes - but I don't mind either way :smile:

m7913d commented 2 months ago

The new GitHub location for community development of XLNT is: https://github.com/xlnt-community/xlnt

Feel free to join our effort to keep the good work of tfussell alive.