protobi / js-xlsx

XLSX / XLSM / XLSB (Excel 2007+ Spreadsheet) / ODS parser and writer
http://oss.sheetjs.com/js-xlsx

Make separate branch? Planning dev strategy #21

Open pietersv opened 8 years ago

pietersv commented 8 years ago

Stepping back a bit, there are a few pending pull requests and issues related to styles. I would like to incorporate these, but wish to run them against the awesome test library first for safety.

Node 4: However, the library doesn't work on Node 4, so the tests fail. All else equal, I'd prefer to pull changes from the main trunk that address this before proceeding. Running multiple versions of Node is an option but ...

Main branch: More broadly, this branch isn't yet merged back to the main project. I have been holding off updating too much in order to both (a) preserve the option of merging the master branch of this fork back into the main trunk and (b) release https://github.com/protobi/js-xlsx#beta on npm and bower. Further, there are other pending topics unrelated to styles (e.g. browserify).

Net net, I wonder if it's better to make this a separate project, give it its own package name, version, etc.

Alternative: An alternative approach I'm considering is to extend https://github.com/chuanyi/msexcel-builder so that it can take CSF objects as input and/or style cells. That's a tiny, light library. More often my goal is just to generate an XLSX with rich styling. That library only writes XLSX, whereas js-xlsx can read and write XLS, XLSX and XLSB. It might be good to make these two libraries mutually compatible.
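For anyone unfamiliar with the idea, here is a rough sketch of what "CSF objects with styled cells" could look like as input to a writer. It uses no library at all: the types are declared locally, and the `s` style field with its `font` / `fill` layout is an assumption modeled on how this fork attaches styles, not a confirmed API of either library. `colName` and `buildSheet` are hypothetical helper names.

```typescript
// Local, simplified types sketching a CSF-like worksheet.
// ASSUMPTION: styles live in a per-cell `s` field; names are illustrative.
interface CellStyle {
  font?: { bold?: boolean };
  fill?: { patternType?: string; fgColor?: { rgb: string } };
}

interface Cell {
  v: string | number; // raw value
  t: "s" | "n";       // cell type: string or number
  s?: CellStyle;      // optional style (assumed field name)
}

// Cells are keyed by A1-style address; "!ref" holds the used range.
type Worksheet = { [address: string]: Cell | string };

// Convert a zero-based column index to letters (0 -> A, 26 -> AA).
function colName(index: number): string {
  let name = "";
  for (let n = index + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    name = String.fromCharCode(65 + ((n - 1) % 26)) + name;
  }
  return name;
}

// Build a worksheet from rows, applying headerStyle to the first row only.
function buildSheet(rows: (string | number)[][], headerStyle: CellStyle): Worksheet {
  const ws: Worksheet = {};
  let maxCol = 0;
  rows.forEach((row, r) => {
    row.forEach((value, c) => {
      const cell: Cell = { v: value, t: typeof value === "number" ? "n" : "s" };
      if (r === 0) cell.s = headerStyle;
      ws[colName(c) + (r + 1)] = cell;
      maxCol = Math.max(maxCol, c);
    });
  });
  ws["!ref"] = "A1:" + colName(maxCol) + Math.max(rows.length, 1);
  return ws;
}

const sheet = buildSheet(
  [["Item", "Qty"], ["Widget", 3]],
  { font: { bold: true }, fill: { patternType: "solid", fgColor: { rgb: "FFFFAA00" } } }
);
```

A writer that accepted this shape would only need to walk the addresses inside `!ref` and emit each cell's value and style, which is why a small write-only library seems like a plausible target.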

eddyparkinson commented 7 years ago

github.com/SheetJS Abandoned?

npm says:

Don't squat on package names. Publish code or move out of the way.

https://docs.npmjs.com/misc/disputes I sent an email, as the npm post says to do, so with luck it will be possible to merge fixes into https://www.npmjs.com/package/xlsx, j and harb

pietersv commented 7 years ago

@eddyparkinson Great question. Root branch is probably dormant. This branch is alive but on hold pending a project to merge into. I agree it’s time to select or create an active main branch and coalesce leading branches around that. And it seems that moment may have just now appeared, see https://github.com/SheetJS/js-xlsx/issues/369#issuecomment-265521591.

Root project: The root project https://github.com/SheetJS/js-xlsx is a tremendous resource that can read and write XLSX, XLS, XLSB and XLSM files, and making it open source was a valuable contribution. I believe that root project has been dormant for some time. The most salient issue is that it passes the extensive test suite only on Node 0.12 or earlier.

By any measure SheetJS has done a lot already. I don't know any of the details or even the team, but recognize that it’s difficult for the project owner or our team to make the business case for lots of continued development without incremental revenue on a complex project with so many users and use cases internationally.

This branch: This branch was originally intended just to add writing of more extensive styles. My aim was that it would be merged into the root. Reading styles was added only to enable comparable, complete round-trip unit testing. The root project went dormant right around the time this branch was completed.

Restarting development: It would be great to get development energy focused and a new project owner or team to lead this moving forward.

There are a lot of open issues on the root project. Because of its status, there are also many open non-style-related issues in this branch. And there are many other forks.

Priorities

  1. Get a fork of the root project that passes unit tests on Node 7 so that we can ensure quality. It appears such a branch may now exist... https://github.com/SheetJS/js-xlsx/issues/369#issuecomment-265521591.
  2. Get a good handle on the build, versioning, testing and release process. There’s a lot.
  3. Merge this styling branch with that branch, and ensure the extended test suite passes. This might be a bit of a headache as the original test suite had a couple wrinkles with styles.
  4. Review and merge in the many other pending pull requests. There are a lot of great ones. One challenge is that some contributions were made to the xlsx.js file rather than to the /bits/*.js files. And some contributions came without unit tests.
  5. Scout out other active branches that address key improvements but never generated PRs.

Any volunteers particularly interested/handy in #1 and #2?

eddyparkinson commented 7 years ago

Too many versions problem

sheetjs say it is active but recommend people fork.

Too many versions - suggest selecting one version: It looks like the many versions are causing a problem. It is hard to see how supporting more and more versions is workable, e.g. "2007 XLSX, 2010 XLSX, 2013 XLSX, 2016 XLSX" ... where to stop?

I suggest selecting one version for the write side, something that is easy to test against. I would suggest Apache OpenOffice Calc because everyone has free access to it. Thoughts?

Priorities

Sorry, I can put a day into this once in a while, but don't have a big block of time. I think we want a way to deal with this one pull request at a time.

eddyparkinson commented 7 years ago

sheetjs

tl;dr: you should fork and make your own modules.

We still use the modules as they stand every day in the context of a few server processes as well as a browser tool based on http://oss.sheetjs.com/js-xlsx/. Other people apparently use it too: xlsx, for example, shows 3599 downloads yesterday, 21703 over the last week, and 81971 over the last month according to https://www.npmjs.com/package/xlsx.

Making a change is not an easy task, especially on the write side, because the files have to work with many versions of Excel as well as with third-party tools like LibreOffice which don’t even bother with full Excel compatibility. For example, https://bugs.documentfoundation.org/show_bug.cgi?id=83511 came up as a result of a user report and it effectively boiled down to LibreOffice conveniently deciding that it is appropriate to chop off the last few bits of every result. A math program decided that it was appropriate to fudge the numbers :( So you have a series of workarounds to address them and a series of seemingly nonsensical hacks to work around undocumented limitations such as Excel XML parsing logic.
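To make the "last few bits" point concrete, here is a small illustration of what dropping low mantissa bits does to a double. This is only a sketch of the general numeric effect; it is not taken from LibreOffice and does not claim to reproduce the behavior in the linked bug. `clearLowBits` is a hypothetical helper.

```typescript
// Zero the lowest `bits` bits (0..32) of a double's 52-bit mantissa.
// ASSUMPTION: purely illustrative; not any spreadsheet's real rounding logic.
function clearLowBits(x: number, bits: number): number {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // big-endian: low 32 mantissa bits sit at byte offset 4
  const low = view.getUint32(4);
  const mask = bits >= 32 ? 0 : (0xffffffff << bits) >>> 0;
  view.setUint32(4, (low & mask) >>> 0);
  return view.getFloat64(0);
}

const exact = 0.1 + 0.2;                // stored with a nonzero low mantissa bit
const chopped = clearLowBits(exact, 4); // same value with the low 4 bits cleared

// The two numbers differ only far below any displayed precision, yet a
// strict equality check or a round-trip test sees them as different.
console.log(exact === chopped, exact - chopped);
```

The difference is tiny, but it is enough to make a file that round-trips cleanly in one tool compare unequal in another, which is the kind of thing the workarounds have to absorb.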

To be absolutely clear about testing, we check some stuff manually against up to 13 versions of Excel, depending on the feature. For example, here is Excel 2.0: http://i.imgur.com/cy05pPz.png. It is an incredibly tedious and thankless task to slog through the relevant versions of Excel and test manually, but that’s what it takes. Third party tools generate files based on different versions of Excel, so a tool that reads files should be able to handle every format and a tool that writes files should try to generate files that are valid with as many versions as appropriate.

There is a myriad of ways to make small changes that work for some versions of Excel but not others, or work for some versions of Excel but not with other commonly-used tools. Even under the format of XLSX there are four main versions: 2007 XLSX, 2010 XLSX, 2013 XLSX, 2016 XLSX. New features that work in 2016 XLSX may not work as expected in older versions. So there’s a tradeoff, and while some developers throw caution to the wind, we have decided to be a lot more conservative in changes.

If you would prefer breaking support for certain versions or formats, you are free to fork the open source modules and publish to NPM under different names. You could scope it e.g.

pasted from an email sheetjs sent me