SheetJS / sheetjs

📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
https://sheetjs.com/
Apache License 2.0

Writing Style information into Excel #128

Closed nukulb closed 6 years ago

nukulb commented 9 years ago

I am trying to preserve the style information while I read and then write an Excel file. It seems the style information is read when I read the file, but unfortunately it's not preserved when I write the file.

Thoughts on how I can read and write the style information?

excelFile = XLSX.readFile(file.path, {
    cellStyles: true
});

// some processing here!

XLSX.writeFile(excelFile, fileName, {
    cellStyles: true
});

@sheetjs After a conversation with @elad (https://github.com/hubba/js-xlsx/commit/5e9bca78f2b0c54242cefc8a358f3151232941ab#commitcomment-8039333) I was informed that the styling information is only read, and not written back to excel.

SheetJSDev commented 9 years ago

@nukulb @elad @hubba After exploring this a bit more, I have a better sense for the roadmap. This is going to be a much larger discussion, but one that we've been putting off for far too long:

Even though we all know this, it should be reiterated: our ultimate goal is to devise a universal representation (referred to as the "Common Spreadsheet Format" in the READMEs) that works for all spreadsheet formats, not just XLSX. If you limit yourself to XLSX support, it's really easy to just persist everything, but there are really 4 different systems you must consider:

Currently the code is spread across two libraries: js-xls and js-xlsx (which were developed separately due to licensing concerns that were eventually resolved)

The last N times this conversation came up, the discussion fizzled because the style information is highly linked and requires careful manipulation to preserve integrity across formats.

Pretty much every imaginable feature is persisted in at least 4 different ways (XLSX, XLSB, XLS, XLML). All formats use references to minimize data size. What is the best way to expose these features to the end users?

APIs

Originally, SheetJS was developed to solve a simple problem: "Can we reliably extract the data from an Excel file?" It's a relatively straightforward problem, and none of the hidden style complexities were relevant.

Since that point, the scope expanded quite a bit as it became apparent that we could do much more. The underlying approach allowed for all kinds of cool developments (since we had a unified format, we could build one xlsx writer that "just worked" with XLS and other inputs). But now the downsides are becoming clear.

Should we have an API? If so, what does it look like?

If we continue with the direct object manipulation approach, there are a few ways to go:

1) C-like "pointer" manipulation: storing numbers to be used as references in other structures.

2) Data duplication: store styles in many places, push the complexity to the writing functions
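
To make the two options concrete, here is a minimal sketch of the same two bold cells in each layout. The field names (styles, cells, s, v) and the helper functions are illustrative assumptions, not the library's actual structures.

```javascript
// Option 1: "pointer" style. Cells store an index into a shared style table.
var pointerSheet = {
  styles: [{ bold: false }, { bold: true }],
  cells: { A1: { v: "foo", s: 1 }, B1: { v: "bar", s: 1 } }
};

// Option 2: data duplication. Every cell carries its own copy of the style.
var duplicatedSheet = {
  cells: {
    A1: { v: "foo", s: { bold: true } },
    B1: { v: "bar", s: { bold: true } }
  }
};

// Reading a style needs an extra lookup in the pointer layout...
function getStylePointer(sheet, addr) {
  return sheet.styles[sheet.cells[addr].s];
}
// ...but is a plain property access in the duplicated layout.
function getStyleDuplicated(sheet, addr) {
  return sheet.cells[addr].s;
}

// Changing one cell in the pointer layout must not mutate the shared entry,
// so a new style record is appended and the cell is re-pointed to it.
function setBoldPointer(sheet, addr, bold) {
  var next = Object.assign({}, getStylePointer(sheet, addr), { bold: bold });
  sheet.cells[addr].s = sheet.styles.push(next) - 1;
}
```

In the pointer layout the user (or an API) has to manage the indices; in the duplicated layout the writer has to collapse identical styles back into references when saving.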

Which is preferable?

@nathanathan @amoki @mchapman @corsaronero @artemryzhov @nvcken @m1sta @kingjt @johnyesberg @ulknight @mgcrea @bolemeus @diginfo @sivyr @bmavity @djmax @christocracy @gcoonrod @clayzermk1 @jokerslab : Since you were involved in a discussion related to style, I'd like to hear your opinions on the matter (kinda sad to see so many names on this list). To focus the discussion:

A) should we build an easy-to-manipulate structure or introduce a series of functions to manipulate the object?

B) Should we be working at the attribute level (e.g. set_cell_bold or cell.bold) or use a bitfield (set_cell_style or cell.style)?

nvcken commented 9 years ago

Does anyone here know Aspose.Cells? I think we could use it as a reference for the structure of the style API. Just an idea.

m1sta commented 9 years ago

In the long run I like the easy-to-manipulate structure approach. Create a very flexible abstract representation of a sheet and have readers/writers contain an approach which maps file formats to this abstract representation. When it comes to the abstract representation I definitely prefer cell.style. Seems more easily serializable. Everyone loves to be able to JSON.stringify.

From a practical perspective I'd most like to be able to take a template based approach. Maintaining a separate template for xlsx, xls, xlsb, and csv isn't a huge issue from a dev perspective. Having a simple consistent api to read, clone, and subtly modify worksheet, range, row, column, and cell objects within a loaded template, regardless of format, is what I was really hoping to see. This means that in the short term I'd prefer to see effort put into functions that modify existing objects.

elad commented 9 years ago

I pretty much agree with @m1sta:

I also like the template approach, but I'm a little concerned about binary change when reading and writing the same file. Excel stores some style data as themes and references internal indices of colors and "taint," and I'm not entirely sure other (non-XLS) formats do the same. If we don't care about binary change - that is, we don't mind that the original file says "the color is aqua with a 40% taint" and the written file says "the color is some RGB value" - then I see no issue with going the template route.

m1sta commented 9 years ago

What if all of the template related data was placed in an 'extended' property on each key item?

workbook = {props: {}, references: {}, extended: {},
   sheets: [{props: {}, ranges: {}, extended: {},
      rows: [{props: {}, extended: {},
         cells: [{value: 123, formula: null,
            props: {style: {foreground: [{color: "#0055dd"}]}},
            extended: {xlsx: "the color is aqua with a 40% taint", xlsb: {}}
         }]
      }]
   }]
};

... or something similar, such that the original "the color is aqua with a 40% taint" data is re-used if the extended data exists for the file format being written?

elad commented 9 years ago

Just to make sure I understand, do you mean keep the portable (for lack of better word) representation in props and retain original format-specific representation in extended so that when writing (back?) to that format, the original values can be used?

m1sta commented 9 years ago

Yep.

elad commented 9 years ago

Sounds good to me, especially if we add a toggle to disable it and rely strictly on the portable representation since I recall @SheetJSDev had concerns about introducing cell-level properties due to object size issues (but I might be remembering it wrong).

m1sta commented 9 years ago

A 'write priority' key might be handy so that you can dictate whether to use props or extended (if available), and some kind of 'extended data index' to minimise object verbosity might be good too.

elad commented 9 years ago

I agree.

m1sta commented 9 years ago

Another interesting question for me is whether to separate format information (and other properties) from value information. This might allow for a more efficient data structure.

Store the formatting for the column once, indicate which cells it relates to, store the values as a dense array. Might need an additional, optional, 'compress' step?

That'd mean something more akin to...

values: [["First header", "Second header"], [123, 456]]
meta: {generic:{}, xlsx:{}, xlsb:{}}

... in memory, with some smarter support functions.

elad commented 9 years ago

I think this is, or is similar to, what @SheetJSDev referred to as the C-like "pointer" approach.

I'm not yet sure I prefer it over data duplication, which is a much more natural pick for me if we go the easy-to-manipulate structure route. It might be, though, that I don't understand what you mean by a "compress" step. If we go the separate value/style storage way, what would adding a cell with style look like? What would modifying a cell's style look like?

SheetJSDev commented 9 years ago

Regarding the size issue: In the web browser, you can either do all of the heavy lifting in the main execution thread or with a Web Worker (http://dev.w3.org/html5/workers/). Objects cannot be shared between workers and the main thread, so the worker stringifies the intermediate object and the main thread parses it:

Reducing the stringified object size allows web workers to handle larger files.
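
A rough way to see the effect is to compare the stringified size of the same sheet with per-cell style objects and with styles stored once and referenced by index. Both shapes below are hypothetical and only meant to illustrate the size difference.

```javascript
// Build n cells that all share the same style, in two hypothetical layouts.
function buildDuplicated(n) {
  var cells = [];
  for (var i = 0; i < n; ++i) {
    cells.push({ v: i, s: { bold: true, font: "Arial", size: 16 } });
  }
  return { cells: cells };
}

function buildIndexed(n) {
  var cells = [];
  for (var i = 0; i < n; ++i) cells.push({ v: i, s: 0 });
  return { styles: [{ bold: true, font: "Arial", size: 16 }], cells: cells };
}

// This is the payload a worker would hand to the main thread as a string.
function stringifiedSize(obj) {
  return JSON.stringify(obj).length;
}

var dupSize = stringifiedSize(buildDuplicated(1000));
var idxSize = stringifiedSize(buildIndexed(1000));
console.log(dupSize, idxSize); // the indexed form is the smaller of the two
```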

Regarding values/metadata: As discussed in https://github.com/SheetJS/js-xlsx/issues/126, it's not always possible to map between Excel and JS data types easily. For rich text formatting see https://github.com/SheetJS/js-xlsx/issues/74 -- we still need to find an acceptable form.

@elad FYI The theme tint does not exist in the XLS format

KingJT commented 9 years ago

As I see it, an API would definitely be easier to document than the JS object representation.

elad commented 9 years ago

In #75 I proposed a style object:

{
    bold: true,
    font: 'Arial',
    size: 16,
    fg_color: '#000000',
    bg_color: '#ffffff',
    ...
}

And you (@SheetJSDev) pointed me to XLSB's "text run" concept.

Thinking out loud...

Let's also introduce a text run array:

[
    { f: from_offset, t: to_offset, s: style_object },
    ...
]

Then cell.s stays the style object and cell.x will be the text run array.

Without taking optimization into account just yet, this so far seems reasonable to me.

So, two questions:

mgcrea commented 9 years ago

Also agree with @m1sta regarding the API.

For cell.style, instead of inventing a new object, we should probably directly go for the official CSS spec, with a jQuery-like API (obviously with a strict subset support).

Something like:

myCell.css('background-color', 'red').css('color', 'black').css('font-size', '12px');
myCellB.style = myCell.style;
myCellC.css('color', myCell.css('color'));
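
A minimal sketch of how such a chained getter/setter could behave is shown below. The Cell wrapper and its css method are hypothetical and do not exist in the library; a real version would also need to validate the supported CSS subset.

```javascript
// Hypothetical cell wrapper; neither Cell nor css() exists in the library.
function Cell(value) {
  this.v = value;
  this.style = {};
}

// One argument reads a property; two arguments set it and return the cell
// so that calls can be chained, as in jQuery.
Cell.prototype.css = function (prop, value) {
  if (arguments.length === 1) return this.style[prop];
  this.style[prop] = value;
  return this;
};

var myCell = new Cell("total");
myCell.css("background-color", "red").css("color", "black");

var myCellC = new Cell("copy");
myCellC.css("color", myCell.css("color"));
```

Note that assigning one cell's style object to another cell directly would share a single object between both cells, so a later change to one would affect the other unless the object is copied.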

For the future, I'd love to be able to easily export the worksheet structure to HTML/PDF alongside XLS*. I can imagine having some pre-defined styles (a bit like markdown styles) for our worksheets (GitHub spreadsheet, etc.).

m1sta commented 9 years ago

+1 for CSS as the basis for the abstract styling model

DigitalMachinist commented 9 years ago

Agreed. +1 for CSS here as well. If you're going to abstract away the details of particular file formats and go with something uniform, CSS really should be your model for this. There's hardly anything else that universal to draw from.

On Mon, Oct 6, 2014 at 9:52 AM, m1sta notifications@github.com wrote:

+1 for CSS as the basis for the abstract styling model


elad commented 9 years ago

I agree with using CSS as the model for the API as well. I'm not sure .css is a good name for the function though (is it misleading?), maybe .style is a better name if we store the actual object in .s. No strong preference either way.

How should we handle text runs? Styling can be applied to parts of a cell's text. For the setter extra parameters might work (cell.css(attr, value, [start_offset], [end_offset])), but the getter expects only one parameter. Should there instead be a .css_run function? Any other ideas?

SheetJSDev commented 9 years ago

Keep in mind Excel has support for double underline, and AFAICT there is no direct CSS equivalent.

For reference, the rich text types are enumerated in the test file https://github.com/SheetJS/test_files/blob/master/rich_text_stress.xlsx

elad commented 9 years ago

I opened rich_text_stress.xlsx and noticed that it doesn't show any style for outline and shadow at least in Excel 2010. I also tried saving it as a web page and Excel notified me some features aren't compatible with the format, and sure enough there was no outline, shadow, double, or accounting underline styling. By the way, Preview on Mac OS X doesn't show subscript, superscript, outline, shadow, double, or accounting underline styling.

Maybe we should enumerate all of the style properties a cell can have before deciding how to represent and access them. Here's a quick table mapping some Excel styles to CSS; I will update it as necessary.

Style | CSS
--- | ---
Normal | - / font-weight: normal;
Bold | font-weight: bold;
Italic | font-style: italic;
Underline | text-decoration: underline; text-underline-style: single; (?)
Size | font-size: <size>;
Strike | text-decoration: line-through;
Subscript | vertical-align: sub; font-size: smaller;
Superscript | vertical-align: super; font-size: smaller;
Outline | text-effect: outline; (?)
Shadow | text-shadow: auto; (?)
Double underline | text-decoration: underline; text-underline-style: double; (?)
Accounting underline | text-decoration: underline; text-underline-style: single-accounting; (?)
Double accounting underline | text-decoration: underline; text-underline-style: double-accounting; (?)
Background color | background-color: <color>;
Foreground color | color: <color>;
Pattern | ?

SheetJSDev commented 9 years ago

I made that file in Excel 2011 and it's possible that the Windows versions do not support the missing forms.

Oddly, iOS Numbers actually renders them: http://i.imgur.com/YsCJNgb.jpg

IIRC Excel actually uses a CSS attribute like text-decoration for the features not supported in the browser (with nonstandard values). We could just replicate that.

As far as text runs are concerned, we could mirror the VBA interface as I noted in https://github.com/SheetJS/js-xlsx/issues/75#issuecomment-49314941

Sheets("Sheet1").Range("B3").Characters(16,9).Font.Italic = True

(Requesting a set of characters would return an object whose getters and setters work with the original text)

mgcrea commented 9 years ago

For the few styles that would not directly match the css spec we could use custom vendor prefixes:

text-decoration: -ms-xlsx-accounting-underline;

Like what we have to use today for flexbox:

display: -webkit-flex;

elad commented 9 years ago

I really don't like the idea of using custom vendor prefixes for internal style representation. :/

The goal as I understand it is to create a common format, that is, an internal format that can be externalized to XLSX, HTML, PDF, etc. If CSS isn't a (near-)perfect answer to the question of internal representation, then we're abusing it. By introducing vendor-specific prefixes into the internal representation I would argue we go against the "common format" concept.

Consider the object pollution you might get. How many vendors are there? Do we want five or six lines just to be able to add accounting underline? What if there are no equivalents for one or more vendors? It also doesn't solve the problem of an actual API. Instead of cell.s.underline = true; or cell.style('underline', true); you get:

cell.css({
    'text-decoration': '-ms-xlsx-accounting-underline',
    // ...
});

So I'm again not entirely convinced CSS is what we want here for either API or internal representation.

mgcrea commented 9 years ago

@elad, vendor prefixes would only be used for xls(*) styles not easily matched by pure css props. I don't see where we would end up with multiple lines, there would only be one specific custom vendor -ms-xls. For the HTML export we could automatically convert these "missing" CSS prefixed values to classes, that could be handled on the client/theme side (with CSS).

Anyway, that was just a quick idea, so it might not be the best thing to do.

elad commented 9 years ago

I see what you mean about multiple lines - if the internal representation agrees on a single vendor prefix then sure, we use that. A vendor-specific style ("accounting underline") is mapped to a vendor-specific CSS property ("-ms-xlsx-accounting-underline") invented to support it.

Still, I don't think using a CSS vendor prefix for internal common representation is a good design choice. I think the VBA API mentioned by @SheetJSDev is much cleaner and easier to use. Doesn't it also make more sense to mimic an API that already exists to work with Excel style properties rather than jQuery?

mgcrea commented 9 years ago

@elad being written in JavaScript, I'd say that this project's main audience is clearly web developers, that for the most part have some jQuery-like experience. VBA-fluent developers are getting very rare (at least in my Web/NodeJS area).

elad commented 9 years ago

That's a fair point, but I don't think jQuery's popularity should influence all APIs designed for the web and/or node.js. jQuery is a moving target and there's a constant back-and-forth of ideas and concepts, for example Ajax promises.

In any case, considering CSS doesn't offer a direct mapping to cell/text styles, requires vendor-specific properties and values, and in my opinion looks a lot clunkier than an existing API that does the same, I'm not convinced it's a good choice here for either internal representation or function interface.

Now letting others chime in and @SheetJSDev can decide. :)

SheetJSDev commented 9 years ago

> @elad being written in JavaScript, I'd say that this project's main audience is clearly web developers

@mgcrea There are strategic reasons unrelated to the audience. Writing code to solve individual features is relatively straightforward. The hard part is finding real-world files and strange corner cases. The neat thing about a JavaScript and HTML5 solution is that pretty much anyone can try it on their files (there are no security issues because the files and data are never sent to a server, and there are no installation issues since no external plugins are required). If we started this in C or python or Java, people would either have to install something or send files to a remote server (and we would have far fewer testers). And of course, thanks to node, we can also write server processes and neat tools like the command-line "j".

> I don't think jQuery's popularity should influence all APIs designed for the web and/or node.js

There is no real culture of JS in areas like scientific computing or data analysis, so we are starting from a blank slate. Since future projects may turn to our example, it's better to discuss now.

m1sta commented 9 years ago

With regard to formatting substrings, I agree with @elad (I think). Assume the property in question is an array, always. Very easy to [].join()

This feels like another situation where storing the values and the formatting in two separate but mirrored structures makes sense to me. In the values array we might see a cell represented as ["First", "Second"] and then in the formatting array we might see ["color: blue", "color: red"] or ["<span style='color:red'>{#}</span>", "<span style='color:red'>{#}</span>"]. You could just as easily assume direct CSS as the format if the first character in the format string isn't an <.

Also, I said CSS earlier, but I wonder whether we should be thinking LESS instead?

elad commented 9 years ago

Here are some useful links from Excel's API documentation:

In the example code @SheetJSDev posted:

Sheets("Sheet1").Range("B3").Characters(16,9).Font.Italic = True

The Range object is returned by Range("B3") and the Characters object is returned by Characters(16,9). The Range object contains both Font and Interior. The Characters object contains only Font. The Interior object seems to be cell-level styling and is where the pattern is kept. The Font object has the stuff we discussed above.

This feels like a very clean and simple API. Why do we insist on either CSS or LESS for this? :)

@SheetJSDev, I'm interested in your thoughts on this since I assume you're most familiar with the actual specification (not just Excel's implementation of it).

SheetJSDev commented 9 years ago

Finally in front of a computer :)

Using Excel 2011 with the rich_text_stress.xlsx file, copy the cells A1:B15 and take a peek at the clipboard (using https://www.npmjs.org/package/pb, run pb html). I've saved the content to a gist.

@elad @mgcrea Excel encodes the text attributes as follows:

In fact, it appears that they use the CSS class names xl### where the number directly corresponds to style records and they use the vendor prefix mso-.

@m1sta The major issue I have with storing the formatting information separately from the text information is that keeping a consistent structure seems to be messier than necessary. Consider the text "foobar" where "bar" is bolded. Now you go back and change the underlying text to "foobarbazquz". How does the style update? What happens if the style is inconsistent with the underlying text?

A direct translation of how XLS and XLSB store rich text would look like this:

var text_run = [
  { t: "foo", s: CELL_BOLD | CELL_ITALICS },
  { t: "bar", s: CELL_BOLD },
  { t: "baz", s: CELL_OUTLINE },
  { t: "qux", s: CELL_DOUBLE_UNDERLINE }
]

That way, the raw text would be text_run.map(function(x) { return x.t; }).join("").
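
The constants in that snippet are not defined anywhere in this thread. A minimal runnable sketch, assuming they are single-bit flags (the values below are made up for illustration), could look like this:

```javascript
// Hypothetical flag values; each style occupies one bit.
var CELL_BOLD = 1 << 0;
var CELL_ITALICS = 1 << 1;
var CELL_OUTLINE = 1 << 2;
var CELL_DOUBLE_UNDERLINE = 1 << 3;

var text_run = [
  { t: "foo", s: CELL_BOLD | CELL_ITALICS },
  { t: "bar", s: CELL_BOLD },
  { t: "baz", s: CELL_OUTLINE },
  { t: "qux", s: CELL_DOUBLE_UNDERLINE }
];

// Raw text is the concatenation of the run texts.
function rawText(runs) {
  return runs.map(function (x) { return x.t; }).join("");
}

// Test a single flag on a run with a bitwise AND.
function hasStyle(run, flag) {
  return (run.s & flag) !== 0;
}

// Editing a run's text keeps its style attached, so the two cannot drift apart.
text_run[1].t = "BAR";
```

Because each run owns both its text and its style, changing the text never leaves a style entry pointing at offsets that no longer exist.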

@elad The Excel object model is pretty sensible and works neatly if you stick to assignment. However, getting properties isn't quite as neat. For example, consider the text "foobar" where "bar" is bolded but "foo" is not. Is the substring "ob" bold or not?

m1sta commented 9 years ago

@SheetJSDev Before the change I'd assume you'd have a structure like this...

cellData = ["foo", "bar"]
cellStyle = [null, "font-weight:bold"]

Assuming you're directly modifying the data structure, and not using the api, you'd probably end up with...

cellData = ["foobarbazquz"]
cellStyle = [null, "font-weight:bold"] 
//or maybe cellStyle = [null, ".cellBold"] if a class had been defined

The second item in the cellStyle array is now temporarily redundant. When the data structure is serialised you could have a flag to control whether redundant cell formatting was persisted or discarded. Alternatively you could...

cellData = ["foo","barbazquz"]
cellStyle = [null, "font-weight:bold"] 

The difference, I think, is very easy to grasp.

The best thing about this is that a new developer could learn to read/modify files lickety-split, then learn about formatting later. I suspect parsing the structure would be faster too (at least in v8).

elad commented 9 years ago

@SheetJSDev some of those CSS styles seem non-standard and aren't supported by at least Chrome, Firefox, and Safari. In other words, we would be encouraging developers to use non-standard and vendor-specific CSS...

I agree that separating value and style might cause an unnecessary mess and is a lot harder to keep in sync than just style that's directly attached to objects.

> @elad The Excel object model is pretty sensible and works neatly if you stick to assignment. However, getting properties isn't quite as neat. For example, consider the text "foobar" where "bar" is bolded but "foo" is not. Is the substring "ob" bold or not?

In Word and Excel this is determined by the first character in the selection. In your example, "ob" isn't bold because "o" isn't bold. If you select "ob" and hit ctrl-b, "ob" will become bold. If you do the other half - that is, "foobar" with "foo" bolded and "bar" not - then "ob" will be bold, and ctrl-b will make it normal again.
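
The first-character rule described above can be sketched over an array of text runs. The run shape and the helpers runAt and isBoldAt are assumptions made for illustration, and this is not necessarily the algorithm the comment had in mind.

```javascript
// Runs are { t: text, bold: boolean }; the shape is an assumption.
var runs = [
  { t: "foo", bold: false },
  { t: "bar", bold: true }
];

// Return the run containing the character at `offset`, or null if out of range.
function runAt(runs, offset) {
  if (offset < 0) return null;
  var pos = 0;
  for (var i = 0; i < runs.length; ++i) {
    if (offset < pos + runs[i].t.length) return runs[i];
    pos += runs[i].t.length;
  }
  return null;
}

// A selection reports the style of its first character.
function isBoldAt(runs, start) {
  var run = runAt(runs, start);
  return run !== null && run.bold;
}

console.log(isBoldAt(runs, 2)); // "ob" starts inside "foo": false
console.log(isBoldAt(runs, 3)); // a selection starting at "b" of "bar": true
```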

The algorithm could look like:

Likewise:

zelibobla commented 9 years ago

Guys, thank you for your great work! Sorry for interrupting your discussion. I read several similar topics here about styling xlsx (I'm missing it very much), but I couldn't figure out whether there are any plans to implement such functionality. And if these plans exist, are there any ideas when? :)

In my case the task is to keep the original styles of a template after some data is inserted and the file is saved.

elad commented 9 years ago

Style is one of two issue scopes (the other being dates) that keep coming up, so my guess is that it's definitely going to be addressed. @SheetJSDev hinted at it being worked on actively, so the best I could say is "soon."

I hope to make some time soon to work on it as well. The only thing that's unclear to me is how to handle the stuff that's currently being written as style - it's a huge hardcoded string and some of it must be dynamic. Everything else (internal representation, writing the style back) is easy.

viteksafronov commented 9 years ago

I tried to understand everything you guys wrote here, but my question is simple: is it possible to set column widths in the xlsx files I write?

DamianRodziewicz commented 9 years ago

Hi guys, any update on the style issue?

leefernandes commented 9 years ago

Is it possible to add cell styles to new sheets?

djfishe commented 9 years ago

Hi guys, I was wondering if there are any updates on this? Thanks.

bchr02 commented 9 years ago

:+1:

pietersv commented 9 years ago

Even a limited functionality to save some type of formatting (e.g. bold, italic, color, font size, fill, border) would be extremely useful, even if it didn't span the universe of possible input formats that might be obtained from parsing.

Cases like "foobar" with "ob" bolded seem like edge cases compared to the alternatives of no formats at all. This would even be useful if implemented in a bootleg fork of this library, providing a means of addressing the use case without incurring long-term promises.

DamianRodziewicz commented 9 years ago

+1

pietersv commented 9 years ago

There are three awesome Node.js Excel libraries which collectively offer key features, but choosing one is like playing "Rock-Paper-Scissors":

As a foundation for generating Excel documents, JS-XLSX seems like a great choice, as it has a very clean design philosophy, rigorous attention to licensing and IP, detailed testing and an active development community.

Would love to add the ability to write some minimal styling -- font weight, font color, fill color, merge, and borders -- and offer that as a pull request.

However, per the issue above there has been a substantial bit of thought put into this already, and developing this is outside my focus.

Rather than take a proverbial gift horse and then ask for a gold [italic, Helvetica, bordered] bridle on it, my thought is to sponsor dev of this feature as a Freelancer project that yields a useful public contribution to this project.

Open to any suggestions.

SheetJSDev commented 9 years ago

@gradualstudent and everyone else: sorry for the apparent lack of public activity on this area (reading and writing cell styles, text styles, etc). A quick patch for specific features like bold and italics is pretty straightforward (https://github.com/SheetJS/js-xlsx/issues/74#issuecomment-46981775 gives a rough outline).

On a side note: there is a set of round-trip tests https://github.com/SheetJS/js-xlsx/blob/master/test.js#L696 which read a file, write it, read it again and confirm that certain properties are the same. A similar set of tests should be written for styles and themes and other features.

pietersv commented 9 years ago

Aha, it seems very feasible to follow that comment in issue #74.

For a Common Spreadsheet Format, using CSS seems great (per @elad's Oct 6 comment above), with the code translating these to Excel keywords (per @SheetJSDev's Oct 7 comment above).

As a start, I put together a simple example case for testing using CSS labels for the .s attribute: https://gist.github.com/pietersv/d931cdf8cc9dc48919b4

pietersv commented 9 years ago

Have extended the library to allow fairly general formats, at https://github.com/protobi/js-xlsx. This allows authoring a new document with cell styles specified in the .s attribute, as well as preserving styles read from XLSX.parseFile(...).

Also extended your helpful Gist at https://gist.github.com/SheetJSDev/88a3ca3533adf389d13c into a Workbook convenience class at https://github.com/protobi/workbook.

Example

var XLSX = require('xlsx');
var Workbook = require('./workbook')(XLSX);

var workbook = new Workbook()
    .addRowsToSheet("Main", [
      ["This is a merged cell"],
      [
        {"v": "Blank"},
        {"v": "Red", "s": {fill: { fgColor: { rgb: "FFFF0000"}}}},
        {"v": "Green", "s": {fill: { fgColor: { rgb: "FF00FF00"}}}},
        {"v": "Blue", "s": {fill: { fgColor: { rgb: "FF0000FF"}}}}
      ],
      [
        {"v": "Default"},
        {"v": "Arial", "s": {font: {name: "Arial", sz: 24}}},
        {"v": "Times New Roman", "s": {font: {name: "Times New Roman", sz: 16}}},
        {"v": "Courier New", "s": {font: {name: "Courier New", sz: 14}}}
      ],
      [
        0.618033989,
        {"v": 0.618033989},
        {"v": 0.618033989, "t": "n"},
        {"v": 0.618033989, "t": "n", "s": { "numFmt": "0.00%"}},
        {"v": 0.618033989, "t": "n", "s": { "numFmt": "0.00%", fill: { fgColor: { rgb: "FFFFCC00"}}}},
        [(new Date()).toLocaleString()]
      ]
    ]).mergeCells("Main", {
      "s": {"c": 0, "r": 0 },
      "e": {"c": 2, "r": 0 }
    }).finalize();
XLSX.writeFile(workbook, '/tmp/wb.xlsx');

Read and write

It's also possible to read and re-write a document:

var wb = XLSX.readFile(__dirname + '/wb.xlsx', {cellStyles: true});
XLSX.writeFile(wb, __dirname + '/wb-1.xlsx');

Some info gets lost on parsing styles. It seems like only fill colors get expressed in the CellXf styles. Thus information about fonts, borders and number formats gets lost when reading. I recommend that we extend CSF such that cell styles have four attributes as below.

cell.s = {  "font": {}, "fill": {}, "numFmt": {}, border: {} }

I convert cell.s = { fill: cell.s } for backward compatibility with the current parseFile() (only if the style is an object, has patternStyle or fgColor and doesn't have font, border or numFmt attributes).
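
The backward-compatibility rule above could be sketched roughly as follows. The function name normalizeStyle is hypothetical and the property names are taken from the comment as written; this is not the code from the branch.

```javascript
// Wrap a fill-only style into the proposed { font, fill, numFmt, border }
// layout; leave anything else untouched.
function normalizeStyle(s) {
  if (s === null || typeof s !== "object") return s;
  var looksLikeFill = "patternStyle" in s || "fgColor" in s;
  var alreadyNew = "font" in s || "border" in s || "numFmt" in s;
  return looksLikeFill && !alreadyNew ? { fill: s } : s;
}

// A style as the current parser would produce it (fill attributes only).
var legacy = { fgColor: { rgb: "FFFF0000" } };
// A style already in the proposed four-attribute layout.
var modern = { font: { name: "Arial" }, fill: { fgColor: { rgb: "FFFF0000" } } };

console.log(JSON.stringify(normalizeStyle(legacy)));
console.log(normalizeStyle(modern) === modern);
```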

Known gaps

CSS vs JS

Now that I've gotten into it I can see the issues you've been discussing earlier. CSS seems like such a natural way to express cell styles. CSS is a style language. We'd like to abstract away from the details of OpenXML. Most of the cell styles have natural CSS equivalents. And Excel represents cell styles in a way analogous to CSS classes to avoid redundancy.

Excel has some concepts not in CSS:

But the current parseFile already works pretty well and has a format that parallels Excel nicely. For now I recommend sticking with the current CSF language and completing support for reading font, border and numFmt attributes when parsing an existing element. We can later convert to CSS but that's a bigger change.

bchr02 commented 9 years ago

@pietersv this is great work! and the documentation is perfect! Thank you

SheetJSDev commented 9 years ago

@pietersv excellent work! (was it straightforward?)

The "fun" part comes when trying to work with the other formats, but I do see tremendous value in experimenting with an XLSX-specific version.

pietersv commented 9 years ago

Thanks! The SheetJS code is elegant and concise, which made it a real pleasure to map out and learn from.

There are minimal changes (under 10 lines) to xlsx.js to:

The major change is adding a new StyleBuilder class which reads the XLSX CSF object and exposes toXML() and getStyleIndex(format) methods. Right now it's in a separate file.
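
The StyleBuilder code itself is not shown in this thread. As a rough illustration of what a getStyleIndex(format) method might do, the sketch below deduplicates style objects into a shared table and returns an index for each. The name StyleTable, the stableKey helper and all internals are assumptions, not the actual implementation.

```javascript
// Hypothetical style table: identical style objects share one index.
function StyleTable() {
  this.styles = [];                  // unique style objects, in insertion order
  this.lookup = Object.create(null); // serialized style -> index into styles
}

// Serialize with sorted keys so that key order does not affect identity.
function stableKey(value) {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(stableKey).join(",") + "]";
  return "{" + Object.keys(value).sort().map(function (k) {
    return JSON.stringify(k) + ":" + stableKey(value[k]);
  }).join(",") + "}";
}

StyleTable.prototype.getStyleIndex = function (format) {
  var key = stableKey(format);
  if (!(key in this.lookup)) {
    this.lookup[key] = this.styles.push(format) - 1;
  }
  return this.lookup[key];
};

var table = new StyleTable();
var a = table.getStyleIndex({ font: { name: "Arial", sz: 24 } });
var b = table.getStyleIndex({ font: { sz: 24, name: "Arial" } });
var c = table.getStyleIndex({ fill: { fgColor: { rgb: "FFFF0000" } } });
```

This lets callers keep the duplicated, easy-to-edit style objects on each cell while the writer emits each distinct style once and references it by index.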

This tiny bit makes me appreciate what it must have taken to build this great library. Who knew that...

Agree this branch is experimental. Open to suggestions. Looking forward to using this in practice and addressing feedback, with aim of readying it for a pull request.