protobi / js-xlsx

XLSX / XLSM / XLSB (Excel 2007+ Spreadsheet) / ODS parser and writer
http://oss.sheetjs.com/js-xlsx
Other
826 stars 416 forks source link

Add page breaks #13

Closed pietersv closed 8 years ago

pietersv commented 8 years ago

See https://github.com/SheetJS/js-xlsx/issues/128#issuecomment-158239470

pietersv commented 8 years ago

There are two aspects: * Adding row breaks to the xlsx file * Specifying row breaks in the CSF workbook object

@schuetta92 Any suggestions how we should specify page breaks in the Javascript API?

In the very short term, you can add them ad-hoc as follows. Page breaks are described by <rowBreaks> which appears just before <extLst>. The "id" attribute is the row before the break, the "max" attribute is the last row before the next break.

  <rowBreaks count="1" manualBreakCount="1">
    <brk id="8" max="16383" man="1"/>
  </rowBreaks>

Below is code for Node.js, I believe it's possible to rewrite this for the browser too.

var fs = require('fs');
var jszip = require('jszip');
var cheerio = require('cheerio');

var FILENAME = __dirname + '/wb.xlsx';
var OUTFILE = '/tmp/wb.xlsx';

var zip = fs.readFile(FILENAME, function (err, data) {
    if (err) throw err;
    var zip = new jszip(data);
    var sheet = zip.file('xl/worksheets/sheet1.xml').asText();

    var $sheet = cheerio.load(sheet, {xmlMode: true});
var rowBreaks = $sheet('rowbreaks');
if (rowBreaks.length === 0) {
    $extLst = $sheet('extLst');
  // id is the row before the break
    $extLst.before('<rowBreaks count="1" manualBreakCount="1"><brk id="8" max="16383" man="1"/></rowBreaks>')
    }
else {
  // add new <brk> records to
  // update counts in the <rowbreakss> record
    }

zip.file('xl/worksheets/sheet1.xml', $sheet.xml())
var buffer = zip.generate({type: "nodebuffer"})
fs.writeFile(OUTFILE, buffer, function (err) {
    if (err) throw err;
    console.log("open " + OUTFILE)
    console.log("Select View -> Page Layout to see breaks")
    });
});
pietersv commented 8 years ago

I have another reason to add page breaks, so scheduling this now. Would welcome advice on a design question.

Page breaks, grid lines, tab selections, etc. are sheet properties. The Common Spreadsheet Format currently doesn't have a SheetProperties object. Rather, it uses attributes prefixed with !, e.g. wb['!ref'], wb['!cols'] and wb['!merges']

So can think of two ways to add page breaks, grid lines and potentially other sheet properties:

The first approach has the advantage of paralleling what's already there. The second seems more object oriented. I'm not yet sure if new properties with ! are silently ignored if not handled or cause errors.

pietersv commented 8 years ago

Added features for breaks, gridlines, scale and properties. Examples below.

If you want to force a break early, use rowBreaks and colBreaks. If you want to keep it from breaking, need to set the scale to some number smaller than '100'.

Note that the scale is a string and showGridLines has a capital L.

wb.Sheets[sheetName]['!rowBreaks'] = [16,32];
wb.Sheets[sheetName]['!colBreaks'] = [8,16];
wb.Sheets[sheetName]['!pageSetup'] = { scale: '90'};
var defaultCellStyle =  ;
var wopts = { 
   bookType:'xlsx', 
   bookSST:false, type:'binary', 
   defaultCellStyle: { font: { name: "Verdana", sz: 11, color: "FF00FF88"}, fill: {fgColor: {rgb: "FFFFAA00"}}}, 
   showGridLines: false, 
   Props: {
        title: "Goldilocks",
        description: "Girl discovers mysterious cottage in forest ",
        creator:"Traditional folk tale",
        keywords: "Bears; porridge; property rights; iterative optimization",
        subject: "subject"
      }
};