Closed mjdb3d closed 7 years ago
Can you share a file?
If you are reading from a file, the formatted text should be in the w
field of the cell (the v
field stores the raw numeric value)
#include "shim.js";
#include "jszip.js";
var JSZip = app.JSZip;
#include "xlsx.flow.js";
var jsx_path = File($.fileName).parent+'/';
var infile = File(jsx_path + 'file.xlsx');
infile.open("r");
infile.encoding = "binary";
var data = infile.read();
var workbook = XLSX.read(data, {type:"binary"});
var sheet_name = workbook.SheetNames[0];
var worksheet = workbook.Sheets[sheet_name];
var data = XLSX.utils.sheet_to_json(worksheet, {header:1});
$.writeln(data);
result: 0.01,0.22,-0,-0,0.25,-0,0.224,0.132,0.1755,0.704,,,,,,,,,,,,,,,,,,,,
@mjdb3d The extendscript support is relatively new and we haven't stress tested every feature. Thanks for reporting, we will look into this
@SheetJSDev you’re welcome. I hope it will be fixed soon. Thank you.
This seems to be part of a larger problem (with apparently all excel readers) where the type information is available (I've read the source code), but for some unknown reason the data is abstracted away from the end-user and ignored for output. For example, for this spreadsheet in xlsx:
$40, 12%
I'm expecting one of these 2 outputs from sheet_to_json
:
{[
"$40",
"12%",
]}
// or:
{[
{ type: "CURRENCY", value: "40" },
{ type: "PERCENT", value: "12" },
]}
The actual result is:
{[
"40",
"0.12",
]}
This also results in data-loss for sheet_to_csv
.
Here we are checking the primary type t
:
switch(val.t){
case 'e': continue;
case 's': break;
case 'b': case 'n': break;
default: throw 'unrecognized type ' + val.t;
}
but there is not any checking of the secondary type 's' (currency, percent, etc..)
@InstanceOfMichael can you share a file? I just threw together a sample file:
The CSV and JSON does exactly what's expected. This is from http://oss.sheetjs.com/js-xlsx/ (just drag and drop into the box):
This is from node:
> var XLSX = require('xlsx');
> var wb = XLSX.readFile('608.xlsx');
> XLSX.utils.sheet_to_csv(wb.Sheets.Sheet1);
'$40 ,12%\n' <-- formatted text
> XLSX.utils.sheet_to_json(wb.Sheets.Sheet1, {header:1})
[ [ '$40 ', '12%' ] ] <-- formatted text
> XLSX.utils.sheet_to_json(wb.Sheets.Sheet1, {header:1, raw:true});
[ [ 40, 0.12 ] ] <-- raw numbers
The problem covered in this issue is extendscript not playing nice with the formatting functions. We've run into issues in the past with regular expressions, for example https://github.com/davidchambers/Base64.js/issues/31#issue-216606310 , but if you are using the web browser or node you should get the correct formatted data or raw data.
using { raw: false }
, I am now getting this from my imaginary spreadsheet:
{[
"40",
"12%",
]}
Which is good enough for my purposes. Thank you!
That still looks wrong. Can you share the worksheet object (JSON.stringify(worksheet)
)? If the cell has formatted text (the cell object for 40
has a w
key with value "$40 "
) you should see the same text value in the output.
@InstanceOfMichael thanks for sharing! I can reproduce, it's a bug in the number formatter.
At a high level, Excel doesn't distinguish between "currency" and "percent" and "date" and other cell types. There is a number cell type, and the number format dictates the interpretation.
If you want to see the number format, you can set cellNF:true
in the options object. Then every cell will have a z
key corresponding to the raw number format. If you look at the end of the output you'll see:
"164":"General",
"165":"[$$-409]#,##0.00;[RED]\\-[$$-409]#,##0.00",
"166":"0.00%"
Those are the formats that LibreOffice stored. Normally Excel would store a format like $#,##0.00
for the currency.
http://oss.sheetjs.com/ssf/ lets you test the formats:
using Excel's normal currency:
Using the LO currency format:
@mjdb3d @firas3d thanks for reporting! Finally figured out the issue, and it's a really really strange ExtendScript bug! Consider the following expression:
1 && 2 || 3 && 4 && 5 && 6
In JS this is interpreted as
(1 && 2) || (((3 && 4) && 5) && 6) // --> 2
In ExtendScript Toolkit (you can test in the JavaScript console) and Photoshop/Illustrator the expression is interpreted as
((((1 && 2) || 3) && 4) && 5) && 6 // --> 6
The formatter library uses a complex expression of the same form:
o = c; while(++i < fmt.length && "0#?.,E+-%".indexOf(c=fmt.charAt(i)) > -1 || c=='\\' && fmt.charAt(i+1) == "-" && i < fmt.length - 2 && "0#".indexOf(fmt.charAt(i+2))>-1) o += c;
Since we use similar expressions throughout the codebases we have to make some more changes, but based on some simple tests I'm fairly confident that is the origin of the problem
@SheetJSDev Still getting the same issue in the 0.15.6
version. Any idea what to do? @mjdb3d @InstanceOfMichael
Hi @praveen-wal @mjdb3d @InstanceOfMichael , have you got any solution for this?
Hi, i'm still getting an error while trying to export as percentage. It's adding two extra zeros as tested on this tool:
For example, if you write any number as Percentage like 17 you get 1700%. Any help?
Thanks
The percentage seems to be preserved only if the raw is set to false.
Is it possible to get data from (xlsx) cell that formatted as percentage ? I've tried and I got numbers divided by 100.
xlsx file 12.23% readed data 0.12 Result required 12.23 or 12.23%
thanks in advance.