catamphetamine / read-excel-file

Read *.xlsx files in a browser or Node.js. Parse to JSON with a strict schema.
https://catamphetamine.gitlab.io/read-excel-file/
MIT License
301 stars, 52 forks

sheet naming issue temp fix #95

Closed sabbiu closed 3 years ago

sabbiu commented 3 years ago

I found an Excel file where fileNames.sheets[sheetRelationId] = /xl/worksheets/sheet1.xml

catamphetamine commented 3 years ago

Hmm. Can you tell more about the file. Do you know what software was used to generate it?

sabbiu commented 3 years ago

Unfortunately, the file I was given cannot be shared because it contains proprietary information. I am trying to reproduce the issue by replacing its contents with test data, but so far without success.

I will add further information as soon as I find some way of reproducing the issue.

catamphetamine commented 3 years ago

> Unfortunately, the file I was given cannot be shared because it contains proprietary information. I am trying to reproduce the issue by replacing its contents with test data, but so far without success.

No need to share it, just ask them what software they used to generate the file.

sabbiu commented 3 years ago

Sure, I'll get back to you regarding that.

catamphetamine commented 3 years ago

So far, I'm not convinced that a workaround like this should exist. I'd blame it on the software that was used to generate the file, unless that software is something popular, which doesn't seem to be the case. Perhaps the file was generated programmatically by some other library or a home-written script.

sabbiu commented 3 years ago

AFAIK, it was generated using software. I was also able to import this file into Google Sheets, and it worked fine. I will comment as soon as I am provided with more details.

catamphetamine commented 3 years ago

You could also check whether other libraries can open such files, for example SheetJS: https://sheetjs.com/demo

sabbiu commented 3 years ago

It works with sheetjs as well.

catamphetamine commented 3 years ago

See if this code change works:


function parseFileNames(content, xml) {
  const document = ...
  const fileNames = ...

    const filePath = relationship.getAttribute('Target')
    // There has been one weird case when file path started with a `/`.
    // https://github.com/catamphetamine/read-excel-file/pull/95
      .replace(/^\//, '')

  ...
}

in readXlsx.js

sabbiu commented 3 years ago

It works, but I had to make changes here,

https://github.com/catamphetamine/read-excel-file/blob/master/source/read/readXlsx.js#L583

const filePath = relationship.getAttribute('Target').replace(/^\/xl\//, '')

catamphetamine commented 3 years ago

That would be strange, because the issue you're having seems to be a leading slash that shouldn't be there. Why are you also stripping the xl part?

sabbiu commented 3 years ago

As you can see here,

https://github.com/catamphetamine/read-excel-file/blob/master/source/read/readXlsx.js#L87

It is prepending the xl part, and thus I have to remove it during parsing :sweat_smile:
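
To illustrate the two path forms being discussed, here is a minimal sketch. It assumes that a `Target` starting with `/` is already relative to the archive root, while any other `Target` is relative to the `xl/` directory. The helper name `resolveTargetPath` and its `baseDirectory` parameter are hypothetical and are not part of read-excel-file.

```javascript
// Hypothetical helper for illustration only; not the library's actual code.
// Resolves a relationship `Target` value to a path inside the .xlsx archive.
function resolveTargetPath(target, baseDirectory = 'xl/') {
  // Assumption: a leading `/` means the target is given from the archive root,
  // so only the slash is dropped and nothing is prepended.
  if (target.startsWith('/')) {
    return target.slice(1)
  }
  // Otherwise the target is treated as relative to the base directory.
  return baseDirectory + target
}

// The usual form and the form from the problematic file resolve to the same path.
console.log(resolveTargetPath('worksheets/sheet1.xml'))
console.log(resolveTargetPath('/xl/worksheets/sheet1.xml'))

// Stripping only the slash and then prepending `xl/` anyway duplicates the directory.
console.log('xl/' + '/xl/worksheets/sheet1.xml'.replace(/^\//, ''))
```

The last line shows why removing only the leading slash is not enough when `xl/` is prepended unconditionally afterwards: the result is `xl/xl/worksheets/sheet1.xml`.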

catamphetamine commented 3 years ago

Published read-excel-file@5.2.7: https://gitlab.com/catamphetamine/read-excel-file/-/commit/f1cfbe40d1ae80696e5da1c6253373383e836349