Key Error: data.ndc - Githubissues

rickmcgeer commented 1 week ago

I'm getting the following error from reading an NDA file generated by a backup from BTS_CLIENT_8.0.0:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python310\lib\site-packages\NewareNDA\NewareNDA.py", line 50, in read
    return read_ndax(file, software_cycle_number, cycle_mode)
  File "C:\Python310\lib\site-packages\NewareNDA\NewareNDAx.py", line 61, in read_ndax
    data_file = zf.extract('data.ndc', path=tmpdir)
  File "C:\Python310\lib\zipfile.py", line 1628, in extract
    return self._extract_member(member, path, pwd)
  File "C:\Python310\lib\zipfile.py", line 1667, in _extract_member
    member = self.getinfo(member)
  File "C:\Python310\lib\zipfile.py", line 1441, in getinfo
    raise KeyError(
KeyError: "There is no item named 'data.ndc' in the archive"

The file is a .ndax file generated as a backup, and this is happening with multiple files. Per the home page of the module, I'm filing this ticket, but honestly it could be a malformed file -- with this tester I really don't know. The file is big (> 49MB) and I can't generate a smaller file with the same behavior -- if you can provide troubleshooting pointers I'll be happy to give it a whirl. Thanks

rickmcgeer commented 1 week ago

I opened the file with zipfile.PyZipFile (per line 37 in read_ndax) and found this:

>>> import zipfile
>>> zf = zipfile.PyZipFile("127.0.0.1_65_5_7_2818580302.ndax")
>>> zf.namelist()
['_rels/.rels', '[Content_Types].xml', 'docProps/app.xml', 'docProps/core.xml', 'docProps/custom.xml', 'xl/charts/chart1.xml', 'xl/charts/chart2.xml', 'xl/drawings/drawing1.xml', 'xl/drawings/_rels/drawing1.xml.rels', 'xl/sharedStrings.xml', 'xl/styles.xml', 'xl/workbook.xml', 'xl/_rels/workbook.xml.rels', 'xl/worksheets/sheet1.xml', 'xl/worksheets/sheet2.xml', 'xl/worksheets/_rels/sheet2.xml.rels', 'xl/worksheets/sheet3.xml', 'xl/worksheets/sheet4.xml']
>>>

d-cogswell commented 1 week ago

Hi @rickmcgeer that's a file structure that I haven't seen before. It looks like the data might be stored as XML instead of binary files (data.ndc). If you open these files in BTSDA, do you see all of the cycling data?

If you can share a file, I would be interested to look at it. You can either email a download link, or email as an attachment if the file size is not too big.

rickmcgeer commented 1 week ago

@d-cogswell thanks. I'm doing this for a company I work with and I'll ask them for permission to share some data. They (and I) are highly motivated, so I'll get back to you on this soon.

rickmcgeer commented 1 week ago

@d-cogswell I talked to my colleagues and we'll pick something to update with Friday. In the meantime, if you have suggestions for other diagnostics I can run, I would very much like to try them.

d-cogswell commented 1 week ago

If you unzip the ndax, can you see your data in any of the xml files? You should be able to open them with a text editor or the python xml module.
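As a rough illustration of that check, the sketch below lists the XML members of an archive, largest first, with a short preview of each so the data-bearing files stand out. It uses only the standard library; `summarize_xml_members` and `preview_bytes` are hypothetical names, not part of NewareNDA.

```python
import zipfile


def summarize_xml_members(path, preview_bytes=200):
    """Return (name, uncompressed size, text preview) for each .xml member."""
    summary = []
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if not info.filename.endswith(".xml"):
                continue
            # Read only the first few bytes so large sheets stay cheap to inspect.
            with zf.open(info) as fh:
                head = fh.read(preview_bytes).decode("utf-8", errors="replace")
            summary.append((info.filename, info.file_size, head))
    # Largest members first: the bulk data is most likely to be there.
    summary.sort(key=lambda item: item[1], reverse=True)
    return summary
```

Calling `summarize_xml_members("127.0.0.1_65_5_7_2818580302.ndax")` and printing the result should show which sheets are large enough to hold the cycling data.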

rickmcgeer commented 1 week ago

I can, using ZipFile. The big items are xl/worksheets/sheet{n}.xml, which appear to be verbose XML versions of the Info/Cycle/Statis/Detail sheets, though it's hard to tell because there are no header rows. So the read looks easy, but we'll have to find the headers for the columns.

d-cogswell commented 1 week ago

Sounds good. In the meantime, I pushed a change to development that should raise a NotImplementedError for these files.
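For anyone who wants a similar guard in their own scripts, here is a minimal sketch of one way to tell the two layouts apart before reading. It is not the change pushed to development; `check_ndax_layout` is a hypothetical helper, and treating `[Content_Types].xml` plus an `xl/` folder as the marker of a workbook-style archive is an assumption based on the listing above.

```python
import zipfile


def check_ndax_layout(path):
    """Return the member names if 'data.ndc' is present, else raise."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
    if "data.ndc" in names:
        return names
    # Assumption: these two markers identify a spreadsheet-style archive
    # like the one listed earlier in this thread.
    if "[Content_Types].xml" in names and any(n.startswith("xl/") for n in names):
        raise NotImplementedError(
            "Archive has a spreadsheet (xl/...) layout and no 'data.ndc'."
        )
    raise KeyError("There is no item named 'data.ndc' in the archive")
```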

rickmcgeer commented 1 week ago

Unfortunately, my trip over there was delayed, but I have done some experimentation (I will try to get you a sample next week). These things are collections of XML files, each of which is given by the SpreadsheetML format: http://schemas.openxmlformats.org/spreadsheetml/2006/ which I think are the formats for xlsx files.
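If the sheets really are SpreadsheetML, a standard-library sketch along the following lines might be enough to pull the rows out without Excel. It assumes the usual SpreadsheetML main namespace and that text cells point into `xl/sharedStrings.xml`. Inline strings are not handled, and empty cells are skipped, so column positions are not preserved. `read_shared_strings` and `iter_sheet_rows` are hypothetical names.

```python
import xml.etree.ElementTree as ET
import zipfile

# SpreadsheetML main namespace used by worksheet and shared-string parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}
ROW_TAG = "{%s}row" % MAIN_NS


def read_shared_strings(zf):
    """Return the shared-string table as a list (empty if the part is absent)."""
    if "xl/sharedStrings.xml" not in zf.namelist():
        return []
    root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
    # Each <si> is one string, possibly split across several <t> runs.
    return [
        "".join(t.text or "" for t in si.iterfind(".//m:t", NS))
        for si in root.iterfind("m:si", NS)
    ]


def iter_sheet_rows(zf, member):
    """Yield each worksheet row as a list of cell values (strings or None)."""
    shared = read_shared_strings(zf)
    with zf.open(member) as fh:
        # iterparse streams the file, which matters for very large sheets.
        for _, elem in ET.iterparse(fh, events=("end",)):
            if elem.tag != ROW_TAG:
                continue
            row = []
            for cell in elem.iterfind("m:c", NS):
                value = cell.find("m:v", NS)
                text = value.text if value is not None else None
                if cell.get("t") == "s" and text is not None:
                    text = shared[int(text)]  # index into the shared strings
                row.append(text)
            yield row
            elem.clear()  # release the parsed cells of this row
```

The `ZipFile` has to stay open while the generator is consumed, for example `with zipfile.ZipFile(path) as zf: rows = list(iter_sheet_rows(zf, "xl/worksheets/sheet1.xml"))`.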

Grimler91 commented 5 days ago

Hi, from the names above it looks like the file is in xlsx format. Please try opening it with Excel (after renaming it to .xlsx if necessary).

If I run the same commands on an xlsx file exported through the BTS client, I get something very similar:

>>> import zipfile
>>> zf = zipfile.PyZipFile("file-exported-from-bts.xlsx")
>>> zf.namelist()
['_rels/.rels', '[Content_Types].xml', 'docProps/app.xml', 'docProps/core.xml', 'docProps/custom.xml', 'xl/drawings/drawing1.xml', 'xl/drawings/_rels/drawing1.xml.rels', 'xl/media/image1.jpeg', 'xl/sharedStrings.xml', 'xl/styles.xml', 'xl/workbook.xml', 'xl/_rels/workbook.xml.rels', 'xl/worksheets/sheet1.xml', 'xl/worksheets/sheet2.xml', 'xl/worksheets/sheet3.xml', 'xl/worksheets/sheet4.xml', 'xl/worksheets/sheet5.xml', 'xl/worksheets/sheet6.xml', 'xl/worksheets/sheet7.xml', 'xl/worksheets/sheet8.xml', 'xl/worksheets/_rels/sheet8.xml.rels']

Solid-Energy-Systems / NewareNDA

Key Error: data.ndc #91