PydPiper / pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library
https://pylightxl.readthedocs.io
MIT License

a question in readxl_get_workbook #65

Closed wangdaye078 closed 2 years ago

wangdaye078 commented 2 years ago

def readxl_get_workbook(fn):
    .......
    for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
        name = tag_sheet.get('name')
        try:
            rId = tag_sheet.get('{' + ns['r'] + '}id')
        except KeyError:
            # the output of openpyxl can sometimes not write the schema for "r" relationship
            rId = tag_sheet.get('id')
        sheetId = int(re.sub('[^0-9]', '', rId))
        wbrels = readxl_get_workbookxmlrels(fn)
        rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]}
    .....

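readxl_get_workbookxmlrels is called inside the loop, so it runs once for every sheet even though its result does not depend on the sheet. With the call moved above the loop: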

wbrels = readxl_get_workbookxmlrels(fn)
for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
    name = tag_sheet.get('name')
    try:
        rId = tag_sheet.get('{' + ns['r'] + '}id')
    except KeyError:
        # the output of openpyxl can sometimes not write the schema for "r" relationship
        rId = tag_sheet.get('id')
    sheetId = int(re.sub('[^0-9]', '', rId))
    rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]}

the function runs faster, because readxl_get_workbookxmlrels is called only once per workbook.
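To illustrate the effect of hoisting the call, here is a minimal, self-contained sketch. It does not use pylightxl's actual functions: `read_workbook_rels`, `read_workbook`, `make_demo_workbook` and the `calls` counter are hypothetical stand-ins, and the demo workbook is a stripped-down zip holding only the two XML parts involved. It assumes the `r` namespace is always present, so the `KeyError` fallback from the original code is left out.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
REL_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
PKG_REL_NS = 'http://schemas.openxmlformats.org/package/2006/relationships'

# Counts how often the relationships part is parsed (for demonstration only).
calls = {'rels': 0}


def read_workbook_rels(zf):
    """Hypothetical stand-in for readxl_get_workbookxmlrels: map rId -> target path."""
    calls['rels'] += 1
    root = ET.fromstring(zf.read('xl/_rels/workbook.xml.rels'))
    return {rel.get('Id'): rel.get('Target')
            for rel in root.findall('{%s}Relationship' % PKG_REL_NS)}


def read_workbook(zf):
    """Collect sheet metadata, parsing the relationships part once, before the loop."""
    ns = {'default': MAIN_NS, 'r': REL_NS}
    root = ET.fromstring(zf.read('xl/workbook.xml'))
    wbrels = read_workbook_rels(zf)  # loop-invariant, so it is hoisted out of the loop
    sheets = {}
    for tag_sheet in root.findall('./default:sheets/default:sheet', ns):
        name = tag_sheet.get('name')
        rId = tag_sheet.get('{' + ns['r'] + '}id')
        order = int(re.sub('[^0-9]', '', rId))
        sheets[name] = {'ws': name, 'rId': rId, 'order': order, 'fn_ws': wbrels[rId]}
    return sheets


def make_demo_workbook(n_sheets):
    """Build an in-memory zip with just workbook.xml and its .rels part."""
    sheet_tags = ''.join(
        '<sheet name="S%d" sheetId="%d" r:id="rId%d"/>' % (i, i, i)
        for i in range(1, n_sheets + 1))
    rel_tags = ''.join(
        '<Relationship Id="rId%d" Type="%s/worksheet" Target="worksheets/sheet%d.xml"/>'
        % (i, REL_NS, i)
        for i in range(1, n_sheets + 1))
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zf:
        zf.writestr('xl/workbook.xml',
                    '<workbook xmlns="%s" xmlns:r="%s"><sheets>%s</sheets></workbook>'
                    % (MAIN_NS, REL_NS, sheet_tags))
        zf.writestr('xl/_rels/workbook.xml.rels',
                    '<Relationships xmlns="%s">%s</Relationships>' % (PKG_REL_NS, rel_tags))
    return buf


if __name__ == '__main__':
    with zipfile.ZipFile(make_demo_workbook(5)) as zf:
        info = read_workbook(zf)
    print(len(info), 'sheets read;', calls['rels'], 'parse of workbook.xml.rels')
```

With the call left inside the loop, the counter would grow by one per sheet, so the saving scales with the number of sheets in the workbook.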

PydPiper commented 2 years ago

@newhying thank you for submitting this. I agree that moving the readxl_get_workbookxmlrels call outside the loop will be faster. It is possible that external tools write each sheet differently, so I can make this update, but it may get reverted if others run into issues because of it.