dfinke / ImportExcel

PowerShell module to import/export Excel spreadsheets, without Excel
https://www.powershellgallery.com/packages/ImportExcel/
Apache License 2.0
2.45k stars 393 forks source link

Refactor Question - Improving ImportExcel module import time #1531

Open joshooaj opened 11 months ago

joshooaj commented 11 months ago

Hi @dfinke,

Are you open to a PR to change the way the module is "built" for testing and publishing to reduce the amount of time it takes to import the module?

On my Windows 11 laptop it takes ~7.5 seconds to import the module in PowerShell 5.1 as published to PSGallery. The bulk of that time is spent on dot-sourcing the .PS1 files in the Private, Public, Charting, InferData, and Pivot subfolders. The next longest import step is importing plot.ps1 which only takes ~98 milliseconds.

In a quick and dirty test where I take all the contents of those subfolders and embed them in the .PSM1 file, the import time drops to only 362 milliseconds. The issue is less noticeable in PowerShell 7 where the current import time is only 1.1 seconds, but "compiling" all the functions into the .PSM1 file still reduces the import time by 20x, down to 53 milliseconds.

I haven't started yet but the idea would be to keep the current project structure, and change the build/CI process so that the version of the module being tested and published is "compiled" by inserting as much as possible into the PSM1 file beforehand.

I use this pattern another module where there are a total of 197 files in the Classes, Public, and Private folders with each function residing in its own file. The published version of the PSM1 file has over 15k lines and imports in ~3.6 seconds.

dfinke commented 11 months ago

Thanks for the question. Have thought about that for a while. I'd be open to it.

I'm running Win 11 both version of PS import using -force in ~2 sec.

I have not thought through the scenarios where bugs could arise or debugging efforts.

joshooaj commented 11 months ago

Okay, I'll get started! What's your current CI? I see references to Azure DevOps, AppVeyor, and GitHub Actions and I'm not sure if DevOps/AppVeyor are still being used or if those references are stale.

dfinke commented 11 months ago

Thanks, really appreaciate it! GitHub Actions is preferred. Az is still enabled.

Note: I'm not sure how I can get this into enough folks hands or forewarn users of this sizable and potential impact. Also, I an not confident of the test coverage to uncover issues.

Additionally. I have other commitments. I may not get to review this for some time.

Given these risks, it could be months before I decide against it or merge the PR.

joshooaj commented 11 months ago

Thanks for the notice - I can understand that. If you end up merging any of it, maybe it marks a major version increment to signal that there might be breaking changes (even though the goal is that there aren't).

Looking at the repo a bit closer, there are some other structural changes I think could improve the developer experience, including the addition of a pwsh devcontainer. I don't think any changes I have in mind would make a difference to module users other than to improve the import time, but I know any change like this can have unintended side effects.

joshooaj commented 11 months ago

I may have gotten a bit carried away... however, the tests pass on all three OS's and the import times are significantly faster when importing the "compiled" copy. I won't send a PR until we've had a chance to chat about all the changes, and to make sure that the CI pipelines and dev experience are 100%, and I understand that may not happen for a while - no worries. I may take a look at some of the open issues or submit a couple bug fixes separately.

Here's a rundown of the changes which you can view at joshooaj/ImportExcel. One of the neat additions is the devcontainer so that you can launch a codespace on GitHub or build in a devcontainer on your local machine if you have docker installed. You should try running the command mkdocs serve from a codespace or devcontainer - it'll start serving a dev copy of the ImportExcel docs site which is (for now) published to GitHub Pages under my repo.

dfinke commented 11 months ago

Sweet! If you're up for it. I can schedule a zoom chat and you can level set me so I can let it percolate.

joshooaj commented 11 months ago

Sounds good, I'm free today and pretty flexible any other day so feel free to shoot an invite over to joshooaj at gmail any time 😎