beeware / briefcase

Tools to support converting a Python project into a standalone native application.
https://briefcase.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Maximum number of files allowed reached #646

Open MosGeo opened 2 years ago

MosGeo commented 2 years ago

Describe the bug

The following error shows up:

Linking application installer...
light.exe : error LGHT0306 : An error (E_FAIL) was returned while finalizing a CAB file. This most commonly happens when creating a CAB file with more than 65535 files in it. Either reduce the number of files in your installation package or split your installation package's files into more than one CAB file using the Media element.

To Reproduce

Try to package an application with large dependencies that contain a lot of files (more than 65535).
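A quick way to check whether a project is near this limit is to count the files that would go into the installer. The sketch below is only an illustration: the helper names are hypothetical, and the path to scan depends on your own project layout.

```python
import os

# Per-cabinet file limit quoted in the LGHT0306 error message.
CAB_FILE_LIMIT = 65535


def count_files(root):
    """Count the files under root, recursing into subdirectories."""
    return sum(len(files) for _, _, files in os.walk(root))


def exceeds_cab_limit(root, limit=CAB_FILE_LIMIT):
    """Return True if root holds more files than a single cabinet allows."""
    return count_files(root) > limit
```

For example, `exceeds_cab_limit("path/to/app_packages")` reports whether that folder alone would overflow a single cabinet.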

Expected behavior

Briefcase would package the application and split the cab files automatically.

Environment:

freakboy3742 commented 2 years ago

Thanks for the report! Out of interest - what are you doing that has produced 65k files in a project? If this is something that is easy to accidentally trigger with a simple project, then this will require a more urgent response. If it's an edge case caused by one particular project that just happens to have a lot of files, it still needs a fix, but it's not as urgent.

MosGeo commented 2 years ago

@freakboy3742 thanks for the reply.

I am bundling a customized version of napari, which also uses briefcase. Because of various restrictions, and the fact that my users do not code or deal with wheel files, I need a simple executable that includes all the packages required for analysis. As you can imagine, I am using some large packages (computer vision problems involving image processing and ML).

Is this something that is accidentally triggered? No, so technically it is not an urgent issue for briefcase (shoots self in the foot🤣).

Still, I hope somebody who is familiar with briefcase grabs it when they have time 😊.

freakboy3742 commented 2 years ago

Thanks for the extra detail - that's very helpful.

One approach that might be worth considering in your case is zipping the modules being used. It is possible for Python to load modules from inside a zip file, rather than requiring a full directory structure. The standard library is sometimes packaged in this way (as a single python3X.zip, rather than a whole directory of modules). From the sound of it, this might work in your case. IIRC it should be as simple as adding the zip file to your PYTHONPATH... but I might be forgetting some details there.

MosGeo commented 2 years ago

I'll add these here for future reference:

WIX 4 added a way to split the containers: https://github.com/wixtoolset/issues/issues/6144

It seems that there is a plan for automatic splitting in WiX (https://github.com/wixtoolset/issues/issues/6521), so briefcase would get it by default (as I understand it, briefcase uses WiX).

Also, a relevant Stack Overflow question: https://stackoverflow.com/questions/55295201/wix-toolset-bundle-with-total-content-size-2gb

MosGeo commented 2 years ago

So here is an update on this issue. I have been experimenting with a number of methods to overcome it. Here are the details:

Using ZipImport

The idea here is that Python can import from zipped packages, as @freakboy3742 suggested. This looks like an attractive solution as it is simple and easy to implement. Two things are required: 1) add the zipped package to the Python path explicitly, 2) import normally using import (not zipimport). You can write a quick utility to go through the app_packages folder and add any of the packages that are zipped using the code below.

Sounds simple enough. Unfortunately, some packages access files in their own folder directly for one reason or another, and they cannot do that if the package is zipped. Importing Python files works, but access to other files does not.

from pathlib import Path
import sys
import napari

# Update site_packages folder to fit your structure
site_packages_path = Path(__file__).parent.parent
zip_packages = site_packages_path.glob("*.zip")
for z in zip_packages:
    sys.path.append(str(z))
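To make the limitation described above concrete, here is a small self-contained sketch. The package name zipdemo and its data file are made up for the demonstration. It shows that Python code imports fine from a zip, that a path built from `__file__` does not exist on disk, and that access through the loader only helps packages that were written to use it.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Create demo.zip holding a hypothetical package 'zipdemo' with a data file."""
    zip_path = os.path.join(directory, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("zipdemo/__init__.py", "VALUE = 42\n")
        zf.writestr("zipdemo/data.txt", "hello")
    return zip_path


workdir = tempfile.mkdtemp()
sys.path.append(build_demo_zip(workdir))
importlib.invalidate_caches()

import zipdemo  # resolved by Python's built-in zip import support

# Python code imports normally from the archive.
print(zipdemo.VALUE)  # 42

# A path built from __file__ points inside the archive, so it is not a real file.
data_path = os.path.join(os.path.dirname(zipdemo.__file__), "data.txt")
print(os.path.exists(data_path))  # False

# Reading through the package loader works, but the package has to opt in to this.
print(pkgutil.get_data("zipdemo", "data.txt"))  # b'hello'
```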

Multiple cabinets

It seems that WiX 3.6 onward has an element called MediaTemplate. The Briefcase template currently uses the Media element. MediaTemplate allows multiple cabinets to be created automatically based on file size. So by changing one line in the main WiX file from

<Media Id="1" Cabinet="product.cab" EmbedCab="yes"/>

to

<MediaTemplate MaximumUncompressedMediaSize="512" EmbedCab="yes"/>

you can split the cabinet into multiple cabinets. You can verify this by setting EmbedCab to no for a large package. More information on MediaTemplate can be found here (https://wixtoolset.org/documentation/manual/v3/xsd/wix/mediatemplate.html). This is all good, and now the installer will be created. However, the installer is very slow to start (https://stackoverflow.com/questions/52520739/wix-msi-with-large-number-of-files-takes-before-welcome-dialog-is-published). In my test, it got stuck at the CostFinalize standard action, so this solution is a no-go too.
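If you want to try the one-line change on an already generated WiX source file without editing it by hand, a small text substitution is enough. This is only a sketch: the function name is hypothetical, it is not part of Briefcase, and it assumes the file contains a single self-closing Media element like the one quoted above.

```python
import re

# Matches a self-closing <Media .../> element but not <MediaTemplate .../>.
MEDIA_ELEMENT = re.compile(r"<Media\s[^>]*/>")


def use_media_template(wxs_source, max_size_mb=512):
    """Swap the first <Media/> element for a <MediaTemplate/> element."""
    replacement = (
        f'<MediaTemplate MaximumUncompressedMediaSize="{max_size_mb}" '
        'EmbedCab="yes"/>'
    )
    return MEDIA_ELEMENT.sub(replacement, wxs_source, count=1)
```

Read the .wxs file as text, pass it through `use_media_template`, and write the result back before running the WiX tools.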

Zip and Custom Actions

The basic idea here is to package the app_packages as a zip file and extract after the installation. You can have the installer extract the zip automatically by inserting a custom action in the wix file. Now, I can delete the contents of the original app_packages and it will be filled automatically by the installer.

<CustomAction Id="ExtractAppPackagesZip"
      Directory="helloworld_ROOTDIR"
      Impersonate='yes'
      Execute="deferred"
      ExeCommand="tar -xf app_packages.zip"
      Return="check"/>

<InstallExecuteSequence>
    <RemoveExistingProducts After="InstallValidate"/>
    <Custom Action="ExtractAppPackagesZip" Before="InstallFinalize">NOT REMOVE</Custom>
</InstallExecuteSequence>

This seems like an ok solution for the time being. I was not able to delete the zip file from within the installer, so my application checks for it at startup and deletes it to save space.
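The packaging side of this approach (zip app_packages, then empty the folder before the installer is built) could be scripted roughly as follows. The function name is hypothetical, and the sketch assumes that app_packages.zip should sit next to the app_packages folder so that extracting it in that directory recreates the folder.

```python
import shutil
from pathlib import Path


def archive_app_packages(app_dir):
    """Zip app_dir/app_packages into app_dir/app_packages.zip, then empty the folder."""
    app_dir = Path(app_dir)
    packages = app_dir / "app_packages"
    # make_archive appends ".zip" itself; entries are stored relative to app_dir,
    # so every path inside the archive starts with "app_packages/".
    archive = shutil.make_archive(
        str(app_dir / "app_packages"),
        "zip",
        root_dir=str(app_dir),
        base_dir="app_packages",
    )
    # Leave an empty folder behind so the installer no longer sees the files.
    shutil.rmtree(packages)
    packages.mkdir()
    return Path(archive)
```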

Zip and startup script

Similar to the one above, but the extraction happens on the Python side. If the software doesn't find the app_packages folder at startup, it creates it from the zip.
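A minimal sketch of such a startup check is below. The function name and the remove_archive option are hypothetical. It assumes the archive is named app_packages.zip, sits next to where app_packages should be, stores its entries under an app_packages/ prefix, and that the install directory is writable by the user running the app.

```python
import zipfile
from pathlib import Path


def ensure_app_packages(app_dir, remove_archive=False):
    """Extract app_packages.zip into app_dir if app_packages is missing or empty.

    Returns True if an extraction happened, False if the folder was already there.
    """
    app_dir = Path(app_dir)
    packages = app_dir / "app_packages"
    archive = app_dir / "app_packages.zip"
    if packages.is_dir() and any(packages.iterdir()):
        return False  # already unpacked
    if not archive.is_file():
        raise FileNotFoundError(f"neither {packages} nor {archive} is available")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(app_dir)
    if remove_archive:
        archive.unlink()  # free the disk space once the files are extracted
    return True
```

This has to run before anything is imported from app_packages, so it belongs at the very top of the application's entry point.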

Final remarks

I suggest using MediaTemplate instead of Media, as it will not affect smaller applications but will help some larger ones.

freakboy3742 commented 2 years ago

Thanks for that analysis - very thorough and helpful.

I broadly agree that using MediaTemplate seems a reasonable approach that uses the tools that WiX provides to resolve what is ultimately a WiX packaging issue.

The "slow to start" issue is a little concerning - how slow are we talking? Is this something that only affects large projects with lots of files (in which case, a slow start is somewhat to be expected); or will a "hello world" project also suffer a slowdown?

The last comment about getting stuck on CostFinalize is also concerning... can you elaborate on this? Why is MediaTemplate your suggested approach if, by my reading, it didn't fix your problem?

MosGeo commented 2 years ago

@freakboy3742

Media vs MediaTemplate

There are two separate issues here that Media won't be able to handle:

  1. Installer larger than 2GB.
  2. Installer with more than ~65,000 files.

Of course, in some instances, you can have both. MediaTemplate handles the first one automatically. For example, imagine an installer with 30 files of 100MB each: it wouldn't fit into one cabinet (limit of 2GB), and the following would split it into multiple cabinets automatically. Splitting also means that you might not hit the limit on the number of files either.

<MediaTemplate MaximumUncompressedMediaSize="1024" EmbedCab="yes"/>

For smaller installers (less than 1024 MB), it will be one cabinet and it wouldn't be any different than the original Media. That is why I am recommending it.
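As a rough check of the arithmetic, the number of cabinets can be estimated from the total uncompressed size. This is only an estimate under the assumption that each cabinet holds at most MaximumUncompressedMediaSize megabytes; how WiX actually assigns files to cabinets may differ, and the function name is made up for the illustration.

```python
import math


def min_cabinets(file_sizes_mb, max_uncompressed_mb):
    """Estimate the fewest cabinets needed for the given uncompressed sizes."""
    total = sum(file_sizes_mb)
    return max(1, math.ceil(total / max_uncompressed_mb))


# The 30 files of 100 MB from the example, against a 1024 MB limit.
print(min_cabinets([100] * 30, 1024))

# A small installer stays in a single cabinet.
print(min_cabinets([100] * 5, 1024))
```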

Slow To Start

This is only an issue with a large number of files, so it is an edge case. It is not about using Media vs MediaTemplate. For me (80,000 files), the installer wouldn't even continue to the dialogs (it only showed the small startup preparation box). I don't think briefcase should worry about this for the time being. It is technically a WiX issue and there is not much briefcase can do (except automatically creating custom actions and zipping things).

freakboy3742 commented 2 years ago

@MosGeo Ok - so it sounds like MediaTemplate is a partial fix; a definite improvement addressing the >65k files problem, but still leaving >2GB installer problem.

On that basis, I'd be happy to add the MediaTemplate change to the app and VisualStudio templates. Fancy trying your hand at some PRs?

MosGeo commented 2 years ago

@freakboy3742 It is actually fixing both problems as now you can package things that are bigger than 2GB and more than 65k files. It is however unmasking the next problem with the >65k files which is the slowness (getting stuck).

I will check out the app and VisualStudio templates and see if I can whip something up. This is my first time delving into the world of installers (specifically WiX: candle and light), so this should be a good small exercise :)

Hopefully, it won't be another year before I post an update :)

P.S. My custom action zip solution is still missing one thing: removing the folder on uninstall. I still have to implement that.

MosGeo commented 1 year ago

Some updates on this: if multiple cabinets are used as I suggested (using MediaTemplate), you cannot embed them in the MSI file, and they would have to sit outside it. I suggest waiting until briefcase incorporates WiX 4 (#1185), as it will allow embedding multiple cabinets in the MSI file. This will ensure the current behaviour (a single-file MSI) is preserved.