AVSLab / basilisk

Astrodynamics simulation framework
https://hanspeterschaub.info/basilisk
ISC License
144 stars 61 forks source link

remove need for `supportData` folder in BSK wheels #814

Open schaubh opened 1 month ago

schaubh commented 1 month ago

Describe your use case Right now building a wheel includes the large Spice files in supportData folder. These are really only needed when running some scenario scripts. Instead, I propose that this folder be moved to inside the examples folder. This should reduce greatly the BSK wheel size, making it easier to host on PyPi among other things.

If users want to run the examples scripts and download this folder, they would get the required files except for the large Spice files. These would have to be manually downloaded if the user is not building BSK using cmake which would download them. I think this is manageable with good documentation and warning messages if the file is not found and can't be loaded.

Describe alternatives solutions you've considered I can't think of any alternatives right now, open to ideas.

Additional context The support data is not explicitly required by any module to compile. What data to load can be set by the user and we are providing handy defaults options. I'll have to play with a test branch to see what unexpected challenges arise from this approach.

schaubh commented 1 month ago

Howdy @dpad , I have been able to expand the CI test actions to now cover a range of python and Linux versions, as well as Window with opNav, and macOS with opNav and one action on macOS without vizInterface. The macOS with opNav script also test builds the documentation. My next step is to try to remove the need to have the dataSupport move to the examples folder and not be included in the BSK build per say. I'll see if I have time to play with this idea on a branch this weekend. Do you see any issues with your build method with this approach. Or, is there an impact on upgrading our use of conan from 1.0 to 2.0? In your other branch you already had 2.0 functionality working.

dpad commented 1 month ago

@schaubh Sorry about the delayed response, I am currently traveling.

I think there's no issue in upgrading conan to 2.0, as you say I had it working on the other branch. I think there was only some minor issues in versions and options for the dependencies that we specify in the conanfile, but I just had to pin those to an appropriate working version.

Regarding supportData, yes, if we only need the data for examples, then they don't need to be included in the built wheel. I don't think there would be any issues with the build system, we would just need to change which files get included in the appropriate pyproject.toml settings. The issues would only be during usage I think (e.g. what happens if a file that Basilisk expects is missing, should it give an error at run-time or initially during configuration of the simulation, should we provide a method to automatically download the data, and if so from where, and what happens if there are networking issues, etc.)

Regarding wheel size, one thing I noticed is that there's a lot of duplication in the compiled module files (I think because of the way we essentially copy-paste the messaging library code instead of linking to a shared library, for example see the auto-generated .cxx files for messages). I mentioned before that we should use cibuildwheel to create wheels compatible across lots of different systems at once. One of the things this does is to run auditwheel repair on the wheel file to check and fix up the compatibility of the wheel. I realised that you can run auditwheel repair --strip on the wheels to remove a bunch of unused symbols from the compiled module files (the .so files) -- when I was testing this it reduced the total wheel size to less than half.

schaubh commented 1 month ago

Thanks for the info. I'll looking into the SupportData ideas over the next weeks, and good to know about the audit wheel repair --strip suggestion. I'll try that. That might get us closer already to our target of having a BSK wheel that is less than 100Mb if possible. I'm on travel a lot over the next 3 weeks, so my productivity will be a little slower ;-)

schaubh commented 5 hours ago

@dpad , my test branch feature/move_support_data has moved the supportData folder to examples/supportData. The test and scenario files are updated to load data files from this folder. When I build a wheel for macOS, the size has shrunk from 219Mb to 62.7Mb. As this is less than 100Mb PyPi limit, this now enables us to start looking at having builds uploaded.

I test this by:

  1. copying the wheel file to a new folder called basilisk
  2. created a new virtual environment with venv
  3. did a pip install of this wheel
  4. copied over the examples folder and the scenarios still ran fine.

Note, for the scenario scripts to find this examples/supportData folder it does assume the parent folder is called basilisk. If the user wants to load data from another folder they would need to create to set module data path to their own data folder.

@dpad and @sassy-asjp , thanks for your thoughts on this solution. I know moving this data folder will break scripts, but I plan to write up this issue in "known issues" document with clear guidance on how to correct this. Having wheels now by 69Mb is a huge benefit I think for the distribution of Basilisk?

schaubh commented 5 hours ago

Mm, looking at this now, I wonder if I made this harder than it should be. I could leave supportData where it is in the root basilisk folder, remove it from being included in wheels, and all current BSK scripts would still run. If someone installs BSK wheel (without support Data) in a new installation, they would have to pull the supportData folder from the repo anyway, including custom downloading the de430.bsp spice file that we have cmake download when building BSK.