fatiando / community

Community resources, guidelines, meeting notes, authorship policy, maintenance, etc.

Reduce the size of our built packages #154

Open leouieda opened 3 months ago

leouieda commented 3 months ago

Description:

At the moment, we put a lot of stuff in the packages we upload to PyPI: code, tests, test data, doc sources, etc. An installed package only needs the code itself. We used to package the tests so that we could run them against the installed package, but we never do that in practice. As a result, our packages are larger than they need to be and waste bandwidth.

In the spirit of frugal computing, I'd like to propose:

  1. Only build packages that have the actual code and the LICENSE.txt file. The supporting .md files don't need to be there since nobody looks at them in packages anyway.
  2. Move the tests folder out of the package and into a top-level tests folder. This will require editing our Makefile but it shouldn't be a big deal.
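
One way to check progress on both points is to list what a built wheel actually contains. The following is a minimal sketch, not part of any Fatiando tooling: the function names, the `mypkg` demo package, and the rule set for what counts as "unwanted" are all assumptions made for illustration. It relies only on the fact that a wheel is a zip archive.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical rule set for this sketch: entries not needed at install time.
UNWANTED_DIRS = {"tests", "test", "doc", "docs"}
UNWANTED_SUFFIXES = {".md"}


def unwanted_entries(archive):
    """Return (name, size) pairs for entries that are tests, docs, or .md files."""
    found = []
    with zipfile.ZipFile(archive) as wheel:
        for info in wheel.infolist():
            path = PurePosixPath(info.filename)
            # Only look at parent directories, not at the file name itself.
            in_unwanted_dir = any(part in UNWANTED_DIRS for part in path.parts[:-1])
            if in_unwanted_dir or path.suffix in UNWANTED_SUFFIXES:
                found.append((info.filename, info.file_size))
    return found


def build_demo_wheel():
    """Create a tiny in-memory zip that mimics a wheel with tests bundled inside."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("mypkg/__init__.py", "")
        wheel.writestr("mypkg/core.py", "x = 1\n")
        wheel.writestr("mypkg/tests/test_core.py", "def test_x(): pass\n")
        wheel.writestr("mypkg/tests/data/sample.txt", "0" * 1000)
        wheel.writestr("mypkg-0.1.dist-info/LICENSE.txt", "license text")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, size in unwanted_entries(build_demo_wheel()):
        print(f"{size:>8} bytes  {name}")
```

To inspect a real build, pass the path of a `.whl` file from the `dist/` folder to `unwanted_entries` instead of the demo archive. An empty result would suggest the package only ships code and the license.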

Apply to:

Need to update the contributing guide in:

Further instructions:

We want your help!

We know that maintenance tasks are very demanding, so we don't expect a single person to tackle this issue by themselves. Any help is very welcome, so please comment below that you want to take care of the changes on any repository and we will assign it to you.

leouieda commented 3 months ago

Triggered by https://github.com/fatiando/pooch/pull/423 and https://github.com/fatiando/pooch/issues/416

penguinpee commented 3 months ago

I think that's a good idea. To make life easier for yourselves when it comes to automatic discovery, I'd suggest switching to a flat layout. With the current layout and automatic discovery it's actually quite hard to exclude the tests.

With a flat layout the doc(s)/ and test(s)/ directories would be excluded by default.

penguinpee commented 3 months ago

On second thought, and after having had a quick shot at the flat layout with pooch, the src layout might be more suitable and less error-prone if you want to make use of automatic discovery.

Of course, that's not a requirement and being explicit regarding what to include and what not is also an option. For pooch the data/ and paper/ directories prevent use of automatic discovery with a flat layout. Of course those directories could be moved into a directory that is on the DEFAULT_EXCLUDE list.
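
To illustrate why extra top-level directories get in the way, here is a simplified imitation of a flat-layout scan. It is a sketch under stated assumptions, not the setuptools implementation: `ASSUMED_EXCLUDE` is a short illustrative subset rather than the real DEFAULT_EXCLUDE list, `flat_layout_candidates` is a hypothetical helper, and the directory tree is made up for the demo.

```python
import tempfile
from pathlib import Path

# Illustrative subset only; the real list lives in setuptools and is longer.
ASSUMED_EXCLUDE = {"doc", "docs", "test", "tests", "build", "dist"}


def flat_layout_candidates(project_root):
    """List top-level directories that this simplified scan would treat as packages."""
    root = Path(project_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir()
        and not entry.name.startswith((".", "_"))
        and entry.name not in ASSUMED_EXCLUDE
    )


def make_tree(root, names):
    """Create empty top-level directories to mimic a project checkout."""
    for name in names:
        (Path(root) / name).mkdir()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: one real package plus supporting directories.
        make_tree(tmp, ["pooch", "doc", "tests", "data", "paper"])
        found = flat_layout_candidates(tmp)
        print(found)  # ['data', 'paper', 'pooch']
        if len(found) > 1:
            print("More than one candidate: automatic discovery would be ambiguous.")
```

In this sketch, `doc` and `tests` drop out because they are on the exclude list, while `data` and `paper` remain next to the real package. With a src layout the scan would start inside `src/`, so directories at the repository root would not be considered at all.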

I'd be willing to open a PR (or amend my current PR) for pooch. However, I feel the decision on which layout is preferred should come from the maintainers.