banillie / analysis_engine

Place for all code used to compile the quarterly PfM Report, manage GMPP data, as well as other useful data searching/analysis functions.
MIT License

Preparing for packaging #13

Closed yulqen closed 3 years ago

yulqen commented 3 years ago

I'm opening an issue to discuss/list what needs to be done to be able to create the package for PyPI.

The de facto guidance for Python packaging is, funnily enough, Packaging Python Projects.

Goal

The primary goal is for the user to be able to download and install the package from PyPI and interact with your code on the command line, for example:

pip install analysis-engine

and then:

engine vfm -s or ae vfm -s

(You choose what to use as the 'program' word here - but it should be one word and short).
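To make the goal concrete, here is a minimal sketch of what the command-line entry point could look like. The module path analysis_engine/main.py, the function names, and the meaning of -s are all assumptions for illustration; the thread does not say what -s does, so it is treated here as a plain on/off flag.

```python
# Hypothetical analysis_engine/main.py: file name and function names are assumptions.
import argparse


def build_parser():
    """Build a parser for a one-word program with a 'vfm' subcommand."""
    parser = argparse.ArgumentParser(prog="engine")
    subparsers = parser.add_subparsers(dest="command", required=True)
    vfm = subparsers.add_parser("vfm", help="placeholder for the vfm command")
    # What -s actually does is not stated in this thread; a boolean flag is assumed.
    vfm.add_argument("-s", action="store_true", dest="s_flag")
    return parser


def main(argv=None):
    """Parse the arguments and report them; real analysis code would go here."""
    args = build_parser().parse_args(argv)
    print(f"command={args.command} s_flag={args.s_flag}")
    return 0  # console scripts pass the return value to sys.exit()


# In setup.py, setuptools can expose main() as the `engine` command with:
# entry_points={"console_scripts": ["engine = analysis_engine.main:main"]}
```

Once the package is installed with pip, running engine vfm -s would call main() with those arguments. Swapping "engine" for "ae" in the entry point line is all it takes to change the program word.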

Considerations

Python packaging is, and always has been, a bit of a dog's breakfast. It's a lot better than it used to be (apparently) but there are still a lot of moving parts and it's complicated. Our goal here is to make it as simple as possible whilst taking the opportunity to learn some fundamentals that will be useful in future.

I'm learning here - I've not done this very often.

Python packaging has changed a lot in recent times and is changing rapidly right now. There are new tools to use and new conventions. All of which is good stuff, but in the interest of keeping it simple, I think we should try to use a fairly traditional approach, which basically follows the guidance at the Packaging Python Projects site linked above.

We basically only need four tools to do the job: setuptools, wheel, pip and twine. setuptools comes bundled automatically in every Python virtual environment (if you do pip list when you create a new virtualenv, there it is) and does most of the work. twine is used to upload the necessary files to PyPI, and pip is for installing, which you know all about. wheel is pip-installable and used to make wheels! Don't worry about it for now - a wheel is a distribution file type for Python.

Immediate things to do

Before you start, you need to do some basics. One of which is going to be a bit of a pain, but with refactoring in Pycharm, it shouldn't be too bad.

  1. You need a directory structure that matches the one set out in the example at the top of https://packaging.python.org/tutorials/packaging-projects/. Basically, all your executable code needs to live inside one package at the root of your project. That means creating a new directory, called "analysis_engine" (with an underscore rather than a hyphen, so that it can be imported), with an __init__.py file inside it, and then moving all your current package folders inside that.

Look at https://github.com/yulqen/bcompiler-engine. At the root of that project, there are all the auxiliary files, such as .gitignore, LICENSE.md, etc, and a single directory called engine (ignore scripts - they're auxiliary files too). That engine directory is the package that contains all the other subpackages needed to do the work. And that is why if you pip install bcompiler-engine in YOUR project, you would have to write from engine import X to import code. engine is the primary package that you get access to when you install in your virtualenv - everything else sits below.

NB: I should have named the engine package bcompiler-engine, but I didn't. It should be the same name as your project on PyPI to prevent clashes.

I'm not 100% sure if this is necessary for a project that intends to be run primarily as a CLI (such as analysis_engine), but it's definitely what you should do if you want to be able to import from your package. It's good practice and it probably needs to be done, so let's just do it.

  2. Keep tests OUTSIDE that new analysis_engine package.
  3. Make sure all your tests pass and your code runs when you have done all that refactoring.
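The layout described in the steps above can be sketched as follows. This is only an illustration: the subpackage names "dashboards" and "reports" and the test file name are hypothetical stand-ins, not the real contents of the repo. The script builds the layout in a temporary directory and confirms that only one importable package sits at the project root, with tests kept outside it.

```python
import tempfile
from pathlib import Path

# Hypothetical layout; "dashboards" and "reports" stand in for the real subpackages.
LAYOUT = [
    "setup.py",
    "LICENSE",
    "analysis_engine/__init__.py",
    "analysis_engine/dashboards/__init__.py",
    "analysis_engine/reports/__init__.py",
    "tests/test_reports.py",
]


def create_layout(root: Path) -> None:
    """Create an empty file for every path in LAYOUT under root."""
    for relative in LAYOUT:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


def top_level_packages(root: Path) -> list:
    """Return names of directories directly under root that contain __init__.py."""
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "__init__.py").exists()
    )


with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    create_layout(project_root)
    packages = top_level_packages(project_root)
    print(packages)  # prints ['analysis_engine']
```

Because tests has no __init__.py and lives beside analysis_engine rather than inside it, it is not picked up as part of the package.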

We can recommence after that!.... :-)

banillie commented 3 years ago

Excellente Matt. Will get on it.

I’ve been refactoring the code and files into a much better, more structured format. There are loads of old files in there which need to be culled, and I was going to do that fairly soon anyway. So this has come at a good time.

Think I’ll create another git branch for this work. Good idea?

banillie commented 3 years ago

@yulqen I think I have completed all the immediate actions. Check out the packaging branch of this repo. That branch is running on my computer and all the tests are passing.

Are you free to discuss sometime this week? We could also use that time to discuss the threads re the latest version of datamaps. Cheers.

yulqen commented 3 years ago

What's the thinking with the other package (which is a terrible name for a package, Will!)? Are you intending to only include the analysis_engine package in the PyPI distributable? If so, what's all the code in other for - is that just test stuff?

banillie commented 3 years ago

other is a load of old code that I need to refactor, but I haven't deleted it just yet as it might be useful in the next month or so, when I get around to the refactoring. But it will go eventually. Yes, only analysis_engine is to be included in the package. I can change the structure though if it makes more sense to do it another way.

yulqen commented 3 years ago

Remind me to discuss next steps when we converse.

banillie commented 3 years ago

@yulqen closing this one as well, as also completed! happy days.