Pebaz / nimporter

Compile Nim Extensions for Python On Import!
MIT License

[Question] How to dockerize a nimporter application? #15

Closed cgarciae closed 4 years ago

cgarciae commented 4 years ago

When deploying I'd like the nimporter to:

Is this possible? I am having difficulty understanding how to "freeze" nimporter for production.

Pebaz commented 4 years ago

Hello, great question!

It's definitely possible! The solution outlined below checks all your boxes:

For this use case, you must use Nim code (via Nimporter) like any other Python dependency.

Basically, Nimporter was created with a strong focus on making libraries that use Nim, not applications. To fit your use case, just pull the Python+Nim code into a separate library that you can then distribute using these instructions along with your original application. The app can then rely on the Python+Nim library as a dependency the same way it relies on Flask or Django, for example.

AlexisGoodfellow commented 4 years ago

"Basically, Nimporter was created with a strong focus on making libraries that use Nim, not applications."

This may be very much related to the post I made not too long ago. The way I understood the design philosophy of nimporter was essentially "Use Nim for performance only where necessary, otherwise use Python for its broad ecosystem". Given that, I interpreted the suggested workflow to be "Write it in Python, profile it, then rewrite it in Nim only when you have to". That's how I expected to use Nim in my otherwise-Python codebase - just to handle those crucial code segments that have to be performant.

I also anticipated that those important bits of functionality would generally be app-specific. Because of the lack of foreseeable reuse of the app-specific code, I don't really want to pull out my nim extensions into another repo or library. Doing that would involve introducing unnecessary friction in the form of making library releases, ensuring correct versioning, syncing multiple repos, etc.

At the end of the day, I just want my compiled-in-the-background Nim code and my interpreted Python code to play nice together at runtime. From what I now gather, it seems that nimporter was designed with the idea that all the Nim code would be packaged together with the Python code into a single library, which then would be distributed together through a package index like PyPI.

@Pebaz Can I ask why you made the decision to focus primarily on supporting library authors?

cgarciae commented 4 years ago

@Pebaz Separating the Nim code into its own library just for this is not very ergonomic. I understand why this is the currently supported use case, but since you don't need to create a package for your own deployment, it feels like overkill.

I think it would be extremely nice if nimporter had a NIMPORTER_ENV flag where, if set to prod, nimporter would create the Python modules on import using *.so files stored in, e.g., a nimporter_dist folder. The creation of this folder could be handled by the CLI. I am guessing that around 80% or more of the work needed to get this working is already there.

Pebaz commented 4 years ago

@cgarciae @AlexisGoodfellow

Thank you very much for these great questions!

I believe I have not understood the root issue here. Nimporter definitely supports both apps and libs.

The way it does this is through the use of a setup.py for both. The setup.py is the distribution mechanism provided by Python that works for both apps and libs seamlessly.

I chose this over a custom solution because there is already so much infrastructure to support it (PyPI, wheels, etc.).

To address the root question: "How do I use Nimporter without putting my Nim code in a lib dependency?"

Nimporter fully supports this!

In order to do this, your application must be organized around the setup.py standard. What I mean by this is that the "entry_points" keyword argument to the setup() function allows you to create applications that you can install onto your local installation of Python the same way you would install a library.

This is a key element here.

This means that your codebase that contains Nim and Python code can be packaged into a binary (recommended) and source (includes Nim files, must have Nim installed to use) distribution using python setup.py [sdist | bdist_wheel].

The resulting .zip or .whl does not have to be uploaded to PyPI. You can store this on any artifact hosting solution.
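A minimal sketch of the entry_points arrangement described above, assuming a hypothetical package named myapp whose command-line entry function is myapp.cli:main. Neither name comes from this thread. The Nimporter-specific step that compiles the Nim extensions during packaging is documented in the Nimporter README and is deliberately left out here.

```python
# setup.py sketch. "myapp" and "myapp.cli:main" are hypothetical names.
SETUP_KWARGS = dict(
    name="myapp",
    version="0.1.0",
    packages=["myapp"],
    install_requires=["nimporter"],
    # Installs a `myapp` command that calls main() in myapp/cli.py.
    entry_points={"console_scripts": ["myapp = myapp.cli:main"]},
)


def main() -> None:
    # Imported lazily so the kwargs above can be inspected without setuptools.
    from setuptools import setup

    setup(**SETUP_KWARGS)


# A real setup.py would call main() here at module level.
```

With main() called at module level, python setup.py bdist_wheel would produce a wheel that installs the application and its console command like any other package.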

Specifically for containerization using Docker, there are a couple of options:

  1. Package a binary distribution (wheel) and reference this in your Dockerfile which can then install it first thing.
  2. Run your application once (which will produce the .so files in __pycache__) and then make sure the __pycache__ dirs are available within the Docker container so Nimporter can use these instead of forcing a recompilation.

I would personally prefer option 1 since option 2 is less formal. However, option 2 will definitely work, since the Nim files alongside your Python files will simply be used to look up their .so equivalents in __pycache__.

However, it is possible that a far better solution exists.

I am highly interested in your normal deployment process.

Can you run through how you normally distribute Python applications (not libraries)? I'm assuming that it is different than using the setup.py approach.

I appreciate your interest guys and look forward to hearing your thoughts on this!

cgarciae commented 4 years ago

@Pebaz thanks for taking interest!

The "normal" deployment strategy for python code is:

  1. Copy the code inside the container
  2. Install dependencies in the docker container from the requirements.txt file.
  3. Run the main program with the python command or, in the case of a web application, using gunicorn or uvicorn.

From your options, since you normally don't package + install your own project then option 1 is less attractive. I think option 2 is actually more in the desired direction but it could use better cosmetics. Would it be possible to:

  1. Have nimporter look for files in a different directory other than __pycache__, say nimporter_dist? __pycache__ could take priority over nimporter_dist so development works as expected, when deploying you only copy nimporter_dist so it will use that one.
  2. Have a CLI command that can create the .so files without running the python code? Unifying this with the previous it could output them to nimporter_dist by default.
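The lookup order proposed in point 1 could be sketched roughly as follows. This is only an illustration of the idea, not how Nimporter behaves: find_compiled is a hypothetical helper, nimporter_dist is the folder name suggested above, and matching artifacts by the module name before the first dot is an assumption about file naming.

```python
from pathlib import Path
from typing import Optional, Sequence


def find_compiled(
    package_dir: Path,
    module: str,
    search_dirs: Sequence[str] = ("__pycache__", "nimporter_dist"),
) -> Optional[Path]:
    """Return the first compiled artifact for `module`, honoring directory order."""
    for dirname in search_dirs:
        candidate_dir = package_dir / dirname
        if not candidate_dir.is_dir():
            continue
        for candidate in sorted(candidate_dir.iterdir()):
            # Assumed naming: "<module>.so", "<module>.pyd", or "<module>.<tag>.so".
            stem = candidate.name.split(".")[0]
            if stem == module and candidate.suffix in (".so", ".pyd"):
                return candidate
    return None
```

Because __pycache__ comes first in search_dirs, a development checkout keeps using its own cache, while a container that only ships nimporter_dist falls through to that folder.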

AlexisGoodfellow commented 4 years ago

With respect to point 2, I think you could just run nim c to compile the Nim source to .so files, which you could then move around however you wanted (assuming that point 1 is possible and nimporter can find the shared object files at the new location).
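A rough sketch of what such a manual build step might look like, assuming the Nim compiler is on PATH and the module is written with a Python binding layer such as nimpy. The helper name nim_build_command and the choice of -d:release are illustrative, and whether Nimporter would pick up a file built this way depends on its own cache naming, which this sketch does not try to reproduce.

```python
import sys
from pathlib import Path
from typing import List


def nim_build_command(source: Path, out_dir: Path) -> List[str]:
    """Build the argument list for compiling one Nim file to a shared library."""
    # Python extension modules use .pyd on Windows and .so elsewhere.
    suffix = ".pyd" if sys.platform == "win32" else ".so"
    target = out_dir / (source.stem + suffix)
    return [
        "nim", "c",
        "--app:lib",        # emit a shared library instead of an executable
        "-d:release",       # optimized build (an illustrative choice)
        f"--out:{target}",
        str(source),
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine that has Nim installed.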

AlexisGoodfellow commented 4 years ago

Also, with respect to the question posed by @Pebaz - our deployment strategy is exactly the same, with the additional wrinkle that the deployed docker containers are managed by kubernetes. That shouldn't be a relevant factor here, though.

Pebaz commented 4 years ago

@cgarciae @AlexisGoodfellow

I have taken the time to review the information you guys have sent me and have come up with a solution that could work but I wanted to hear your thoughts on it.

Nimporter was not designed to be used without packaging the application using setup.py.

However, with a bit more effort, I believe that Nimporter could be used within Docker easily.

@cgarciae I believe you have come up with a solution that could work: a command to build all the .so files at once, rather than running the application once or packaging with setup.py.

I do not wish to modify Nimporter to support a nimporter_dist folder, because a primary use case of Nimporter is to be as hidden as possible and cache builds in the __pycache__ directory within each Python package.

However, without a command to build all of the Nim modules/libraries in a project ahead of time, it is difficult to use Nimporter for use cases such as Docker containers.

Here is my proposed solution:

Implement another CLI option that recurses through and builds all artifacts and stores them in the normal __pycache__ dirs which will allow you to easily copy them along with the Python code into the container using a Dockerfile:

$ nimporter compile

This would accomplish the goal of preparing a given project for distribution without relying on a setup.py.

So in summary, the entire process would then be:

  1. Write a Python app that imports Nimporter and a Nim module
  2. Run nimporter compile to build all Nim files into .so files
  3. Ensure that both the Python code directories and the __pycache__ directories are copied to the container when built
  4. Run the Docker container as usual (Fargate, etc.)
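Before step 3, it may help to confirm that every Nim source actually has a build artifact next to it. The following is a sketch under assumptions: missing_artifacts is a hypothetical helper, and it assumes artifacts live in a sibling __pycache__ directory and start with the module name, which may not match Nimporter's exact file naming.

```python
import sys
from pathlib import Path
from typing import List


def missing_artifacts(root: Path) -> List[Path]:
    """List .nim files under `root` with no compiled artifact in a sibling __pycache__."""
    suffix = ".pyd" if sys.platform == "win32" else ".so"
    missing = []
    for nim_file in sorted(root.rglob("*.nim")):
        cache_dir = nim_file.parent / "__pycache__"
        built = False
        if cache_dir.is_dir():
            for artifact in cache_dir.iterdir():
                # Assumed naming: the artifact name begins with "<module>.".
                if artifact.suffix == suffix and artifact.name.split(".")[0] == nim_file.stem:
                    built = True
                    break
        if not built:
            missing.append(nim_file)
    return missing
```

A build script could fail early when missing_artifacts(Path(".")) is non-empty, so that a container never falls back to compiling at import time without Nim installed.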

Note that this method requires that the build machine running nimporter compile will produce binaries that are compatible with the target Docker container (arch, compiler version, etc.).
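One way to check this compatibility is to print the same small fingerprint on the build machine and inside the container and compare the two. This sketch uses only the Python standard library; build_fingerprint is a hypothetical helper, and the fields chosen are a reasonable guess at what matters rather than a complete list.

```python
import platform
import sys
import sysconfig


def build_fingerprint() -> dict:
    """Collect the platform details most likely to affect binary compatibility."""
    return {
        "system": platform.system(),      # e.g. "Linux"
        "machine": platform.machine(),    # e.g. "x86_64" or "aarch64"
        "python": "%d.%d" % sys.version_info[:2],
        # Filename suffix CPython expects for extension modules on this platform.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # C library name and version; may be empty on some systems.
        "libc": "-".join(platform.libc_ver()),
    }


if __name__ == "__main__":
    for key, value in build_fingerprint().items():
        print(f"{key}: {value}")
```

If any field differs between the two environments, running nimporter compile inside a build container that matches the target image is the safer choice.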

What do you guys think?

cgarciae commented 4 years ago

@Pebaz I think this works! The changes overall are minimal and, if well documented, using __pycache__ for production has the advantage that it's the same code and structure for production and development.

Documentation around the nimporter compile command should mention the strategy detailed here for folks wanting to deploy their applications.

AlexisGoodfellow commented 4 years ago

I also think this will work. @Pebaz If you'd like some extra hands, I'm available on Friday to potentially work on a PR to do this!

Pebaz commented 4 years ago

@cgarciae

Ok great! If this will work for everyone I'll go ahead and make an issue for it and close this one.

@AlexisGoodfellow That would be fantastic, I really appreciate it! As far as I remember, most of the functionality you need is in these functions:

It would probably be useful to run nimporter_cli.clean() first since by definition nimporter compile will build everything anew.

Also, I think I'll be able to work on this on Saturday so if you want, I can review a PR or improve documentation or whatever needs to be done on that date.

Thanks a lot guys for taking an interest in Nimporter and I'm glad we were able to come up with a solution that would work!