ctapobep / blog

My personal blog on IT topics in the form of GitHub issues.

Can Python handle web/enterprise projects? #19

Open ctapobep opened 1 year ago


Probably the best way to characterize Python is: it's programming for non-programmers. While it's a popular choice for quick, right-now-and-right-here scripts, can it actually handle web/enterprise software that serves its users for years? I’ll try to convince you that it would be a very poor choice for such a task.

What’s important when comparing platforms?

First, what is a platform? It’s a sum of the supported language(s) + SDK + community-written tools, libraries, frameworks.

I'll be mostly comparing Java vs. Python and JavaScript vs. Python. But whenever you see Java, you can probably think of .NET too - both of these are high-quality platforms with very experienced communities. JavaScript is here because it's a scripting language, same as Python. And in some aspects it's easier to discuss the design choices of Python through that prism.

When comparing Java and Python, many people would say that Python is slower but simpler (probably referring to the simplicity of Python 2). But I don't think that performance is the biggest problem of Python. After all, a typical web/enterprise app spends most of its time querying databases. So good DB architecture and optimal querying should mostly be sufficient to write a well-performing app.

The same goes for syntax. Who cares, right? After trying Java, Python, JS, Ruby, Groovy - this seems to be irrelevant. There are some personal preferences of course, but in the end I wouldn't put much emphasis on it.

We also will not discuss dynamic vs. static typing. I think everyone who has tried both understands that static typing wins hands down. It’s way faster and easier to write apps if you have compile-time checks at your disposal. But since this topic is too popular on the web anyway, why repeat the internet?

The biggest problem of Python in my opinion is in its weak (in terms of programming skills) community and its poorly designed platform. Most of the tools needed to create apps are riddled with bad choices. And what's more surprising, Python's SDK and built-in tools in some cases are no better. Now let me list some examples.

Building and dependency management

Managing dependencies

This is one of the Achilles heels (yep, plural) of Python. You can’t just tell Python “I want this package to be available, but not the other”. Everything inside site-packages is available to any script or app! Which also means there’s no way to have multiple versions of the same library (used by different apps).
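As a rough illustration of that point, the sketch below prints where the running interpreter looks for installed packages. It uses only the standard `sys` and `site` modules; the helper name `import_search_path` is made up for this example.

```python
import site
import sys


def import_search_path() -> list[str]:
    """Return the directories the interpreter searches on every `import`."""
    return list(sys.path)


if __name__ == "__main__":
    # Every distribution installed into these directories is importable
    # by any script started with this interpreter.
    for directory in site.getsitepackages():
        print("site-packages:", directory)

    # The search path is global to the process: there is no per-module way
    # to say "this package is visible, but that one is not".
    for directory in import_search_path():
        print("sys.path entry:", directory)
```

Running it with two different interpreters (or inside and outside a virtual environment) shows different directories, which is the only isolation boundary available.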

Therefore there are plenty of tools that “solve” these problems (pyenv, venv, virtualenv, etc) using horrible hacks like “let’s create a virtual env for this project with its own copy of python (it’s not even a reference!), and throw in some files that override the default python modules (site.py) - and that’s how we isolate our dependencies”.
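To make the mechanism a bit more concrete, here is a small sketch that creates an environment with the standard `venv` module and reads back its `pyvenv.cfg`. The helper `create_env` and the directory name `demo-env` are hypothetical; whether the interpreter inside the environment is a copy or a link depends on the platform and the options used, and other tools (virtualenv, pyenv) wire things differently.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target: Path) -> dict[str, str]:
    """Create a virtual environment and return the settings from pyvenv.cfg."""
    # with_pip=False keeps the sketch fast and avoids bootstrapping pip.
    venv.create(target, with_pip=False)

    settings = {}
    for line in (target / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "demo-env"
        settings = create_env(env_dir)
        # "home" points back at the interpreter the environment was made from.
        print("home =", settings.get("home"))
        # The environment gets its own site-packages directory; that separate
        # directory is what isolates one project's dependencies from another's.
        for site_packages in env_dir.rglob("site-packages"):
            print("isolated packages live in:", site_packages)
```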

The reason all packages are always available (I think) is because when you’re scripting, you don’t really want to bother defining a list of dependencies. Whatever you installed - let’s make it available to whip up the script faster. NodeJS has a similar philosophy, but they made it straightforward and standardized: each project can have its own node_modules/ right in it. So for scripting you can still install things globally, but when working on apps you can isolate them easily by placing their own dependencies within the project boundaries.

Java, on the other hand, has an even better solution. Every time you start JVM, you can explicitly list the dependencies that are available to it. So tools like Maven (most widely used for building Java projects) can keep all the dependencies on your machine in the same folder. And when building a project, it can simply reference the right version of the right lib. No need to duplicate dependencies or manage any virtual envs!

Additionally, NodeJS and its NPM make it possible to use multiple versions of the same dependency in the same project. Which may solve some conflicts, but overall this problem should be solved differently (more about it later).

Build tools and project structure

Python authors introduce specs to standardize something funny like how many empty lines there should be between 2 methods, but they can’t standardize something important like project structure.

Here’s what the Java community has:

In Python:

No IoC/DI tools

In fact there may be some, but they aren’t integrated with MVC frameworks. What’s worse, it doesn’t seem like the community understands what these tools are for. FastAPI has a feature called Dependency Injection. That immediately lightened the mood, until I tried it... It appears it’s not Dependency Injection at all! They decided to call a method that returns a value (which can later be used as a parameter in the request-processing method) Dependency Injection! So disappointing...

DI is important for our apps because it allows us to move away from Singletons, Factories, Registries. This makes the code cleaner and more modular. In particular, it allows us to override which objects to pass from the outside in a different situation (e.g. in tests, or in some non-standard environment).
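For readers who haven't used DI, here is a minimal sketch of the idea in plain Python, without any container or framework. All class names (`SmtpMailer`, `RecordingMailer`, `SignupService`) are hypothetical and exist only for this illustration.

```python
class SmtpMailer:
    """Production collaborator (hypothetical): would talk to a real SMTP server."""

    def send(self, to: str, body: str) -> None:
        raise RuntimeError("no SMTP server in this sketch")


class RecordingMailer:
    """Test double that only remembers what it was asked to send."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class SignupService:
    """Receives its mailer from the outside instead of looking it up itself."""

    def __init__(self, mailer) -> None:
        self._mailer = mailer

    def register(self, email: str) -> None:
        self._mailer.send(email, "Welcome!")


if __name__ == "__main__":
    # The "composition root": the only place that decides which objects exist.
    # A test passes RecordingMailer here; production would pass SmtpMailer.
    mailer = RecordingMailer()
    service = SignupService(mailer)
    service.register("user@example.com")
    print(mailer.sent)
```

`SignupService` contains no Singleton lookup and no factory call, so swapping the mailer needs no patching. A DI container automates the wiring done by hand in the last block.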

No interfaces

This is the other Achilles heel of Python (Python is bipedal, who would’ve thought ¯\_(ツ)_/¯). The problem is: this doesn’t allow defining standard APIs easily. So what we end up with is many libraries not following the specs! This causes further problems: the lack of interfaces and weak standards means you can’t build generic tools upon those abstractions.

If you tell a Java developer that there are DB connectors that don’t follow the standard (in Java it’s JDBC), they wouldn’t believe you. For them it’s a completely alien, unnatural idea. So what we have in Java:

While in Python there’s DB-API too, it doesn’t come in the form of a library with interfaces - it’s just a text document. Which results in many libs not following the spec. And those that do follow - who knows how well they do? And so we end up with a DB Connector (like psycopg2 for PG) that comes with its own DB Pool! So what, each connector should come with its own set of additional utilities? Can’t we separate these responsibilities so that the same tool could work with any DB?

This ability to reuse code is one of the superpowers of abstractions, but the Python community ignores it like it’s not important.
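As a sketch of the kind of generic tool meant here, the helper below touches only methods named in PEP 249 (DB-API 2.0), so in principle it works with any connector that really follows the spec. The function name `fetch_all` is hypothetical, and `sqlite3` is used only because it ships with Python.

```python
import sqlite3


def fetch_all(connection, sql: str, params: tuple = ()) -> list[tuple]:
    """Run a query using only methods named in PEP 249 (DB-API 2.0)."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        cursor.close()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.cursor().execute("CREATE TABLE users (id INTEGER, name TEXT)")
    connection.cursor().execute("INSERT INTO users VALUES (1, 'alice')")
    connection.commit()
    # Parameter placeholders are connector-specific ("paramstyle" in PEP 249):
    # sqlite3 uses "?", other connectors may use "%s" or named styles.
    print(fetch_all(connection, "SELECT name FROM users WHERE id = ?", (1,)))
    connection.close()
```

Nothing checks at import or build time that a given connector actually provides these methods with these semantics. The placeholder comment shows one place where the spec itself allows connectors to differ.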

Breaking backward compatibility

The third Achilles heel (what is this creature?!). I can understand the other problems (we don’t always have time and budget to make high quality decisions). But this one is just negligence. A lot of very complicated problems can be solved by “just” keeping backward compatibility.

Remember the problem with one lib requiring dependency version A, while another library requires version B? That won’t matter if version B keeps backward compatibility!

Hence no need to specify version ranges that your library supports:

Since Python breaks its backward compatibility readily, so does the community. Why on earth would you abandon support for some Python 3.6? Why do I have to upgrade my Python just because your library decided to use some feature that saves you one line of code? What if there’s another library that’s incompatible with the new version of Python (because Python decided to break the compatibility too)?

In Java backward compatibility is one of the most important virtues. If Java had 10 commandments, DO NOT BREAK BACKWARD COMPATIBILITY would be the 1st one. This is what allows me to easily start an app that I didn’t touch for years.

If you write some short scripts just for your own sake, fine - do what you want. But if you’re building an app or a library, it must be maintainable for years. And keeping backward compatibility is one of the most important steps towards that.

Absence of annotations

Yet another situation where Python decided to mix two concepts: annotations and what you do with them: @and_i_mean_decorators. If there are strong reasons to have decorators - fine. But give us some other way to annotate the fields without assigning annotations any particular meaning/implementation. This will allow the community to build many nice libraries and frameworks!

Without annotations, how do you add metadata to your fields, classes, methods? And without that how do you configure your frameworks? A common theme in Java for instance is for JSON-serialization libs to rely on annotations. I mean things like that:

@JsonName("myFieldName")
int my_field_name;

Do you know what Python developers have to do instead? They create static fields with the same name as instance fields and assign some values with configuration to them. The world’s gone crazy.
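Roughly, the pattern looks like this with the standard `dataclasses` module (third-party libraries have their own variants of the same idea). The `json_name` metadata key and the `to_json_dict` helper are hypothetical, invented for this sketch.

```python
from dataclasses import dataclass, field, fields


@dataclass
class User:
    # The class-level assignment carries the configuration for the field:
    # what would be an annotation in Java is a value assigned to the name.
    my_field_name: int = field(default=0, metadata={"json_name": "myFieldName"})
    age: int = 0


def to_json_dict(obj) -> dict:
    """Hypothetical serializer: renames fields according to their metadata."""
    result = {}
    for f in fields(obj):
        key = f.metadata.get("json_name", f.name)
        result[key] = getattr(obj, f.name)
    return result


if __name__ == "__main__":
    print(to_json_dict(User(my_field_name=7, age=3)))
```

The metadata lives in whatever object the library chose to put on the right-hand side of the assignment, so each library defines its own carrier instead of sharing one neutral annotation mechanism.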

Logging

This is a weird problem, but I’ve never seen any Python library that writes logs properly. It’s hard to qualify what “properly” means in this context, but no matter how you define it - Python libraries don’t do it.

Here is something that’s expected (and typically not done) in terms of logging:

Also, Python authors, please add this super important feature to your logger: MDC (mapped diagnostic context).

Poor design choices all over the terrain

I’ve tried some popular tools in Python, and how do I put it... Recently I started to experience issues with my jaw joints. And I’m pretty sure it’s because of how many times my jaw dropped. Let me list a few prominent examples that I remember; they all speak to the quality of code the community produces.

Pydantic

I think the train of thought was: Let’s create a library to de/serialize objects from/to JSON. Done! Too simple... Let’s also add validation on top of it. Hm... Why should validation be a part of the JSON serializer, and not a completely separate library that can be used outside of the JSON theme? Weeeell, we’ll concoct some story that it’s relevant because... Okay, okay, we’ll have all the time in the universe to figure out why we want to mix these concepts. … after all the time in the universe ran out… Let’s also add an ability to read environment variables and configure the app!

I didn’t look at the commit history and I don’t know if that was the actual sequence of steps. In fact, I’m quite sure that validation came first, and exporting to JSON was added later (which makes it worse if you think about it). The point of the story - you need to separate the concepts! These are 3 completely unrelated use cases and there should be 3 completely separate libraries.

It should be possible to use just one capability so that other libraries/apps could depend on this one for the right reason. If in my code I need validation, but I have to pull code related to JSON and settings-parsing, I’d think twice. In fact, I wouldn’t even think - I wouldn’t trust it for sure.

Also.. Using inheritance? Yikes. In the OOP realm, inheritance is frowned upon and is used only as a last resort (look up Inheritance vs Composition).

FastAPI

I don’t understand why not just look at what other platforms have implemented (like Spring MVC). From the design perspective, it’s an already solved problem. Why create tools that are worse than what’s already out there?

A list of problems:

Overall I’m more surprised that the Python community hasn’t created these frameworks sooner - this seems like an obvious need.

Alembic

Tools for DB Migrations are relatively simple, and they are relatively easy to implement. So why, why, why would you depend on a tool that’s 100 times more complicated (SQLAlchemy) than the problem you’re trying to solve? It’s like hunting a fly with a bazooka.

What’s worse - there isn’t much choice. The community hasn’t created any good DB Migration tools yet. The closest to what we really need for a typical web app is yoyo-migrations. But it’s quite young and still needs important features to be added (like checking that already applied migrations haven't changed). Yoyo team, if you ever read this, please check out Flyway - it’s awesome. And remember we were talking about DB-API? DB Migration tools must depend on it, and not create their own connections by their own means! Especially if the connection details are passed as a string, which makes it impossible to have a password with special symbols o_O
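To show what "checking that already applied migrations haven't changed" could mean, here is a minimal sketch, not how yoyo-migrations or Flyway implement it. It stores a checksum next to each applied migration and refuses to continue if a script was edited afterwards. The table name `schema_history`, the function names and the choice of SHA-256 are assumptions made for this example. The function takes an already opened connection rather than a connection string.

```python
import hashlib
import sqlite3


def checksum(sql_text: str) -> str:
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def apply_migrations(connection: sqlite3.Connection, migrations: dict[str, str]) -> list[str]:
    """Apply pending migrations; fail if an already applied one was edited."""
    # connection.execute() and executescript() are sqlite3 conveniences,
    # not part of DB-API; they just keep the sketch short.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS schema_history "
        "(name TEXT PRIMARY KEY, checksum TEXT NOT NULL)"
    )
    applied = dict(connection.execute("SELECT name, checksum FROM schema_history"))
    newly_applied = []
    for name in sorted(migrations):
        digest = checksum(migrations[name])
        if name in applied:
            if applied[name] != digest:
                raise RuntimeError(f"migration {name} changed after it was applied")
            continue
        connection.executescript(migrations[name])
        connection.execute("INSERT INTO schema_history VALUES (?, ?)", (name, digest))
        newly_applied.append(name)
    connection.commit()
    return newly_applied


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    scripts = {"001_create_users": "CREATE TABLE users (id INTEGER PRIMARY KEY);"}
    print(apply_migrations(db, scripts))  # first run applies the script
    print(apply_migrations(db, scripts))  # second run has nothing to do
    scripts["001_create_users"] += " -- edited later"
    try:
        apply_migrations(db, scripts)
    except RuntimeError as error:
        print("refused:", error)
```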

But Java also is not perfect!

Okay, to be fair - Java has its problems too. E.g.:

- Logging is also a problematic part. For a different reason - there are too many logger libraries. It’s not that trivial to set up logging so that all the libraries redirect their logging to a single one, which is configured by our app.
- Java community overall is getting weaker by the day. There’s a pesky parasite in the community - SpringSource. They produce very popular tools and therefore they have a lot of impact on the community. And in the recent 10 years they have been corrupting the community from within, creating crappy tools & approaches like package scanning, and finally Spring Boot. Violating basic good practices like “explicit code is better than implicit magic”. Because of this and because of how much one has to learn in Spring, newcomers spend time studying useless concepts. They are just soooo bad at programming, it’s crazy.

Summary

Honestly, I haven’t spent much time programming in Python. So I could list only obvious, very noticeable problems. I’m sure there are many, many more. And this makes Python a very bad choice to write web/enterprise apps that have to be developed and maintained for years.

It’s certainly much slower than writing in Java because of the lack of compile-time checks, constant compatibility issues, bugs, and the complexity that comes with bad decisions. But you're also getting used to working in a low-quality environment. Which means that you’ll never learn how to code properly.