sublimehq / sublime_text

Issue tracker for Sublime Text
https://www.sublimetext.com
809 stars 39 forks source link

Turn on basic Python optimizations #2139

Closed gerardroche closed 4 years ago

gerardroche commented 6 years ago

Sublime Text doesn't turn on basic Python optimizations (-0 and -00).

It's also not possible turn them on via environment variable PYTHONOPTIMIZE.

The side effect of this means the __debug__ constant is always True and assert statements are not disabled/stripped in the production releases of plugins.

I suggest turning them both on.

The impact of the first optimization -0, which turns on the most basic of optimizations, is in my estimation neglible (unless I'm missing something). The main things it does is set the __debug__ to False and strip assert statements. Assert statements by design are development only utilities. Most packages don't use python code and are not possibility of being impacted directory. Plugins that take advantage of assert statrements are most likley using them correctly, the remainder (if any) that might have issues are sure to be few and far between.

The impact of -00 which discards docstrings in addition to the -0, impacts plugins that use docstrings as part of their functionaility. Very few, if any. But this optimization certainly has the possibility of having a higher impact threshold than the most basic of optimaztions. But still unlikely to have much of an impact (again, unless I'm missing something crucial).

Re: https://forum.sublimetext.com/t/turn-on-basic-python-optimizations-via-pythonoptimize/32321

Thom1729 commented 6 years ago

If the impact of these optimizations would be negligible, then what would be gained by enabling them?

FichteFoll commented 6 years ago

Adding the "C: API" label since it's the most relevant one.

As mentioned in the thread, I'm -1 on this as changing something in the runtime for negligible performance improvements is usually not something you'd do lightly. As you mentioned, plugins should depend on neither of these features, but I don't see a reason for changing something that isn't broken (or obviously flawed).

The removal of assert statements would be noticeable by plugin authors during dev cycles or initial development, since they won't be available at that time either and cannot be enabled partially in a shared environment. It's also unexpected as the default Python environment has them enabled and could lead to lots of head scratching. Doc strings could be useful for helper plugins that inspect a command's docstring, for example.

I could see stripping docstrings being a memory usage improvement, but probably only by a few KB. Stripping assert statements should be immeasurable.

gerardroche commented 6 years ago

I don't see the performance benefits as necessarily negligible. I think the impact is negligible, but by impact I mean negative impact.

Dev authors should disable the optimizations. By using the environment variable PYTHONOPTIMIZE. I don't think this is a big deal for devs who are using things like assert statements, because really they probably already understand the point of them. And if not, it's not big deal groking the new change.

I don't get the head scratching. You learn it once and setup your dev box accordingly.

I see the docstrings the same again, you can disable the omptizations during dev cycles where it makes sense and matters to you.

I only see positives in enabling the optimizations.

FichteFoll commented 6 years ago

I only see more work in enabling these optimizations.

wbond commented 6 years ago

I don't foresee this happening in the current Python 3.3 environment, just because it will be a backwards incompatible change that could break a bunch of packages.

Perhaps if we introduce a new Python 3.7 env, or something along those lines, we could enable it then.

FichteFoll commented 4 years ago

This was discussed earlier on the Discord. The new environment has optimizations enabled unconditionally.

My concerns raised earlier have been cleared by there potentially being a flag to disable them on demand so that plugin authors can still used them in their code that directly interfaces with ST while debugging it. There are clearly some benefits from using optimizations in a "release" environment, which the ST plugin host arguably is, notably for ST-independant modules that can be tested externally, and with the potential of this becoming disableable and not effectively voiding all use of assert in direct plugin code, I'm +0.5 on this.

kaste commented 4 years ago

For SublimeLinter, I actually read __doc__ of the installed plugins.

Assert statements by design are development only utilities.

Erm, no. assert is used and shipped to production for potential/hazard bugs. The program MUST stop otherwise it would do bad things. You explicitly (in that scenario) don't need them on your dev machine, because none of your tests can reproduce the bug. (In the web scenario: you send the error to the monitoring system, and the user gets a 5xx.)

Practically: I have a function that only works if it runs on the worker thread. You use assert running_on_worker_thread(). Running on a specific thread is not an input to output condition, sure your best intention as a dev is that you can hold this assertion, and if you were bug free then you could remove the assert (usually manually not via -O, as I said it never happens during development anyway. Development is the happy world.).

You either do assert on_worker() or if not on_worker(): raise RuntimeError. The simple assert denotes optimistically that you have hope that you can remove that statement at one point. A library like asyncio would always raise RuntimeError in such cases because it can never be sure that a function is called in the right context, in an app you control all call sites, sure you have hope.

In Sublime, you cannot control all programs written and all authors writing, so you MUST stop after a failing assert otherwise bad things can happen. (Or parse the source code for assert and inform the user.)

deathaxe commented 4 years ago

A stateful MySpecialRuntimeError may be a better choice to selectively handle oridnary runtime errors than a general AssertionError.

It is and keeps a development only utility as part of the design-by-contract programming and should not be abused for general runtime checks for convenience reasons.

I like the description about assert expression being ...

if __debug__:
   if not expression: raise AssertionError

... which clearly states its purpose. The fact it is disabled by -O further emphasizes its meaning.

Skimming through the asyncio library quickly reveals the differences between situations an assert statement is used to catch some dev only errors and when stateful exceptions are raised to catch ordinary runtime errors.

The whole topic is also discussed at https://stackoverflow.com/questions/5142418/what-is-the-use-of-assert-in-python#5142453

kaste commented 4 years ago

It is and keeps a development only utility as part of the design-by-contract programming

That just isn't true, and asyncio is actually an example for that. It is mostly used to mark and spot bugs in your code, the bugs you actually ship to production. The convention is rather, whenever someone throws an AssertionError at you, you actually hit that edge case the developer wants to know about, please report it. (But then asyncio actually throws it surprisingly when you try to set some non event loop in set_event_loop.)

When you have a specific rule in a company/code base, you also have the gatekeeper: you either accept or reject PRs with assert statements in it. But nobody controls the programs or programmers here. And clearly nobody controls the python community.

I wonder if this is even fact driven. We have an issue with zero upvotes, some objections, so clearly just let's do it. By what margin does this make Sublime apps faster, really? Safer? More conventional? Basically nobody does the contract thing in a strict manner, so we basically rip 20 assert statements in production.

wbond commented 4 years ago

Erm, no. assert is used and shipped to production for potential/hazard bugs. The program MUST stop otherwise it would do bad things.

This is just not true, in terms of how Python is implemented. What you are describing is how exceptions works. Assertions are not checked when optimizations are turned on.

If you want to always error out, the correct tool is an exception. That is how Python was designed and implemented. Assertions are extra checks, typically used for slow checks that you want to be able to check during development, but skip during production.

From https://wiki.c2.com/?WhatAreAssertions:

Normally assertions are turned off for builds that are released to actual customers for performance reasons.

You can disagree, but based on how Python is implemented, to disable assertions checks when optimizations are turned on, means they are designed to work the way described above.

By always enabling the debug mode of Python, it makes it impossible for developers to use assertions correctly. This change will instead allow assertions to be used correctly. Developers will always be able to use exceptions properly, and it sounds like that it what you want to be using.

Practically: I have a function that only works if it runs on the worker thread. You use assert running_on_worker_thread().

This should be:

if not running_on_worker_thread():
    raise RuntimeError('Invalid thread state')

Because you want to run this check even when optimizations are turned on, you don't want to use an assertion. Assertions are checks that should be disabled when optimizations are turned on.

denotes optimistically that you have hope that you can remove that statement at one point.

No, it denotes that you don't want to run the check when optimizations are turned on. This is evidenced by how Python is implemented.

In Sublime, you cannot control all programs written and all authors writing, so you MUST stop after a failing assert otherwise bad things can happen.

No plugins will run in Python 3.8 unless the developer tells them to. If the developer decides to use the wrong tool (assertions) and opts in the to the new environment where assertions are not checked at run time, it is not Sublime's fault.

kaste commented 4 years ago

I know the game you're playing here, but if you would read the stdlib: there is no such strict convention. You make it up. (I don't say they couldn't be such a thing, but just there isn't even in the stdlib.) Happy to see you filing bugs with "This should be:" against python core.

wbond commented 4 years ago

I’m not playing games here, and I think it is disingenuous of you to suggest so. We are using a documented, supported feature of Python 3.8.

It appears that you have philosophical disagreement with this feature of Python, but I don’t think that is a good reason for us to not make it available to other developers.

deathaxe commented 4 years ago

The stdlib just ships with assertionts for what they are ment to be - support a developer with debugging issues while developing a plugin - without compromizing production code's performance.

This is a very basic concept to add debugging support to code, just as you might want to add different levels of log messages via log.DEBUG, log.WARN or log.ERROR. No ordinary user wants to see the DEBUG messages. Maybe not the best comparison though.

kaste commented 4 years ago

It appears that you have philosophical disagreement

Here is a you against us social game, and before it was a blame game: You're doing it wrong, you should instead do...

I'm saying there is no such convention, there could have been but that ship sailed.

From the stdlib. All this went through numerous reviews, and it is also rather new code. https://github.com/python/cpython/blob/3.8/Lib/asyncio/events.py#L644-L648

    def set_event_loop(self, loop):
        """Set the event loop."""
        self._local._set_called = True
        assert loop is None or isinstance(loop, AbstractEventLoop)
        self._local._loop = loop

Clearly they meant to raise a TypeError here.

The same. https://github.com/python/cpython/blob/3.8/Lib/asyncio/events.py#L99-L105

class TimerHandle(Handle):
    """Object returned by timed callback registration methods."""

    __slots__ = ['_scheduled', '_when']

    def __init__(self, when, callback, args, loop, context=None):
        assert when is not None

This is also very lazy written.

Then https://github.com/python/cpython/blob/3.8/Lib/asyncio/subprocess.py#L166-L171

    async def _read_stream(self, fd):
        transport = self._transport.get_pipe_transport(fd)
        if fd == 2:
            stream = self.stderr
        else:
            assert fd == 1

This looks superfluous, as if just for documentation.

But then, https://github.com/python/cpython/blob/3.8/Lib/asyncio/futures.py#L318-L326

def _copy_future_state(source, dest):
    """Internal helper to copy state from another Future.
    The other Future may be a concurrent.futures.Future.
    """
    assert source.done()
    if dest.cancelled():
        return
    assert not dest.done()

Here you cannot tell if you can actually strip the asserts. There are numerous other assert statement like this where you can't possibly know if python crashes, deadlocks, or does a bad thing if you strip the assert.

So if you think that is contract programming. If you see that as very conventional usage of assert. If you're sure you can strip all these asserts without hard crashes in case of bugs.

To me it seems the advantage of this change is that it actually enables me to write code in python the way it's meant to be 🙄

wbond commented 4 years ago

Here is a you against us social game, and before it was a blame game: You're doing it wrong, you should instead do...

I am going to stop discussing the issue at this current time since it appears my comments are being interpreted as attacking you, as opposed to disagreeing with you about how CPython is designed to work.

deathaxe commented 4 years ago

All of those examples show situations which you just need during dev time.

https://github.com/python/cpython/blob/3.8/Lib/asyncio/events.py#L644-L648

... is part of the BaseDefaultEventLoopPolicy which is basically an internal factory class and needed for those who write a custom event loop only. End-user code never gets into touch with it.

...

https://github.com/python/cpython/blob/3.8/Lib/asyncio/futures.py#L318-L326

The internal method _copy_future_state is called within a closure in another interal _chain_future. The closure is actually the callback for the future to be done source.add_done_callback(_call_set_state).

Therefore the assert source.done() should never be triggerd!


That said, I am with @wbond at this point. Over and out!

deathaxe commented 4 years ago

The python 3.8 runtime works with optimizations enabled. The python 3.3 runtime keeps unchanged for compatibility reasons.

rwols commented 4 years ago

Having optimizations always enabled removes a useful tool from plugin developers. Would it hurt to provide a command-line option for subl, say --dev, that disables optimizations?

deathaxe commented 4 years ago

Also thought about such an option. I guess it's worth a dedicated issue/feature request.

wbond commented 4 years ago

I disagree it removes a useful tool, but rather enables a useful tool. If optimizations are turned off, you have to live with expensive development-only assertions from third party libs, even on end user machines.

The only thing turning optimizations on is not run assert statements. If you want to always throw an exception when a value is true, all you need is a 3 line function for your codebase:

def pre_condition(cond):
    if not cond:
        raise Exception('Precondition failed')

The function assert is this, but designed to be “optimized” away in production.

I would definitely consider a command line flag, but it is probably much more flexible for a plugin to have a “debug” setting and tie that into a custom precondition function.

rwols commented 4 years ago

The per-plugin asserts, each plugin having its own assert function, sounds like a reasonable approach, thanks @wbond.

FichteFoll commented 4 years ago

Since the demand of the issue's OP has been met, I suggest opening a new issue referencing this when there is a need to disabling optimizations on-demand for plugin developers instead of discussing it in this closed issue.