Closed vfazio closed 6 months ago
update cleo to not flip the
_decorated
private variable when formatting messages since there's a public interface for that value
seems like the correct fix
update cleo to not flip the
_decorated
private variable when formatting messages since there's a public interface for that valueseems like the correct fix
It does seem like a better long term fix, but cleo looks to be in the middle of a large update, which would require either a patch release and a dependency bump for poetry or waiting until the next release with the fix.
The code change to accommodate this is also not "trivial" as there are about 4 function calls that would need to be made "non-decorated" aware. Cleo doesn't advertise being thread-safe afaik but i didn't read too deeply. The concurrent use by poetry does seem "special"
Poetry makes the assumption that this value should never change in the middle of performing a list of operations. Modifying that behavior to handle dynamic changing seems overkill in this scenario.
Option 1 (caching the value on init) is the least painful, obviates waiting for cleo and a dependency bump, and also is a bit more performant in terms of querying the property.
that mostly is a discussion you ought to have with the cleo maintainers, who hopefully will have opinions on whether it is intended to be thread-safe, whether a fix is desirable, whether now is a good time, etc - suggest raising an issue over there and starting that conversation
I'll write the issue, but i'm curious why patching poetry in the interim is not an option
I have no particular view (or say) on whether patching poetry in the interim is an option. I am saying that it seems to me that the correct fix is in cleo and so you should explore that first
I opened an issue so at least a conversation can be had and documented. I should have done that first, thanks @dimbleby
My two cents: I do not expect a quick cleo release with a fix and think we can include #9335 in Poetry 1.8.3. (I'm just waiting for #9341.)
Thanks for the detailed analysis @vfazio 👍
I will take a look at it on the Cleo side, but in the mean time it could be fixed on Poetry side.
My two cents: I do not expect a quick cleo release with a fix and think we can include #9335 in Poetry 1.8.3. (I'm just waiting for #9341.)
Thanks for the detailed analysis @vfazio 👍
@radoering @Secrus out of curiosity, is there a "timeline" for poetry 1.8.3? I'm currently planning to migrate to 1.8.2 but if 1.8.3 is on the horizon, I'll hold off if it will include this fix.
There is no timeline, but since the backports workflow is working again, I assume within the next few days.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Description
There is a subtle and difficult to reproduce race condition in poetry when using decorated output (default) and parallel installs (default) which leads to poetry randomly failing to install dependencies without displaying the cause of the error.
The gist is this:
Multiple install tasks are queued up:
When
self._execute_operation
runs we have branching logic:If fancy output is not supported, a line is written to the output, otherwise a new section is added to our dictionary of sections and then we try to perform the operation.
The operation runs via _do_execute_operation -> execute{op}
The operation prints out a message:
and _write will try to update the section if decorated output is enabled:
The problem, however, is this:
However,
SectionOutput._write
->add_content
->remove_format
causes the formatter, which is shared by the IO object and the sections, to flip_decorated
to False_execute_operation
and seessupports_fancy_output
is False, so prints out to console and does not add a section to theExecutor._sections
dictionary_decorated
to True_do_execute_operation
which calls intoExecutor._write
. This seessupports_fancy_output
is True so assumes there should be a section for this operation. The section does not exist so a KeyError is thrown.Executor._execute_operation
tries to handle the KeyError and sees thatsupports_fancy_output
is True, so calls back intoExecutor._write
which throws yet another KeyError that's raised from within the Exception handler so no output is displayed.This is what the splat may have looked like if we checked for
isinstance(e, KeyError)
and didn't try to update the section we thought existed (note i had debug code in place at the time of this)There are probably a number of ways to fix this issue:
1) cache
self._io.output.is_decorated()
at the time the Executor is init'd. 2) update cleo to not flip the _decorated private variable when formatting messages since there's a public interface for that value 3) update cleo to wrap that private variable in some sort of locking mechanism thatremove_format
andis_decorated
respect 4) updateExecutor._write
to check if the operation is in the section dictionary in a more graceful way 5) updateExecutor.supports_fancy_output
to leverageself._lock
before querying the backing IO object's formatter. Since sections get updated under this lock, the formatter's _decorated value should be restored by the time the lock is released, however this forces even more things to serialize over this lock which is not necessaryI'll make a PR for 1, since it's a minimal amount of code changes, doesn't complicate any of the section logic any more than is necessary, doesn't require buttressing any of the exception handling logic, and doesn't add yet another synchronization point over
Executor._lock
. It should allow the value to be stable over time even if cleo flips the formatter decoration off temporarily. I ran this fix in a loop over the course 24 hours and did not see any issues whereas without the fix reproduction would occur within 5-10 minutes.Note that this can be very difficult to reproduce. I have had luck reproducing on slow hardware under an emulator for a foreign architecture making it even slower so the time window where the formatter decorator is false is large enough for another thread to hit the issue (this is how I initially discovered it in one of our CI pipelines) but short enough for the other thread to see the value flip back to true when trying to print a progress message.
Adding a sleep in cleo/formatters/formatter.py::Formatter.remove_format or a series of events/signals may be easier
Workarounds
Run poetry with
NO_COLOR=1
defined in the environment Run poetry with the--no-ansi
argument on the command line Run poetry with the-q
argument on the command linePoetry Installation Method
install.python-poetry.org
Operating System
Debian 12
Poetry Version
Poetry (version 1.8.2)
Poetry Configuration
Python Sysconfig
No response
Example pyproject.toml
Poetry Runtime Logs