Open vfazio opened 7 months ago
Note that this seems to be an issue because the Formatter object is a shared instance across the IO objects, in Poetry's case StreamOutput and child SectionOutput objects (see StreamOutput.section
)
One "workaround" would be to duplicate the Formatter
object instead of sharing the object. That way the StreamOutput
object could query its Formatter object independent of what the SectionOutput
objects are using. This would allow the sections to manipulate their formatters independently and existing code may "just work" without having to make:
Formatter.remove_format
-> Formatter.format
-> Formatter.format_and_wrap
-> Formatter._apply_current_style
aware that while the formatter supports decorating, this operation should not be decorated.
If that's not ideal because formatter styles need to be kept in sync between the parent output object and child sections, maybe leverage a different private instance flag:
def remove_format(self, text: str) -> str:
self._suppress_decoration = True
text = re.sub(r"\033\[[^m]*m", "", self.format(text))
self._suppress_decoration = False
return text
def _apply_current_style(
self, text: str, current: str, width: int, current_line_length: int
) -> tuple[str, int]:
if not text:
return "", current_line_length
if not width:
if self.is_decorated() and not self._suppress_decoration:
return self._style_stack.current.apply(text), current_line_length
return text, current_line_length
</snip>
if self.is_decorated() and not self._suppress_decoration:
apply = self._style_stack.current.apply
text = "\n".join(map(apply, lines))
return text, current_line_length
Use contextvars and your problems are gone. They are per Thread. You may want to use the scope feature of context vars:
contextvar = ContextVar(..)
...
token = contextvar.set(1)
try:
...
finally:
contextvar.reset(token)
the property can be something like:
@property
def is_decorated(self):
return contextvar.get()
Thank you @devkral. Would you have a minute to make a PR?
Not saying this isn't the right fix, but one of the questions I posited was whether this library was intended to be thread-safe. If it is, this fixes a single instance where there was a thread conflict, but there may be others. If not, then its on the caller to use the library as intended.
@vfazio I'd want it to be thread-safe (at the very least, UI/terminal printing should be), since Poetry uses Cleo in threaded jobs. I don't know of an easy way to check if the code is thread-safe, but if you know, I would be happy to hear about it.
Im on holiday atm but saw these emails come through.
I don't think theres an "easy" way, but the reality is that if any public attribute/property can be changed on an object, its open to a race condition without some critical section. If these classes were frozen, once created the attribute values didn't change, itd be less of an issue.
Thats the crux of this problem if i recall correctly, an internal function was flipping the value of a publicly queryable property which caused a race.
@vfazio I'd want it to be thread-safe (at the very least, UI/terminal printing should be), since Poetry uses Cleo in threaded jobs. I don't know of an easy way to check if the code is thread-safe, but if you know, I would be happy to hear about it.
I just realize i dropped the ball on actually replying to this until i just reviewed the poetry repo for updates.
Poetry already takes responsibility for locking when updating sections, I'm pretty sure that's what this lock is used for
So I don't think Poetry expects cleo to be thread safe in general.
The central issue is that Poetry expected an Output
object's is_decorated
property to be frozen after the class has been instantiated. That would seem mostly reasonable given how the class is declared. However, that's now how the property works since code internal to the formatter flips this value when updating sections.
If Poetry had been made aware that this was not a thread-safe method, they could have just acquired the lock before reading the property:
with self._lock:
decorated = self.supports_fancy_output()
My PR avoided the lock and just cached the value, since we expect that the final output will be decorated based on how the Output
object was constructed. Regardless of the current state of the formatter as sections get written out, we know that we can count on the output being formatted when finally displayed.
https://github.com/python-poetry/poetry/pull/9335
I hopefully didn't give the impression that I thought this library needs to be thread-safe, only that a decision and some communication should be made about the intentions of the library.
If it's not intended to be thread safe, that's 100% ok IMO and consumers need to realize that they're on their own for solving issues that come up.
However, if there is real desire to make this thread safe and to potentially avoid consumers from having to declare their own critical sections, that's a much larger task.
For the formatter, i threw out a couple of bad ideas. Another bad idea is to just use a lock in any method that references to self._decorated
to synchronize access while it's potentially being changed.
However, that's just one property and thread safety can be a game of whack-a-mole. Public methods like format
and format_and_wrap
which utilize an internal stack are likely to also get corrupted if multiple threads have access to the formatter and aren't synchronized.
Let's put it this way: if it can be made thread-safe easily, why not do it? In this case the fix seems quite easy to do, so it will be fixed in the next version
See https://github.com/python-poetry/poetry/issues/9334
Poetry relies on the formatter's
is_decorated
property to be consistent when multiple threads query the property.If one thread is in the middle of updating a section and has changed the formatter's
is_decorated
property viaremove_format
, a second thread may query it and get back an incorrect result.If cleo is not thread-safe, the developers of poetry should be made aware, otherwise there may need to be changes to how
remove_format
andformat
work in relation to whether decorated output is supported in general vs decorated output is being emitted for the current format operation.