alan-turing-institute / data-safe-haven

https://data-safe-haven.readthedocs.io
BSD 3-Clause "New" or "Revised" License
57 stars 15 forks source link

Fix pulumi colours #2113

Closed jemrobinson closed 1 month ago

jemrobinson commented 1 month ago

:white_check_mark: Checklist

:vertical_traffic_light: Depends on

:arrow_heading_up: Summary

IMPORTANT: I am confident that merging this will not affect the findings of the ongoing penetration test

Fix Pulumi colours

Before

Screenshot 2024-08-08 at 16 06 05

After

Screenshot 2024-08-08 at 16 17 30

:closed_umbrella: Related issues

Closes #2112

:microscope: Tests

Tested by rerunning against an existing SRE.

github-actions[bot] commented 1 month ago

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  data_safe_haven/infrastructure
  project_manager.py 82
Project Total  

This report was generated by python-coverage-comment-action

jemrobinson commented 1 month ago

The current process looks like the following:

  1. The raw output from Pulumi logging has ANSI colour codes embedded in it (e.g. "\x1b[38;5;13m\x1b[1mDuration\x1b[0m 45s")
  2. We then pass this string to ProjectManager::log_message()
  3. This passes the string on to logging::from_ansi()
  4. Inside this function we convert the coloured string to a rich.Text object (e.g. <text 'Duration 45s' [Span(0, 8, Style(color=Color('color(13)', ColorType.STANDARD, number=13), bold=True))]>)
  5. This is then passed to logging.logger.info
  6. At some point before this object is passed to the rich.Handler it's converted from a rich.Text object to a string, dropping the formatting along the way
  7. The rich.Handler then prints the non-coloured string to the console

The change in this PR does the following:

  1. The raw output from Pulumi logging has ANSI colour codes embedded in it (e.g. "\x1b[38;5;13m\x1b[1mDuration\x1b[0m 45s")
  2. We pass this directly to logging.logger.info
  3. No string conversion needs to take place since this is already just a string
  4. The rich.Handler then prints the coloured string to the console

TL;DR

There's no problem with how rich is handling the string conversion - it's just that logging doesn't want to pass rich.Text objects around internally.

The other option I considered was to convert the ANSI markers to rich markup that would pass through the logging internals equally well, but there doesn't seem to be much point in converting from ANSI just to convert straight back again.

jemrobinson commented 1 month ago

Note on the strip_ansi() function:

It's because you quite often get Pulumi output that looks like this:

"\033[31;merror:\033[0m Code='AnErrorCode'"

Screenshot 2024-08-09 at 10 48 39

which we'd like to log as "Pulumi output: Code='AnErrorCode'".

If we try to do error_line.replace("error:", "").strip() then the first whitespace character is masked by the ANSI colour codes so you get:

"Pulumi output: \033[31;m\033[0m Code='AnErrorCode'".

which looks like

Screenshot 2024-08-09 at 10 48 56

and effectively a double-space in the output.

If we can strip the ANSI codes first, then the strip() call will work. There are other ways to strip ANSI codes but they mainly involve complicated regexes. If you'd prefer to do this then we can.

I'll move this commit to the other PR though.

JimMadge commented 1 month ago

@jemrobinson Could you check if https://github.com/alan-turing-institute/data-safe-haven/tree/pulumi_colour does the same? It seems to work for me.

I can't quite remember the thinking behind removing the colour, I think it was more about removing the ANSI codes. The logging handler doesn't seem to work with rich Text (which is odd given it is a rich handler) but you can convert the Text object to a string with rich markup like we use for other log messages.

jemrobinson commented 1 month ago

Yes it works, I've already tested this. I don't really see the benefit of converting ANSI-text to markup if it's not causing problems though: see my note above: The other option I considered was to convert the ANSI markers to rich markup that would pass through the logging internals equally well, but there doesn't seem to be much point in converting from ANSI just to convert straight back again.

I don't think we need the log_message or from_ansi functions at all.

NB. The logging handler does work with rich.Text objects, but the Python logging library converts these to str before they make it as far as RichHandler (I tested this by adding debugging print statements throughout the logging and Rich libraries).

JimMadge commented 1 month ago

Before the ANSI codes were printed.

If that's not true now, I'm not sure what has changed but we wouldn't need the extra handling. Otherwise, I think converting from ANSI markup to rich markup is most sensible because that is what we use elsewhere.

jemrobinson commented 1 month ago

I don't see printed ANSI codes with this change, and I'm not sure why this was happening in #1952. Note that in 5.0.0-rc1 we also passed Pulumi output straight to the logger (see below) and this was printed correctly.

https://github.com/alan-turing-institute/data-safe-haven/blob/afb29b42510af9fa261b891348fb08d8baddca2b/data_safe_haven/infrastructure/stack_manager.py#L310-L318

If you think there are other benefits to converting to rich markup, I'm happy to restore a helper function that does so.

JimMadge commented 1 month ago

Do we need the changes here then? The strip_ansi stuff feels like a lot of complexity to remove a space.

I feel like that is more likely to cause problems than just pre-pending the full Pulumi error with "Pulumi error:", replacing anything.

jemrobinson commented 1 month ago

Do we need the changes here then? The strip_ansi stuff feels like a lot of complexity to remove a space.

  1. We do need the changes that replace the from_ansi function (or something equivalent)
  2. We don't need the strip_ansi changes - the change you suggest would also remove the need for us to try to guess the structure of a Pulumi error message. It might look a bit like this:
Screenshot 2024-08-09 at 12 38 01
jemrobinson commented 1 month ago

strip_ansi() removed in 51a0ff935. Are you happy to merge @JimMadge ?