rburns / ansi-to-html

Convert ansi escaped text streams to html.
MIT License
357 stars 48 forks source link

Failure on \r sequences #82

Closed abitrolly closed 3 years ago

abitrolly commented 4 years ago

Why ansi-to-html fails to process the text if it encounters \r sequence?

'\x1b[94mANSI \x1b[39;49mHello \r \x1b[96mWorld\x1b[0m' - this produces only image.

While replacing \r with '\n' prints the rest of the text - 'World`.

image

I expected the converter to ignore unknown chars and continue rendering.

rburns commented 4 years ago

it looks like ignoring the rest of the string is a bug. it should continue processing, and discard things it doesn't understand. I'd guess it came about due to changes for making the two tests here pass https://github.com/rburns/ansi-to-html/blob/master/test/ansi_to_html.js#L38

related pull requests: https://github.com/rburns/ansi-to-html/pull/64 https://github.com/rburns/ansi-to-html/pull/62

mkilpatrick commented 3 years ago

Is there any plan to fix this?

rburns commented 3 years ago

I'd like to fix it. I made a branch here with a failing test. https://github.com/rburns/ansi-to-html/compare/bugfix/82-failure-slash-r-sequence as a start. it's aimed at the minimal solution, ignore the \r sequence. possibly might find a way to treat it as a new line.

vboulaye commented 3 years ago

Hi, I encountered the same issue while trying to parse gitlab-ci traces. I tried to fix it on top of your branch by adding a token definition \r => remove. This seems to work (and the test passes).

However I also think replacing it by a new line would be better and using newLine instead of removeseems to work.

Would you like me to addapt the test and prepare a PR with this solution ?

rburns commented 3 years ago

@vboulaye I agree a new line would be better. A PR would be great. thanks

rburns commented 3 years ago

@mkilpatrick this has been fixed on master. https://github.com/rburns/ansi-to-html/commit/79ef237db934b812d19a749967453a9daff2639b thanks @vboulaye. I'll make a release soon, but if you have a chance to do test before then, any feedback would be useful.

abitrolly commented 3 years ago

Still not fixed with 0.7.1 https://observablehq.com/@abitrolly/ansi-text-viewer

image

abitrolly commented 3 years ago

For comparison Linux terminal.

image

image

rburns commented 3 years ago

@abitrolly I replicated your observableqh tests over here https://github.com/rburns/ansi-to-html/pull/95/files to make sure I hadn't missed something and that \r was processed consistently in the context of other escape sequences. They look alright to me. But, then wondered why there was still erroneous output at the observablehq link. turns out that while version 0.7.1 was referenced elsewhere in the document. the ansitohtml() definition used version 0.6.14. I've modified that in the screenshot below.

2021-07-21-205447_1278x143_scrot

regarding the linux terminal output, I don't think it's reasonably possible to place input following \r at the beginning of the current line as a terminal would. for reasons discussed in this thread https://github.com/rburns/ansi-to-html/issues/14. it's difficult to rewrite past output. I think either one of ignoring the '\r', or treating it as a new line (as it does now) is a reasonable choice.

truncating the output as it was doing previously might have the benefit that things like progress bars don't flood your screen. I don't recall if that behaviour was intentional.

mkilpatrick commented 3 years ago

The change looks good from my testing.

abitrolly commented 3 years ago

@rburns good catch! I removed extra import and now that the 0.7.1 is imported correctly, it gives the acceptable result.

image

Sorry for the false alarm.

I don't think that parity with Linux terminal is needed here. At least it could be made an option if this behavior is needed for some ANSI graphics or 1:1 reproduction is build logs. Even for build logs preserving content is more desirable than overwriting this.