objectionary / eo-phi-normalizer

Command Line Normalizer and Rewriter of 𝜑-calculus Expressions (part of EOLANG family)
https://www.objectionary.com/eo-phi-normalizer/
MIT License

the output is not escaped on error #572

Open yegor256 opened 3 days ago

yegor256 commented 3 days ago

I see this in the log:

$ eo-phi-normalizer rewrite --rules 0.yml Foo.phi --single -o Bar.phi
eo-phi-normalizer: syntax error at line 1, column 1 due to lexer error
on input
?.org.eolang.bytes ( ?0 ? ? ? ? 00-00-00-00-00-00-1E-61 ? )

Here, I can't tell whether the problem is with the encoding or whether the input was indeed formatted as ?0 instead of α0. I suggest that you "escape" non-ASCII symbols in the output: instead of printing UTF-8 characters as is, convert them to something like \u045e.

Maybe you can say "on input (non-ASCII symbols escaped)" instead of just "on input".
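The escaping proposed above can be sketched in a few lines. This is only an illustration in Python, not the normalizer's code (the project itself is not written in Python); the names `escape_non_ascii` and `describe_input` are hypothetical, and the `\uXXXX` / `\UXXXXXXXX` forms are one possible reading of "something like \u045e".

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a \\uXXXX or \\UXXXXXXXX escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # ASCII is kept as is
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")     # characters in the Basic Multilingual Plane
        else:
            out.append(f"\\U{cp:08x}")     # characters beyond the BMP
    return "".join(out)


def describe_input(source: str) -> str:
    """Hypothetical rendering of the 'on input' part of the error message."""
    return "on input (non-ASCII symbols escaped)\n" + escape_non_ascii(source)


if __name__ == "__main__":
    print(describe_input("α0"))
```

With this sketch, a reader sees `\u03b10` for α0 and a plain `?0` for a literal ?0, so the two cases are no longer confused.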

yegor256 commented 3 days ago

@deemp please, help

deemp commented 3 days ago

@yegor256, run export LC_ALL=C.UTF-8 before running this command.
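To illustrate why the locale matters, here is a small Python sketch of a lossy ASCII conversion. Whether the normalizer's runtime replaces characters in exactly this way is an assumption; the sketch only shows how an ASCII-only output encoding with replacement makes α0 and a literal ?0 indistinguishable. The function name `to_ascii_lossy` is hypothetical.

```python
def to_ascii_lossy(text: str) -> str:
    """Encode to ASCII, substituting '?' for every character that does not fit."""
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_lossy("α0"))  # the non-ASCII letter is replaced by '?'
    print(to_ascii_lossy("?0"))  # a literal '?' passes through unchanged
```

Both calls produce the same text, which is the ambiguity described in the first comment.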

yegor256 commented 3 days ago

@deemp yes, we know the workaround, but please make the output escaped :)

deemp commented 3 days ago

@yegor256,

  1. Does the normalizer render Unicode correctly in error messages with export ...?
  2. Does the normalizer render Unicode correctly in normal output without export ...? If it doesn't, then export ... is not a workaround, but a necessity. We can state it explicitly on the command pages of the docs site.

yegor256 commented 3 days ago

@deemp yes, it works with the export, but I kindly ask you to implement this escaping feature because it will help users debug much faster

deemp commented 2 days ago

@yegor256, can you suggest how to distinguish when to print Unicode and when to escape?

I thought about:

yegor256 commented 2 days ago

@deemp just always escape when you print this error message. Why not escape? It's an error message, it won't be parsed by any software, it will always be read by humans. Replace all 0x7f+ symbols with their mnemos, that's it.

deemp commented 2 days ago

@yegor256, it's inconvenient to read numbers when you can read Unicode characters. If the locale is set correctly, users may prefer to see Unicode.

yegor256 commented 2 days ago

@deemp I'm the primary user of this app :) I'm telling you, as a user, that error messages must be as non-ambiguous as possible. Unicode is more ambiguous than ASCII.

deemp commented 2 days ago

> I'm the primary user of this app

@yegor256, OK, I'll keep that in mind :) Let's escape.

deemp commented 2 days ago

@yegor256, here are representations of errors.

  1. With escaping:

    syntax error at line 1, column 1 before `\961'
    on input
    \961 \8614 \10214 t \8614 \958.\961.k.\961.t

  2. With correctly set locale and without escaping:

    syntax error at line 1, column 1 before `ρ'
    on input
    ρ ↦ ⟦ t ↦ ξ.ρ.k.ρ.t

Do you really prefer the option with escaping?

yegor256 commented 2 days ago

@deemp can you do both? show the original one and then print the escaped one?

deemp commented 2 days ago

> the original one

@yegor256, which one do you mean?

yegor256 commented 2 days ago

@deemp how many do you have? :) print them both

deemp commented 2 days ago

@yegor256, see https://github.com/objectionary/eo-phi-normalizer/issues/572#issuecomment-2508009676

yegor256 commented 1 day ago

@deemp please, print both outputs in case of error: 1) not escaped, and 2) escaped
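The request to print both outputs can be sketched as follows. This is a hedged Python illustration, not the normalizer's implementation: `escape_decimal` and `render_syntax_error` are hypothetical names, the decimal `\961` style is copied from the escaped example earlier in the thread, and the layout of the combined message (raw input first, escaped input second, with the label suggested in the first comment) is an assumption.

```python
def escape_decimal(text: str) -> str:
    """Replace every non-ASCII character with a backslash and its decimal code point."""
    return "".join(ch if ord(ch) < 0x80 else f"\\{ord(ch)}" for ch in text)


def render_syntax_error(line: int, column: int, token: str, source: str) -> str:
    """Build an error message that shows the input both raw and escaped."""
    return "\n".join([
        f"syntax error at line {line}, column {column} before `{token}'",
        "on input",
        source,                                   # 1) not escaped
        "on input (non-ASCII symbols escaped)",
        escape_decimal(source),                   # 2) escaped
    ])


if __name__ == "__main__":
    print(render_syntax_error(1, 1, "ρ", "ρ ↦ ⟦ t ↦ ξ.ρ.k.ρ.t"))
```

One caveat of this simple decimal form: an escaped character followed by a digit becomes ambiguous (α0 would turn into `\9450`), so a real implementation would need a separator or a fixed-width escape.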