ibondre / ude

Automatically exported from code.google.com/p/ude

Detection fails on particular, simple ANSI file #7

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Save an ANSI file containing the text "CONFIG: main 30000000"
2. Run the library and/or exe on it

What is the expected output? What do you see instead?

I expect ANSI to be detected. Instead, the library returns null for the charset, and the exe reports "detection failed".

What version of the product are you using? On what operating system?

Not stated.

Please provide any additional information below.

I don't know whether this is how the library is intended to work, but I think it
would be more useful to report ANSI when every character fits into ANSI, or at
least to support this behavior as an option.

Original issue reported on code.google.com by john.bu...@gmail.com on 14 Sep 2014 at 4:59
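
The optional behavior requested above could be approximated on the caller's side. The following is only a sketch: `CharsetFallback`, `IsPureAscii`, and `WithAsciiFallback` are hypothetical names that are not part of ude, and the `"ASCII"` label is an assumption about what a caller would want reported. It takes whatever charset name the detector produced (null on failure) and falls back only when every byte is 7-bit.

```csharp
using System;

// Hypothetical caller-side helper; nothing here is part of the ude library.
static class CharsetFallback
{
    // True when every byte is 7-bit, i.e. the data is plain ASCII.
    public static bool IsPureAscii(byte[] data)
    {
        foreach (byte b in data)
        {
            if (b >= 0x80) return false;
        }
        return true;
    }

    // detected: the charset name the detector reported, or null if it failed.
    // Returns the detector's answer when there is one; otherwise "ASCII"
    // (an assumed label) for 7-bit data, or null if the data is not ASCII.
    public static string WithAsciiFallback(string detected, byte[] data)
    {
        if (detected != null) return detected;
        return IsPureAscii(data) ? "ASCII" : null;
    }
}
```

For the file in this report, `CharsetFallback.WithAsciiFallback(null, System.Text.Encoding.ASCII.GetBytes("CONFIG: main 30000000"))` returns `"ASCII"`. The fallback works around the failed detection but does not fix its cause.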

GoogleCodeExporter commented 9 years ago
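There is a bug in UniversalDetector.cs (around line 152). The escape check
compares against 0x33, which is the ASCII digit "3", rather than the ESC
character (27 decimal, 33 octal, 0x1B hex). Any pure-ASCII input containing a
"3" is therefore treated as if it contained an escape character, and the file
from the report contains one:
       } else { 
           if (inputState == InputState.PureASCII &&
               (buf[i] == 0x33 || (buf[i] == 0x7B && lastChar == 0x7E))) {
                ^^^^^^^^^^^^^^
                ESC  = 27 (decimal) = 33 (octal) = 0x1B (hex)
                0x33 = 51 (decimal) = "3" (ASCII)

                // found escape character or HZ "~{"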

Original comment by mkk...@gmail.com on 5 Dec 2014 at 2:46
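
The difference between the two constants can be checked in isolation. This is a minimal sketch rather than the library's code: `EscapeCheck` and both method names are hypothetical, and the "fixed" variant assumes the intended constant is ESC (0x1B), as the comment above suggests. The buggy predicate copies the condition quoted from UniversalDetector.cs.

```csharp
using System;

// Hypothetical stand-alone predicates; not taken from the ude source.
static class EscapeCheck
{
    const byte Esc = 0x1B;       // 27 decimal, 033 octal
    const byte OpenBrace = 0x7B; // '{'
    const byte Tilde = 0x7E;     // '~'

    // Condition as quoted in the comment: 0x33 is the digit '3', not ESC.
    public static bool BuggyIsEscapeStart(byte current, byte last)
    {
        return current == 0x33 || (current == OpenBrace && last == Tilde);
    }

    // Assumed intent: match ESC, or the HZ sequence "~{".
    public static bool FixedIsEscapeStart(byte current, byte last)
    {
        return current == Esc || (current == OpenBrace && last == Tilde);
    }
}
```

With these definitions, `BuggyIsEscapeStart((byte)'3', (byte)' ')` is true while `FixedIsEscapeStart((byte)'3', (byte)' ')` is false. Both return true for `'{'` preceded by `'~'`, and only the fixed version returns true for the byte 0x1B.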