Closed GoogleCodeExporter closed 9 years ago
ESC is '\033' (that's OCTAL) 3 * 8 + 3 == 29
or '\x1B' ... 1 * 16 + 11 == 29
Original comment by sjmac...@lexicon.net
on 31 Mar 2010 at 1:35
@sjmac...@lexicon.net
Your comments are a bit unclear to me. First of all, octal 33 = 27 decimal
(3 * 8 + 3 = 27, and 1 * 16 + 11 = 27 as well), not 29.
Second, is this now a bug or not?
In my own tests I also see that it goes wrong with the number 3.
I have a file that contains only the number 3, and detection fails. With the number 4 it works.
So there is something wrong. This is the code where the 'non-ascii' value is detected
(UniversalDetector.cs:151-155):
    if (inputState == InputState.PureASCII &&
        (buf[i] == 0x33 || (buf[i] == 0x7B && lastChar == 0x7E))) {
        // found escape character or HZ "~{"
        inputState = InputState.EscASCII;
    }
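The buffer is compared against hex 33, not octal 33. The character '3' has decimal code 51,
which is hex 33, so a file containing a '3' is wrongly treated as containing an escape character.
The check should therefore be buf[i] == 0x1B.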
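The base conversions above can be checked with a few lines of C#. This is only a small sketch for verifying the arithmetic; the class name EscapeValueCheck and the helper Parse are made up for illustration and are not part of the library.

```csharp
using System;

// Hypothetical helper class, not part of the detector library.
public static class EscapeValueCheck
{
    // Parses a digit string in the given base (Convert.ToInt32 accepts 2, 8, 10 or 16).
    public static int Parse(string digits, int fromBase)
    {
        return Convert.ToInt32(digits, fromBase);
    }

    public static void Main()
    {
        // Octal 33 and hex 1B are the same value: decimal 27, the ASCII ESC character.
        Console.WriteLine(Parse("33", 8));   // 27
        Console.WriteLine(Parse("1B", 16));  // 27

        // Hex 33 is decimal 51, which is the ASCII digit '3'.
        Console.WriteLine(Parse("33", 16));  // 51
        Console.WriteLine((char)0x33);       // 3

        // The digit '4' is hex 34, so it never matched the mistyped constant.
        Console.WriteLine((int)'4' == 0x34); // True
    }
}
```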
Bottom line: the code should be changed to:
    if (inputState == InputState.PureASCII &&
        (buf[i] == 0x1B || (buf[i] == 0x7B && lastChar == 0x7E))) {
        // found escape character (hex 1B) or HZ "~{"
        // JV: fix. Was buf[i] == 0x33, which is number 3
        inputState = InputState.EscASCII;
    }
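To see the effect of the two constants side by side, here is a simplified, self-contained sketch of the check. It is not the real UniversalDetector code: the class EscapeScanSketch, the method Scan, the escapeByte parameter, the two-value enum and the way lastChar is updated are all assumptions made for the example.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the quoted check; not the library's actual class.
public static class EscapeScanSketch
{
    // Reduced to the two states that appear in the quoted snippet.
    public enum InputState { PureASCII, EscASCII }

    // escapeByte is a parameter only so the old (0x33) and fixed (0x1B) constants can be compared.
    public static InputState Scan(byte[] buf, byte escapeByte)
    {
        InputState inputState = InputState.PureASCII;
        byte lastChar = 0; // assumption: previous byte, tracked to spot the HZ "~{" sequence

        for (int i = 0; i < buf.Length; i++)
        {
            if (inputState == InputState.PureASCII &&
                (buf[i] == escapeByte || (buf[i] == 0x7B && lastChar == 0x7E)))
            {
                // found escape character or HZ "~{"
                inputState = InputState.EscASCII;
            }
            lastChar = buf[i];
        }
        return inputState;
    }

    public static void Main()
    {
        byte[] three = Encoding.ASCII.GetBytes("3");
        Console.WriteLine(Scan(three, 0x33)); // EscASCII: the typo misreads '3' as an escape
        Console.WriteLine(Scan(three, 0x1B)); // PureASCII: with the fix, '3' is plain ASCII
    }
}
```

With the old constant a file containing only "3" leaves the pure-ASCII state, while "4" does not, which matches the behaviour described above.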
Original comment by j.verdur...@2bmore.nl
on 21 Apr 2010 at 11:02
Yes, it is clearly a typo. The escape character is 0x1B. I'll patch it as soon as I'm
up and running, hopefully next week.
Thanks!
Original comment by rudi.pet...@gmail.com
on 22 Apr 2010 at 5:26
Committed. Thanks.
Original comment by rudi.pet...@gmail.com
on 14 May 2010 at 11:18
Original issue reported on code.google.com by
rbhatt%c...@gtempaccount.com
on 2 Dec 2009 at 5:29