Support for UTF-8 encoding when performing the "open definition" operation.

GoClipse / goclipse

Eclipse IDE for the Go programming language:

http://goclipse.github.io/

Eclipse Public License 1.0


Support for UTF-8 encoding when performing the "open definition" operation. #133

Closed yongseoklee closed 9 years ago

yongseoklee commented 9 years ago

Eclipse PDE calculates offsets by counting characters, while the Go oracle interprets the offset in its pos flag as a byte offset.

So the offset has to be converted when a document contains multi-byte UTF-8 characters.

see: https://github.com/GoClipse/goclipse/issues/132
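
To illustrate the mismatch described above, here is a minimal sketch (not the code from this pull request). The class and method names are hypothetical, and it assumes the source is UTF-8 and that the editor offset counts Java `char`s (UTF-16 code units):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of goclipse.
class OffsetMismatchDemo {

    // Number of UTF-8 bytes that precede the given char (UTF-16 code unit) offset.
    static int utf8ByteOffset(String source, int charOffset) {
        return source.substring(0, charOffset).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // A Go snippet whose comment holds three Hangul syllables,
        // written as escapes; each one takes 3 bytes in UTF-8.
        String source = "// \uD55C\uAD6D\uC5B4\nfunc main() {}";
        int charOffset = source.indexOf("main");

        System.out.println("char offset: " + charOffset);                         // 12
        System.out.println("byte offset: " + utf8ByteOffset(source, charOffset)); // 18
    }
}
```

Passing the char offset (12) where a byte offset (18) is expected would point six bytes too early, which is why the lookup lands on the wrong identifier in files with non-ASCII text.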

bruno-medeiros commented 9 years ago

Hum, you're right. gocode works with utf offsets, but I just double-checked in the Oracle documentation (http://golang.org/s/oracle-user-manual) and the position argument is indeed byte-offset :/

bruno-medeiros commented 9 years ago

I'm merging this in, but adding some changes on top:

First, this assumes the encoding is UTF-8, which is incorrect. It can be a different encoding, so that has to be determined.
Also, it seems like a simple call to encode(), like this:

            CharBuffer src = CharBuffer.wrap(source, 0, charOffset);
            return encoder.encode(src).limit();

is enough to determine the byte offset, so there is no need to mess with the ByteBuffer, flush(), etc., ourselves.
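
For reference, the fragment above can be expanded into a self-contained sketch. The class name, method name, and the choice to rethrow encoding failures as `IllegalArgumentException` are assumptions made for illustration, not the code that was merged:

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper class; not part of goclipse.
class ByteOffsetSketch {

    // Converts a char offset into a byte offset for the given charset
    // by encoding only the text that precedes the offset.
    static int byteOffset(CharSequence source, int charOffset, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        CharBuffer src = CharBuffer.wrap(source, 0, charOffset);
        try {
            // encode() returns a buffer with position 0, so its limit
            // equals the number of bytes produced.
            return encoder.encode(src).limit();
        } catch (CharacterCodingException e) {
            // Raised for input the charset cannot encode, for example
            // an offset that splits a surrogate pair.
            throw new IllegalArgumentException("Cannot encode prefix up to offset " + charOffset, e);
        }
    }
}
```

For example, `ByteOffsetSketch.byteOffset("h\u00E9llo", 2, StandardCharsets.UTF_8)` yields 3, because the accented letter takes two bytes in UTF-8, while the same call with `StandardCharsets.ISO_8859_1` yields 2.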

bruno-medeiros commented 9 years ago

Hum, actually, it seems like the Go toolchain only accepts the UTF-8 encoding? So trying to handle other encodings isn't really an issue.

yongseoklee commented 9 years ago

Yes, the Go toolchain only accepts the UTF-8 encoding. See the Go language specification: the first sentence of the "Source code representation" section says "Source code is Unicode text encoded in UTF-8."

https://golang.org/ref/spec#Source_code_representation