PriyaranjanMohapatra / rest-client

Automatically exported from code.google.com/p/rest-client
Apache License 2.0

Defective UTF-8 check in restclient-lib - Util.java (with suggested code change) #83

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
When using rest-client to verify a large result set, the UTF-8 check can 
report the error "File not in supported encoding (UTF-8)" even when the 
response is valid UTF-8. The problem is that Util.inputStream2String() 
decodes each chunk returned by read() on its own, so a chunk boundary can 
fall in the middle of a multi-byte UTF-8 character. The method does not 
account for the fact that a UTF-8 character can span several bytes (up to 4 
under RFC 3629; the original design allowed up to 6). 
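
The failure described above can be reproduced in isolation. The following is a minimal sketch, not code from rest-client: the class name `SplitUtf8Demo` and the helper `decodes` are made up for illustration, and it uses `StandardCharsets` (Java 7+) where the project uses its own `UTF8CHARSET` constant.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of rest-client.
class SplitUtf8Demo {

    /** Returns true if the byte range decodes as complete, valid UTF-8. */
    static boolean decodes(byte[] bytes, int offset, int length) {
        // A fresh decoder reports malformed input by throwing.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            decoder.decode(ByteBuffer.wrap(bytes, offset, length));
            return true;
        } catch (CharacterCodingException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 'a' is 1 byte, U+00E9 is 2 bytes, so the array holds 3 bytes.
        byte[] data = "a\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoding the whole array at once.
        System.out.println(decodes(data, 0, 3));

        // Decoding only the first 2 bytes, as if read() had returned
        // a chunk that ends inside the 2-byte character.
        System.out.println(decodes(data, 0, 2));
    }
}
```

The first call succeeds and the second fails, even though the full input is valid UTF-8. This is the situation inputStream2String() runs into when a chunk happens to end in the middle of a character.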

I suggest the following changes, which I have tested locally: 

Current code in Util.java:

---

    public static String inputStream2String(final InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        byte[] b = new byte[4096];
        CharsetDecoder decoder = UTF8CHARSET.newDecoder();
        for (int n; (n = in.read(b)) != -1;) {
            CharBuffer charBuffer = null;
            try{
                charBuffer = decoder.decode(ByteBuffer.wrap(b, 0, n));
            }
            catch(MalformedInputException ex){
                throw new IOException(
                        "File not in supported encoding (" + ENCODE + ")", ex);
            }
            charBuffer.rewind(); // Bring the buffer's pointer to 0
            out.append(charBuffer.toString());
        }
        return out.toString();
    }

---

This is a quick, working fix for the problem:

---

    private static CharBuffer decodeHelper(byte[] byteArray, int numberOfBytes) throws IOException {
        CharsetDecoder decoder = UTF8CHARSET.newDecoder();
        CharBuffer charBuffer = null;
        try{
            charBuffer = decoder.decode(ByteBuffer.wrap(byteArray, 0, numberOfBytes));
        }
        catch(MalformedInputException ex){
            charBuffer = null;
        }
        return charBuffer;

    }

    public static String inputStream2String(final InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        byte[] b = new byte[4096];
        byte[] savedBytes = new byte[1];
        boolean hasSavedBytes = false;
        CharsetDecoder decoder = UTF8CHARSET.newDecoder();
        for (int n; (n = in.read(b)) != -1;) {
            if(hasSavedBytes) {
                byte[] bTmp = new byte[savedBytes.length+b.length];
                System.arraycopy(savedBytes, 0, bTmp, 0, savedBytes.length);
                System.arraycopy(b, 0, bTmp, savedBytes.length, b.length);
                b = bTmp;
                hasSavedBytes = false;
                n = n+savedBytes.length;
            }

            CharBuffer charBuffer = decodeHelper(b,n);
            if(charBuffer == null){
                int nrOfChars = 0;
                while (charBuffer == null){
                    nrOfChars++;
                    charBuffer = decodeHelper(b,n-nrOfChars);
                    if(nrOfChars > 10 && nrOfChars < n) {
                        try{
                            charBuffer = decoder.decode(ByteBuffer.wrap(b, 0, n));
                        }
                        catch(MalformedInputException ex){
                            throw new IOException(
                                    "File not in supported encoding (" + ENCODE + ")", ex);
                        }
                    }
                }
                savedBytes = new byte[nrOfChars];
                hasSavedBytes = true;
                for(int i=0;i<nrOfChars;i++){
                    savedBytes[i] = b[n-nrOfChars+i];
                }
            }

            charBuffer.rewind(); // Bring the buffer's pointer to 0
            out.append(charBuffer.toString());
        }
        if(hasSavedBytes) {
            try{
                CharBuffer charBuffer = decoder.decode(ByteBuffer.wrap(savedBytes, 0, savedBytes.length));
            }
            catch(MalformedInputException ex){
                throw new IOException(
                        "File not in supported encoding (" + ENCODE + ")", ex);
            }
        }
        return out.toString();
    }

---
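
For comparison, the same goal can probably be reached without manual byte bookkeeping by letting an `InputStreamReader` carry incomplete sequences across reads. This is only a sketch under assumptions, and it is neither the patch above nor the change later committed to the project: the class name `Utf8StreamSketch` is made up, `StandardCharsets` (Java 7+) stands in for `UTF8CHARSET`, and the error message hard-codes "UTF-8" where the original uses `ENCODE`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical class; an alternative sketch, not rest-client code.
class Utf8StreamSketch {

    static String inputStream2String(final InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        // REPORT makes the reader throw on invalid bytes instead of
        // silently substituting replacement characters.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        // The reader buffers bytes internally, so a character split
        // across two underlying reads is decoded once it is complete.
        // The stream is deliberately not closed here; the caller owns it.
        Reader reader = new InputStreamReader(in, decoder);
        StringBuilder out = new StringBuilder();
        char[] chars = new char[4096];
        try {
            for (int n; (n = reader.read(chars)) != -1;) {
                out.append(chars, 0, n);
            }
        } catch (CharacterCodingException ex) {
            throw new IOException("File not in supported encoding (UTF-8)", ex);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "a\u00e9\u20ac"; // 1-, 2- and 3-byte characters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String decoded = inputStream2String(new ByteArrayInputStream(utf8));
        System.out.println(text.equals(decoded));
    }
}
```

The trade-off is that the reader decides how many bytes to pull from the stream at a time, so the 4096-byte array here is a buffer of decoded characters rather than of raw bytes.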

Original issue reported on code.google.com by i...@perres.se on 5 Mar 2009 at 3:23

GoogleCodeExporter commented 8 years ago
Thanks for the recommendation, I will do some basic verification and add it.

Original comment by subwiz on 6 Mar 2009 at 11:17

GoogleCodeExporter commented 8 years ago
I did some basic verification; the Maven test cases pass. The code has been 
modified and committed in revision 431.

Original comment by velraja...@gmail.com on 7 Mar 2009 at 4:34

GoogleCodeExporter commented 8 years ago
@velrajan.r Thanks for fixing it :-)

Original comment by subwiz on 8 Mar 2009 at 4:22