Unidata / thredds

THREDDS Data Server v4.6
https://www.unidata.ucar.edu/software/tds/v4.6/index.html

NCStream endianness incorrect #442

Open · dopplershift opened this issue 8 years ago

dopplershift commented 8 years ago

I don't think we're encoding endianness correctly on ncstream. (For the following, all line numbers are from 4.6.) For starters, we say (NcStreamWriter.java:105):

    ByteOrder bo = ByteOrder.nativeOrder(); // reader makes right

and when we write the protos to the wire, this becomes (NcStream.java:225):

    builder.setBigend(bo == ByteOrder.BIG_ENDIAN);

So running on my machine, NcStream sends bigend: false (in the protobuf) across the wire.

However, writing the data (for a 64-bit long) ends up going down to (DataOutputStream.java:215, a JDK class):

    /**
     * Writes a <code>long</code> to the underlying output stream as eight
     * bytes, high byte first. If no exception is thrown, the counter
     * <code>written</code> is incremented by <code>8</code>.
     *
     * @param      v   a <code>long</code> to be written.
     * @exception  IOException  if an I/O error occurs.
     * @see        java.io.FilterOutputStream#out
     */
    public final void writeLong(long v) throws IOException {
        writeBuffer[0] = (byte)(v >>> 56);
        writeBuffer[1] = (byte)(v >>> 48);
        writeBuffer[2] = (byte)(v >>> 40);
        writeBuffer[3] = (byte)(v >>> 32);
        writeBuffer[4] = (byte)(v >>> 24);
        writeBuffer[5] = (byte)(v >>> 16);
        writeBuffer[6] = (byte)(v >>>  8);
        writeBuffer[7] = (byte)(v >>>  0);
        out.write(writeBuffer, 0, 8);
        incCount(8);
    }

That's not native encoding; it's also not little-endian, which is what bigend: false would have me believe. So do we fix the code to correctly send bigend: true (which, at least for the current Java implementation, would always be true), or do we redefine the spec to always be big-endian and say to ignore the flag?
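For a concrete check outside the TDS code, here's a minimal standalone demo (the class name is mine, not TDS's) showing that DataOutputStream emits big-endian bytes regardless of what ByteOrder.nativeOrder() reports:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeLong(1L); // writeLong is documented as "high byte first"
        byte[] bytes = bos.toByteArray();

        // On a little-endian machine nativeOrder() reports LITTLE_ENDIAN,
        // but the wire bytes are still big-endian: the 1 lands in the last byte.
        System.out.println("native order: " + ByteOrder.nativeOrder());
        System.out.println("first byte:   " + bytes[0]); // prints 0
        System.out.println("last byte:    " + bytes[7]); // prints 1
    }
}
```

So sending bigend: false while writing through DataOutputStream mislabels the stream on every little-endian host.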

dopplershift commented 8 years ago

attn @JohnLCaron

dopplershift commented 8 years ago

And for reference, the first 32 bytes across the wire for an int64 variable with [0, 1, 2, 3] is:

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3
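Decoding those 32 bytes both ways (a standalone sketch, not TDS code) confirms they are big-endian int64s; a reader that trusted the advertised bigend: false would get each nonzero value scaled by 2^56:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class WireDump {
    public static void main(String[] args) {
        // The 32 bytes observed on the wire for the int64 variable [0, 1, 2, 3]
        byte[] wire = new byte[32];
        wire[15] = 1;
        wire[23] = 2;
        wire[31] = 3;

        ByteBuffer be = ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN);
        System.out.printf("big-endian:    %d %d %d %d%n",
                be.getLong(0), be.getLong(8), be.getLong(16), be.getLong(24));
        // prints: big-endian:    0 1 2 3

        ByteBuffer le = ByteBuffer.wrap(wire).order(ByteOrder.LITTLE_ENDIAN);
        System.out.printf("little-endian: %d %d %d %d%n",
                le.getLong(0), le.getLong(8), le.getLong(16), le.getLong(24));
        // each nonzero value comes back multiplied by 2^56 (1L << 56, 2L << 56, ...)
    }
}
```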

dopplershift commented 8 years ago

It looks like this only applies to simple data variables. The StructureData byte stream is coming back little endian.

JohnLCaron commented 8 years ago

I'm wondering if this is the problem:

    public DataOutputStream setupStream(OutputStream out, int size)
            throws IOException
    {
        switch (type) {
            // For compression (currently deflate) we compress the data, then
            // will write the block size, and then data, when the stream is closed.
            case DEFLATE:
                // limit level to range [-1, 9], where -1 is default deflate setting.
                int level = Math.min(Math.max((Integer)compressInfo, -1), 9);
                int bufferSize = Math.min(size / 2, 512 * 1024 * 1024);
                return new NcStreamCompressedOutputStream(out, bufferSize, level);

            default:
                System.out.printf(" Unknown compression type %s. Defaulting to none.%n", type);

            // In the case of no compression, go ahead and write the block
            // size so that the stream is ready for data
            case NONE:
                DataOutputStream dos = new DataOutputStream(out);
                NcStream.writeVInt(dos, size);
                return dos;
        }
    }

should be

    public OutputStream setupStream(OutputStream out, int size)
            throws IOException
    {
        switch (type) {
            // For compression (currently deflate) we compress the data, then
            // will write the block size, and then data, when the stream is closed.
            case DEFLATE:
                // limit level to range [-1, 9], where -1 is default deflate setting.
                int level = Math.min(Math.max((Integer)compressInfo, -1), 9);
                int bufferSize = Math.min(size / 2, 512 * 1024 * 1024);
                return new NcStreamCompressedOutputStream(out, bufferSize, level);

            default:
                System.out.printf(" Unknown compression type %s. Defaulting to none.%n", type);

            // In the case of no compression, go ahead and write the block
            // size so that the stream is ready for data
            case NONE:
                NcStream.writeVInt(out, size);
                return out;
        }
    }

and

NcStreamCompressedOutputStream extends OutputStream ??
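Side note on why dropping DataOutputStream is safe for the size header: writeVInt emits a protobuf-style varint one byte at a time, so a plain OutputStream is enough and no byte-order question arises. A sketch of the general technique (assuming, not quoting, NcStream's implementation):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class VIntSketch {
    // Protobuf-style unsigned varint: 7 payload bits per byte,
    // high bit set on every byte except the last.
    public static int writeVInt(OutputStream out, int value) throws IOException {
        int count = 1;
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits + continuation bit
            value >>>= 7;
            count++;
        }
        out.write(value); // final byte, continuation bit clear
        return count;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeVInt(bos, 300);
        for (byte b : bos.toByteArray())
            System.out.printf("%02X ", b); // prints AC 02
        System.out.println();
    }
}
```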

dopplershift commented 8 years ago

Close, and you had me believing it. In fact, I was able to eliminate the use of DataOutputStream in NcStreamCompression, but it's still broken. The real culprit is IospHelper.copyToOutputStream() (IospHelper.java:499):

  public static long copyToOutputStream(Array data, OutputStream out) throws java.io.IOException {
    Class classType = data.getElementType();

    DataOutputStream dataOut;
    if (out instanceof DataOutputStream)
      dataOut = (DataOutputStream) out;
    else
      dataOut = new DataOutputStream(out);
...
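One possible shape for a fix, sketched here as an assumption rather than the actual change: have the copy honor an explicit ByteOrder via ByteBuffer instead of unconditionally wrapping the stream in a (big-endian-only) DataOutputStream. The class name and signature below are hypothetical:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianCopySketch {
    // Hypothetical variant of copyToOutputStream for a long[] payload:
    // ByteBuffer honors whatever byte order the caller negotiated, so the
    // bytes on the wire can actually match the bigend flag in the protobuf.
    public static long copyLongs(long[] data, OutputStream out, ByteOrder order)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(8 * data.length).order(order);
        for (long v : data)
            buf.putLong(v);
        out.write(buf.array(), 0, buf.position());
        return buf.position();
    }
}
```

With ByteOrder.nativeOrder() passed in, the stream would finally agree with the bigend flag NcStreamWriter already advertises.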
JohnLCaron commented 8 years ago

Yep, but it's weird that cdmremote has worked up to now.


dopplershift commented 8 years ago

Could it be using DataInputStream when deserializing? :grin:

JohnLCaron commented 8 years ago

Well, at least I was right that we needed another implementation to tease out the bugs...


JohnLCaron commented 8 years ago

Did this get fixed?

dopplershift commented 8 years ago

No. It wasn't clear to me whether modifying copyToOutputStream (to remove the forced use of DataOutputStream) was the correct fix.