zeromq / clrzmq4

ZeroMQ C# namespace (.NET and mono, Windows, Linux and MacOSX, x86 and amd64)
GNU Lesser General Public License v3.0
241 stars 112 forks source link

unicode string decoded error #56

Closed goldenbull closed 7 years ago

goldenbull commented 8 years ago

when reading a unicode string from recieved message, some data is lost. this piece code will show the problem:

private static void Main(string[] args)
{
    var push = new ZSocket(ZContext.Current, ZSocketType.PUSH);
    push.Bind("tcp://127.0.0.1:12345");
    var pull = new ZSocket(ZContext.Current, ZSocketType.PULL);
    pull.Connect("tcp://127.0.0.1:12345");

    var str = "中文字符串";
    Console.WriteLine("str send: {0}", str);
    using (var msg = new ZMessage(new[] {new ZFrame(str),}))
        push.SendMessage(msg);
    using (var msg = pull.ReceiveMessage())
        Console.WriteLine("str recv: {0}", msg[0].ReadString());
}

output:
str send: 中文字符串
str recv: 中?
goldenbull commented 8 years ago

ZFrame is a Stream, string reading/writing functions are in StreamReader/StreamWriter class, so I think maybe it's not necessary to encode/decode string byte-by-byte again.

metadings commented 8 years ago

Awesome, you are actually using Unicode! 16bit or 32bit?

You should use

using System;
using System.Text;
using ZeroMQ;

private static void Main(string[] args)
{
    var push = new ZSocket(ZSocketType.PUSH);
    push.Bind("tcp://127.0.0.1:12345");
    var pull = new ZSocket(ZSocketType.PULL);
    pull.Connect("tcp://127.0.0.1:12345");

    string str = "中文字符串";
    Console.WriteLine("str send: {0}", str);
    push.SendFrame(new ZFrame(str, Encoding.Unicode));

    using (ZFrame frame = pull.ReceiveFrame())
        Console.WriteLine("str recv: {0}", frame.ReadString(Encoding.Unicode));
}

This is because ZContext.Encoding is Encoding.UTF8... Please, try this one!

The problem with StreamReader/StreamWriter classes is, they are closing the stream after their use - and I don't want a ZFrame being closed before I do so...

goldenbull commented 8 years ago

Encoding.Unicode is fine. StreamReader and StreamWriter constructor has a leaveOpen parameter since .net 4.5, maybe that will help: https://msdn.microsoft.com/en-us/library/gg712853(v=vs.110).aspx

metadings commented 8 years ago

".net 4.5" - I don't want to say they could have done this in ".net 2.0"

goldenbull commented 8 years ago

For .net 2.0, we can derive a custom class from StreamReader/StreamWriter and override the Dispose method not to close underlying stream.

metadings commented 8 years ago

Yes, this is what ZFrame is basically all about - Dismiss, Dispose and not going to close the stream, when someone is reading from or writing on it.

I also thought about a ZFrameReader and ZFrameWriter. But I don't like the creation of additional objects, just to read and to write. This is why I made these functions directly into the stream, the ZFrame.

goldenbull commented 8 years ago

Agree. Still I think it's better if we can use StreamReader/Writer with some simple work-around, and not to re-invent the wheels :smile:

metadings commented 8 years ago

However, you can do so already... You just need someone initializing your ZFrame on the length of the string.

goldenbull commented 8 years ago

In fact I think we do not need to support net 2.0 any more

metadings commented 8 years ago

Not .NET 2.0, but I do like .NET 4.0 because it runs on WinXP. .NET 4.5 doesn't.

goldenbull commented 8 years ago

So what's your plan of this bug? If users are required to use Encoding.Unicode, the API should not have an encoding parameter.

goldenbull commented 8 years ago

I can try to make a PR of a custom derived StreamReader/Writer if you are OK with this kind of work-around

metadings commented 8 years ago

This is actually a really hard PR, because you need to write a new ZFrame constructor...

The Encoding is there to actually read/write a string on a ZFrame. This may be removed, if you have a ZFrameReader and ZFrameWriter.

You may try...

goldenbull commented 8 years ago

I will try to make it as simple as possible

metadings commented 8 years ago

The new ZFrame constructor will be you biggest problem, because you need someone initializing the ZFrame to a specific Length. The other changes are in ReadString, ReadLine and WriteString, WriteLine... And you need to make ZFrameReader/Writer classes.

Let's see, now I'm curious about your PR :-) (Please try using TAB instead of SPACEs for intendation!)

sigiesec commented 7 years ago

I am a bit lost here... Was this really a bug? Or was clrzmq4 used in a wrong way? If it was a bug, was is solved by #57? Is there any reason to keep this issue open?

goldenbull commented 7 years ago

IMO, if you do not care much about the performance hit of memory copy, you can just treat ZFrame as a byte stream.

sigiesec commented 7 years ago

@goldenbull Sure you can do this.

Is there any reason to keep this issue open from your point of view?