Closed Bhashinee closed 2 years ago
@Bhashinee The proposal has covered all the success cases in both listener and client.
readMessage
of the client?json|xml
.string|byte[]|xml|json|map<json>|byte[]|record {| anydata...; |}|record {| anydata...; |}[]
Also could you explain the reason for not add the support for readMessage of the client?
@chamil321 readMessage
basically returns error|byte[]|string
which means it can read both text and binary messages. Purpose of this API is to allow writing something like gateways. I think introducing data binding for such API makes no sense. Also, as mentioned earlier, we can only support binding for sub-protocols of text messages but this has both types of messages.
- I am thinking if we really need something as the below. Because at the end of the day we only can serialize either using JSON or XML. Therefore, all we need is
json|xml
.string|byte[]|xml|json|map<json>|byte[]|record {| anydata...; |}|record {| anydata...; |}[]
- I think we only need to support binding for text websocket messages. Because current serializations (JSON/XML) are text based.
Updated the proposal.
- What is the sub-protocol we set? and is there a way to override that?
We don't set a subprotocol when we send messages. The subprotocol header(which is optional) is sent with the initial HTTP request and not sent with the WebSocket frames after the connection is upgraded. So it is not possible to set any sub-protocols at the time we send data depending on the data binding. It is configurable as a ClientConfiguration(https://lib.ballerina.io/ballerina/websocket/2.2.1/records/ClientConfiguration). So it is up to the user to configure it and we will not internally override or set it based on the data binding.
remote isolated function writeTextMessage(string|xml|json data) returns Error?
We don't need string here as it part of json
remote isolated function writeTextMessage(string|xml|json data) returns Error?
We don't need string here as it part of
json
Updated the proposal.
LGTM!
@sameerajayasoma do let us know if you have any feedback.
I assume that the following works with this proposal.
Person p = check readTextMessage();
IMO, it makes sense to have only the readMessage()
and writeMessage()
methods in the client. Do we really need other write and read functions in the client? Just a thought.
remote function readMessage(typedesc<json|xml> t = <>) returns t | Error
remote function writeMessage(json|xml data) returns error?
I assume that the following works with this proposal.
Person p = check readTextMessage();
Yes, this works.
IMO, it makes sense to have only the
readMessage()
andwriteMessage()
methods in the client. Do we really need other write and read functions in the client? Just a thought.remote function readMessage(typedesc<json|xml> t = <>) returns t | Error remote function writeMessage(json|xml data) returns error?
The WebSocket protocol has this concept of binary and text frames when it comes to sending/receiving data. When we designed, we introduced those 4 separate APIs with the intention to give a clear sense of what particular data frame type is being used for the communication at the API level.
Even though we decided to update the writeTextMessage
function as follows,
remote isolated function writeTextMessage(xml|json data) returns Error?
when implementing came across an issue when working with open records as follows,
public type Coord record {
int x;
int y;
};
Coord recordVal = {x: 1, y: 2};
check wsClient->writeTextMessage(recordVal);
Above results in an error,
incompatible types: expected '(xml|json)', found 'ballerina/websocket:2.2.2:Coord'
Making the Coord
a closed record works. Even the record
type is a subtype of json
, it accepts only json
compatible ones. So basically, we can pass records that are closed or open by JSON.
working
record {|
int a;
|}
record {|
int a;
json...;
|}
not working
record {
int a;
}
so in order to support open records as we do in the HTTP module, we will have to change the writeTextMessage
API similar to the following,
remote isolated function writeTextMessage(xml|json|record {}|record {}[] data) returns Error? {
Also whatever the message that is given via this API has to be internally converted to a string
when sending the message. So when a user wants to send a string value, there is no way of differentiating whether the value is given as a string
or json
. If we use data.toJsonString();
conversion it will add additional double quotes to the string
value. If we use data.toString();
function it will not correctly convert the json
message into a string
.
Ex:
import ballerina/io;
public function main() {
string s = "hello";
io:println(s); // `hello`
io:println(s.toJsonString()); // `"hello"`
io:println(s.fromJsonString()); // error `unrecognized token 'hello'`
}
One option is to do an if
check for string
internally and skip the toJsonString()
conversion.
The problem with the above approach is, when the user wants to send a json
as a string
, it is not possible to differentiate that internally.
@shafreenAnfar WDYT?
We decided to use anydata data
instead of xml|json data
for both readTextMessage
and writeTextMessage
.
Also, I think it is good to have section about Serialization and Deserialization explaining how it works.
The WebSocket protocol has this concept of binary and text frames when it comes to sending/receiving data. When we designed, we introduced those 4 separate APIs with the intention to give a clear sense of what particular data frame type is being used for the communication at the API level.
Yeah, I think APIs should reflect that otherwise user would lose a key part of information when working with Websocket.
Also, I think it is good to have section about Serialization and Deserialization explaining how it works.
Updated the proposal.
What happens if I do this?
byte[] bytes = wsClient -> readTextMessage();
and this?
byte[] bytes = [...];
wsClient -> writeTextMessage(bytes);
What happens if I do this?
byte[] bytes = wsClient -> readTextMessage();
and this?
byte[] bytes = [...]; wsClient -> writeTextMessage(bytes);
All the types except the string
and xml
are serialized and deserialized as json
.
byte[] bytes = [...];
wsClient -> writeTextMessage(bytes);
In the above, the byte array will be converted to a json
, then to the string
(using toJsonString()
) representation of that and sent as a text frame over the network.
When reading also, it is deserialized as a json
and converted to a byte[]
.
ballerina byte[] bytes = wsClient -> readTextMessage();
This part is explained in the Message Serialization and Deserialization section of the updated proposal.
I’ve been thinking about @sameerajayasoma's and @chamil321’s suggestions on readMessage
API. Maybe that is the right thing to do in the context of data binding.
We had this separation of text and binary messages from day one (including in Ballerina 1.x). However, the addition of data binding to the existing API may not be the right thing to do. Imagine we have data binding support for both readTextMessage
and readBinaryMessage
. Then we would also need to support writing out anydata
for writeTextMessage
and writeBinaryMessage
.
However, when we look at anydata
apart from string
, json
and xml
, sending out the rest of the anydata
in the form of UTF-8 encoded text frames is ineffective. Ideally they must be serialized into binary form using protobuf, avro, etc, and sent out using binary frames.
Likewise, sending out string
, json
and xml
using binary frames is also incorrect as WebSockets provide a separate data frame type for sending out text based messages.
Therefore, I think as suggested by others what we need is writeMessage
which can select the relevant data frame type based on the input. The functionality of the writeMessage
would be as follows.
anydata
- this is based on the serializer. If the serializer is text based, write out using text frames. If the serializer is binary based, write out using binary frames. This means to match writeMessage
we need a readMessage
which does the opposite of the above. Even though writeMessage
and readMessage
APIs don't explicitly indicate the data frame type, it is okay because it is intuitive and users don't really have to worry about it. If they really do, they need to read the API docs.
Note At the moment users cannot configure or engage their own serializer/deserializer. But we can support this as the next step. The default serializer/deserializer would be text based for now. Ideally it should be binary based.
@sameerajayasoma @chamil321 @Bhashinee any thoughts ?
Looks great. This is what I had in mind when writing this comment https://github.com/ballerina-platform/ballerina-standard-library/issues/2761#issuecomment-1068658443, but you've done an excellent job of explaining it :)
I noticed that we already have a readMessage()
method with the return type string|byte[]|Error
. Changing its return type to anydata
would be a breaking change. What are your thoughts on this?
Looks great. This is what I had in mind when writing this comment #2761 (comment), but you've done an excellent job of explaining it :)
I noticed that we already have a
readMessage()
method with the return typestring|byte[]|Error
. Changing its return type toanydata
would be a breaking change. What are your thoughts on this?
We are changing the readMessage()
API as follows,
remote isolated function readMessage(typedesc<anydata> targetType = <>) returns targetType|Error = @java:Method {
So when we call the API like this,
string|byte[]|Error msg = wsClient->readMessage();
it will not give any errors because string|byte[]
is a subtype of anydata
.
To incorporate the changes, with the serializer/deserializer which will be engaged in the future we initially thought of having an if/else ladder similar to the following,
if data is string {
send as text data;
} else if data is XML {
convert to string;
send as text data;
} else if data is JSON {
convert to string;
send as text data;
} else {
engage user defined serializer;
send as binary data;
}
One of the intentions was to serialize data such as int
, boolean
etc to binary frames. But as in Ballerina those types also fall into the json
data type. There is no direct way of differentiating them from the actual json
data. So we will have to go with an updated if/else ladder similar to the following.
if data is string {
send as text data;
} else if data is XML {
convert to string;
send as text data;
} else if data is byte[] {
send as binary data;
} else {
engage user defined serializer;
send as serialized data;
}
Assume a user has engaged a binary serializer, json
data will also be binary based. If the user wants to send them as text
data, they will have to convert them to string
and pass it to the API.
For now, until the ser/des package is developed and used in standard libraries, default serialization would be text-based.
Known Issues: As of now, since there are no ser/des being engaged in the implementation, the default serialization would be text-based. So there can be ambiguous scenarios as follows
string|xml data = client->readMessage();
In the above scenario, assume we receive a text message and the user has given a union type to read that message. A text message can be deserialized into both of the types given by the user. At that time there is an ambiguity as to which data type we should convert it because we don't have any details as in which format(whether string
or xml
) it came through the connection.
Once we come up with proper serializers/deserializers, we should include that ser/des data type also in the message itself. With that, we can overcome this issue.
Summary
Data binding helps to access the incoming and outgoing data in the user's desired parameter type. Subtypes of
anydata
will be the supported parameter types. This proposal discusses ways to provide data binding for both on listener side as well as the client side.Goals
Motivation
As of now, the Ballerina WebSocket package doesn’t provide direct data binding for sending and receiving data. Only
string
andbyte[]
are the supported data types. Therefore, users have to do data manipulations by themselves. With this new feature, the user experience can be improved by introducing data binding to reduce the burden of developers converting data to the desired format as discussed in the next section.Description
Data binding support will be added to both the text and binary messages and will be supported on both the client-side and the listener side. Subtypes of
anydata
are the data types to be supported.Listener
On the listener side, a new remote function
onMessage
is introduced with data binding support. The user can state the required data type as a parameter in the remote function signature. So the received data will be converted to the requested type and dispatched to the remote function.Ex: onMessage
The newly introduced
onMessage
remote function accepts both types of text and binary data frames. This function acceptsanydata
as the function parameterData deserialization for text messages happens similar to the following,
string
, received data will be directly presented to the API without doing any deserialization.xml
, received text data will be deserialized toxml
.json
and received text data will be deserialized tojson
.Data deserialization for binary messages happens similar to the following,
byte[]
, received data will be directly presented to the API without doing any deserialization.xml
, received binary data will be first converted to the string representation of thebyte[]
and then deserialized toxml
.json
and received binary data will be first converted to the string representation of thebyte[]
and then will be deserialized tojson
.If the data binding fails, the connection will get terminated by sending a close frame with the 1003 error code(1003 indicates that an endpoint is terminating the connection because it has received a type of data it cannot accept ) and will print an error log at the listener side.
Client
Send data
A newly introduced
writeMessage
API with data binding support to write data to the connection.writeMessage
Data serialization happens as follows,
string
is given as the input, it is directly sent to the connection as text data without any data conversions.byte[]
is given, it is directly sent to the connection as binary data without any data conversions.xml
is given, it is directly converted to astring
using the BallerinatoString
function and sent as text data.json
and then to thestring
representation of that using the BallerinatoJsonString
function and sent as text data.If the data binding fails, a
websocket:Error
will be returned from the API.Receive data
To receive data, the existing
readMessage
API will be extended to supportanydata
. Data binding support will be introduced for both text messages and binary messages.The contextually-expected data type is inferred from the LHS variable type. Allowed data types would be subtypes of
anydata
.readMessage
Ex:
When the receiving data are of the text frame type, data will be deserialized to the expected data type. If the incoming data is of binary frames, and if the LHS type is some other data type apart from
byte[]
, incoming data will first be converted to the string representation of the binary data and then be converted to the expected data type.Deserialization of the above APIs happens similar to the
onMessage
deserialization. For more details refer to the relevant section.If the data binding fails, a
websocket:Error
will be returned from the API.Caller
Caller APIs will also be extended similar to the Client's
writeMessage
APIs.Testing