google / protobuf.dart

Runtime library for Dart protobufs
https://pub.dev/packages/protobuf
BSD 3-Clause "New" or "Revised" License
533 stars 184 forks source link

Generate `bytes` with `Uint8List` #219

Open truongsinh opened 5 years ago

truongsinh commented 5 years ago

Ref: https://groups.google.com/a/dartlang.org/forum/#!topic/misc/JpNkRRLI9_w, https://github.com/Daegalus/dart-uuid/issues/35

More and more packages are explicitly switching to Uint8List

Right now, with the folling proto

message BluetoothCharacteristic {
  bytes value = 6;
}

The generated dart code (by Activated protoc_plugin 16.0.1.) is

import 'dart:core' show int, bool, double, String, List, Map, override;

import 'package:protobuf/protobuf.dart' as $pb;

class BluetoothCharacteristic extends $pb.GeneratedMessage {
  static final $pb.BuilderInfo _i = new $pb.BuilderInfo('BluetoothCharacteristic')
    ..a<List<int>>(6, 'value', $pb.PbFieldType.OY)
    ..hasRequiredFields = false
  ;

  BluetoothCharacteristic() : super();
  BluetoothCharacteristic.fromBuffer(List<int> i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromBuffer(i, r);
  BluetoothCharacteristic.fromJson(String i, [$pb.ExtensionRegistry r = $pb.ExtensionRegistry.EMPTY]) : super.fromJson(i, r);
  BluetoothCharacteristic clone() => new BluetoothCharacteristic()..mergeFromMessage(this);
  BluetoothCharacteristic copyWith(void Function(BluetoothCharacteristic) updates) => super.copyWith((message) => updates(message as BluetoothCharacteristic));
  $pb.BuilderInfo get info_ => _i;
  static BluetoothCharacteristic create() => new BluetoothCharacteristic();
  BluetoothCharacteristic createEmptyInstance() => create();
  static $pb.PbList<BluetoothCharacteristic> createRepeated() => new $pb.PbList<BluetoothCharacteristic>();
  static BluetoothCharacteristic getDefault() => _defaultInstance ??= create()..freeze();
  static BluetoothCharacteristic _defaultInstance;

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }
  bool hasValue() => $_has(0);
  void clearValue() => clearField(6);
}

We can see that we still have

  List<int> get value => $_getN(0);
  set value(List<int> v) { $_setBytes(0, v); }

Expected generated code:

  Uint8List get value => $_getN(0);
  set value(Uint8List v) { $_setBytes(0, v); }

One concern might be that this is more or less a breaking change in strong mode.

sigurdm commented 5 years ago

Can you be more specific? GeneratedMessage.writeToBuffer already returns a Uint8List.

truongsinh commented 5 years ago

@sigmundch I updated the description to have current behavior and desired behavior.

njskalski commented 4 years ago

Just hit the same wall.

rajveermalviya commented 4 years ago

@sigurdm Any traction on this?

most of the dart:core is using Uint8List for bytes, instead of List<int>, since Dart v2.5.0.

ref: https://github.com/dart-lang/sdk/issues/36900

osa1 commented 2 years ago

Relatedly, we currently do not convert List<int> values to Uint8List when a field is set. For example:

  @$pb.TagNumber(15)
  set optionalBytes($core.List<$core.int> v) {
    $_setBytes(14, v);
  }

which calls https://github.com/google/protobuf.dart/blob/14c9c0b2d5542e73198a98054d93f0cb4acc846a/protobuf/lib/src/protobuf/generated_message.dart#L456-L457

So the value is stored as List<int>, which is wasteful because I suspect most subtypes won't be as compact as Uint8List. Also when serializing, we allocate a Uint8List every time we serialize a bytes field: https://github.com/google/protobuf.dart/blob/14c9c0b2d5542e73198a98054d93f0cb4acc846a/protobuf/lib/src/protobuf/coded_buffer_writer.dart#L339-L342

I think the "ideal" solution would be to take Uint8List values in setters and return Uint8List in getters, and store bytes values as Uint8List. Uint8List is the most precise type for what a "bytes" field is, and it's more efficient than storing each byte as an int.

That's a breaking change though and I'm not sure if we can afford it internally. So a close second could be keeping the API as-is, but storing the values as Uint8List. I think technically this would not be considered a breaking change, though I wouldn't surprise if it breaks some code because someone sets a bytes field an int array or something like that, and downcasts the getter return value.

osa1 commented 2 years ago

Also when serializing, we allocate a Uint8List every time we serialize a bytes field:

I just realized that this is not true, because Uint8List is a subtype of TypedData, so in the code https://github.com/google/protobuf.dart/blob/14c9c0b2d5542e73198a98054d93f0cb4acc846a/protobuf/lib/src/protobuf/coded_buffer_writer.dart#L339-L342

We don't allocate a Uint8List if the field value is already an Uint8List.

Another strange thing that I realized while reading the 4 lines shown above is that when we set a bytes field an List<int> with integers larger than max value of a byte, truncation is done when serializing, not when setting the field. Example:

syntax = "proto3";

message M1 {
  bytes b = 1;
}
M1 m = M1();
m.b = <int>[99999];
print(m.b); // [99999]
print(m.writeToBuffer()); // [10, 1, 159]

The reason why this is strange is because if I send this message to another program I would expect the local value for the message and the received message value to be the same, but if both the sender and the receiver use the bytes field for the same operation they will potentially do different things as the bytes values are different.

Oh, and also, the JSON map and proto3 JSON serializers throw an exception when serializing this field.. So the library is also inconsistent in handling of such values.

brandsimon commented 2 years ago

I understand that this could break someones code and is not backwards compatible, but I think having an Uint8List as bytes is the correct way to do it. So how about adding a comment to the .proto file or a command-line option for the protobuf-compiler which changes the resulting class? So people can choose which type they want.

hagerf commented 1 year ago

Hi Is there any development on this? I agree with everyone here that Uint8List is the proper type for bytes. But is it probable that these changes will be made anytime or are we probably stuck with current implementation for the foreseeable future? Thanks

temeddix commented 1 year ago

This is indeed a serious issue. List<int> is known to be much slower than Uint8List. Any progress?