mdavidsaver / pvxs

PVA protocol client/server library and utilities.
https://mdavidsaver.github.io/pvxs/

(De)serialization of Value fields #9

Open karlosp opened 4 years ago

karlosp commented 4 years ago

Is there a way to (de)serialize Value fields, or something similar like the example below in pvData?

  epics::pvData::PVStructure::shared_pointer structure_;
  std::shared_ptr<epics::pvData::ByteBuffer> buffer_;
  ....
  // serialize
  for (size_t i = 0U; i < structure_->getNumberFields(); ++i) {
    //  Request subfield
    const epics::pvData::PVFieldPtr field = structure_->getSubField(i);

    //  If subfield has no children i.e. represents actual value then
    //  serialize it
    if ((field != nullptr) && (field->getNumberFields() == 1)) {
      field->serialize(buffer_.get(), serializable_control_.get());
    }
  }

  // deserialize
  for (size_t i = 0U; i < structure_->getNumberFields(); ++i) {
    const epics::pvData::PVFieldPtr field = structure_->getSubField(i);
    if ((field != nullptr) && (field->getNumberFields() == 1)) {
      field->deserialize(buffer_.get(), deserializable_control_.get());
    }
  }

mdavidsaver commented 4 years ago

This is not (yet) part of the public API. Can you say something about your use case? Is this for file storage, network transport, or something else? Also, what do you use for serializable_control_? endian-ness?

Have you seen the serializeToVector() and related "high level" wrappers?

https://github.com/epics-base/pvDataCPP/blob/79b02254c4f71f5eebb5d27175e5075816a64da4/src/misc/pv/serialize.h#L162-L196

karlosp commented 4 years ago

We must replace the pvDataCPP library with this one in a real-time framework that could use EPICS as a transport layer between nodes, processes, or threads.

I took a quick look at serializeToVector(). Yes, we could probably use that. I do not know why it was not used in the first place, but as mentioned, all of this code will be replaced anyway.

serializable_control_ is used only to satisfy the API. virtual void Serializable::(de)serialize() takes a (De)serializableControl, which is a pure virtual class. We implemented just two functions:

void SerializableControlImpl::alignBuffer(const std::size_t alignment) {
  m_buffer_->align(alignment);
}
void SerializableControlImpl::cachedSerialize(const std::shared_ptr<const epics::pvData::Field>& field,
                                              epics::pvData::ByteBuffer* const buffer) {
  field->serialize(buffer, this);
}

Is there some timeline to make (de)serialization part of the public API?

mdavidsaver commented 4 years ago

The first step is API design. This could be as simple as moving some definitions to an installed header and/or adding simple wrappers. This could be quick.

I would like to keep the public API as small as reasonably possible. For example, I'd rather not expose the Buffer interface. However, this is the mechanism through which an underlying byte array can be automatically expanded during serialization.

There is also the question of which user container type(s) are supported. I don't want to force users to copy the serialized byte array unnecessarily. std::vector seems nice since it has resize() and can be moved/swapped without copying, provided it can be adapted into other code. For example, I know some library designs can't use external allocations.

The relevant set of internal functions is as follows:
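To make the container trade-off concrete, here is a minimal sketch of a std::vector-backed byte sink that grows on demand and hands its contents to the caller without a copy. VectorSink and its methods are hypothetical names invented for illustration; they are not part of PVXS or its internal Buffer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, NOT a PVXS class: collects serialized bytes in a
// std::vector that expands automatically as data is appended.
class VectorSink {
    std::vector<std::uint8_t> buf_;
public:
    // Optionally reserve capacity up front to limit reallocations.
    explicit VectorSink(std::size_t reserve = 0) { buf_.reserve(reserve); }

    // Append n raw bytes; resize() grows the storage when needed.
    void write(const void* src, std::size_t n) {
        assert(src != nullptr || n == 0);
        const std::size_t old = buf_.size();
        buf_.resize(old + n);
        if (n != 0) std::memcpy(buf_.data() + old, src, n);
    }

    // Append a 32-bit value in big-endian byte order.
    void writeU32BE(std::uint32_t v) {
        const std::uint8_t b[4] = {
            static_cast<std::uint8_t>(v >> 24),
            static_cast<std::uint8_t>(v >> 16),
            static_cast<std::uint8_t>(v >> 8),
            static_cast<std::uint8_t>(v)
        };
        write(b, sizeof(b));
    }

    std::size_t size() const { return buf_.size(); }

    // Give the bytes to the caller by swapping, so nothing is copied.
    // The sink is left empty and can be reused.
    std::vector<std::uint8_t> release() {
        std::vector<std::uint8_t> out;
        out.swap(buf_);
        return out;
    }
};
```

A caller would construct a VectorSink, append fields, then call release() to take ownership of the bytes. Code that cannot accept an externally allocated vector would still need one copy out of it, which is the limitation mentioned above.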

These first two are for type descriptions (equivalent to pvData::Field::serialize()). At least a thin wrapper will be needed to use Value& instead of the internal FieldDesc*.

I'm also inclined to hide TypeStore, which fills the role of SerializableControlImpl::cachedSerialize() with a concrete implementation.

https://github.com/mdavidsaver/pvxs/blob/651d7d18fbece66f158477761e58f7d5ca915933/src/dataimpl.h#L86-L92

The other two entry points are "full" structure serialization of all fields, and "valid" serialization of some fields based on a bit mask. These correspond to the two overloads of pvData::PVStructure::serialize().

https://github.com/mdavidsaver/pvxs/blob/651d7d18fbece66f158477761e58f7d5ca915933/src/dataimpl.h#L150-L156

https://github.com/mdavidsaver/pvxs/blob/651d7d18fbece66f158477761e58f7d5ca915933/src/dataimpl.h#L162-L168
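The difference between "full" and "valid" serialization can be illustrated with a toy model. The sketch below is not the PVXS implementation and does not produce the PVA wire format; encodeFullToy, encodeValidToy, and decodeValidToy are hypothetical names, and a flat list of 32-bit fields with a single mask byte stands in for a real structure and its bit mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value in big-endian byte order.
inline void putU32(Bytes& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

// Toy "full" encoding: every field, in declaration order.
inline Bytes encodeFullToy(const std::vector<std::uint32_t>& fields) {
    Bytes out;
    for (std::uint32_t f : fields) putU32(out, f);
    return out;
}

// Toy "valid" encoding: one mask byte (bit i set means field i follows),
// then only the marked fields. Limited to 8 fields for simplicity.
inline Bytes encodeValidToy(const std::vector<std::uint32_t>& fields,
                            std::uint8_t mask) {
    assert(fields.size() <= 8);
    Bytes out;
    out.push_back(mask);
    for (std::size_t i = 0; i < fields.size(); ++i)
        if (mask & (1u << i)) putU32(out, fields[i]);
    return out;
}

// Toy "valid" decoding: overwrite only the fields marked in the mask,
// leaving the others at their previous values.
inline void decodeValidToy(const Bytes& in, std::vector<std::uint32_t>& fields) {
    assert(!in.empty() && fields.size() <= 8);
    const std::uint8_t mask = in[0];
    std::size_t pos = 1;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (!(mask & (1u << i))) continue;
        assert(pos + 4 <= in.size());
        fields[i] = (std::uint32_t(in[pos]) << 24) | (std::uint32_t(in[pos + 1]) << 16) |
                    (std::uint32_t(in[pos + 2]) << 8) | std::uint32_t(in[pos + 3]);
        pos += 4;
    }
}
```

With three fields and mask 0x05, the toy "valid" form carries one mask byte plus two fields (9 bytes), where the "full" form carries all three (12 bytes). The receiver keeps its previous value for the unmarked field.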

roddrok commented 3 years ago

We are developing some PVXS services that have PVXS subscribers as their front end. In all cases, our services discover the PV format from the first value, but for all subsequent values they only need the packed binary format of the payload. Because of the complexity of our PVs and their sampling rates, low CPU usage is a critical requirement.

As a first approach, I have implemented "Byte_top" and "Byte_field" functions. These are essentially the "top" and "field" functions from "datafrmt.cpp", but they copy and/or translate (depending on the field type) each field's payload into a memory block that I reserve in advance for efficiency. The result is fast and very efficient in terms of CPU, but I would like to know whether the API offers a standard mechanism for this.

mdavidsaver commented 3 years ago

@roddrok Your description sounds like encodeFull() or encodeValid() added in #10. Have you looked at this PR? Although I suppose that you may rather do this internally in order to have clear control over your serialization format.

roddrok commented 3 years ago

I have analysed the encode.. methods and I see that both of them are based on "to_wire". Looking at the "to_wire" code, I think we need the same logic but without the type codes, the "0xff" end-of-data marker, array sizes, element ids, etc. We would need to extract only the pure payloads, in order, into a buffer. Would it be possible to reuse the "to_wire_field" code with the parts mentioned above removed? Something like a "to_wire_payload" and a "to_wire_field_payload"?
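As a rough illustration of the payload-only idea, the sketch below packs field values back to back into a preallocated block and, for comparison, packs the same values with a per-field type byte. It is a toy model under stated assumptions: ToyField, packPayload, and packFramed are hypothetical names, only two scalar kinds exist, values are copied in host byte order, and none of this mirrors the actual "to_wire" code. A consumer of the payload-only form must already know the layout, for example from the first full value received.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy field model, purely illustrative (not a PVXS type).
struct ToyField {
    enum Kind : std::uint8_t { Int32 = 1, Float64 = 2 } kind;
    std::int32_t i32;
    double f64;
};

inline std::size_t payloadSize(const ToyField& f) {
    return f.kind == ToyField::Int32 ? sizeof(std::int32_t) : sizeof(double);
}

inline const void* payloadPtr(const ToyField& f) {
    return f.kind == ToyField::Int32 ? static_cast<const void*>(&f.i32)
                                     : static_cast<const void*>(&f.f64);
}

// Payload-only packing: values are copied in order with no type codes or
// markers. Returns the bytes written, or 0 if the block is too small.
inline std::size_t packPayload(const std::vector<ToyField>& fields,
                               std::uint8_t* dst, std::size_t cap) {
    std::size_t need = 0;
    for (const ToyField& f : fields) need += payloadSize(f);
    if (need > cap) return 0;
    std::size_t pos = 0;
    for (const ToyField& f : fields) {
        std::memcpy(dst + pos, payloadPtr(f), payloadSize(f));
        pos += payloadSize(f);
    }
    return pos;
}

// Framed packing for comparison: one kind byte precedes each value.
inline std::size_t packFramed(const std::vector<ToyField>& fields,
                              std::uint8_t* dst, std::size_t cap) {
    std::size_t need = 0;
    for (const ToyField& f : fields) need += 1 + payloadSize(f);
    if (need > cap) return 0;
    std::size_t pos = 0;
    for (const ToyField& f : fields) {
        dst[pos++] = static_cast<std::uint8_t>(f.kind);
        std::memcpy(dst + pos, payloadPtr(f), payloadSize(f));
        pos += payloadSize(f);
    }
    return pos;
}
```

For a record with one Int32 and one Float64 field, the payload-only form takes the sum of the two value sizes, while the framed form adds one byte per field. Variable-length fields are deliberately left out of this sketch, since dropping their sizes would leave the consumer unable to find field boundaries.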