wesbiggs opened this issue 1 year ago (status: Open)
Should we break out the data serialization specifics from the announcement pages?
This would not be a spec change, just a reorganization of where we specify how each "spec type" is serialized to Parquet and Avro.
Worth a WIP PR for just one to see what it looks like?
Suggestion: the spec should recommend which fields are important to index, even if they are not included in a batch file's Bloom filter.
These were discussed on the community call of 2023-07-20 and no objections were raised; the next step is to draft a PR for review.
The spec allows implementations to define which Announcement Types can be used with Publish Announcement (singular announcement) and which can be used with Publish Batch (up to 131,072 announcements at a time).
The Parquet format was selected for off-chain batch publications for good reasons, but some of those reasons (the inclusion of a Bloom filter, for example) are less useful, or even detrimental, when dealing with an individual announcement.
At present the spec does not mandate a particular serialization for the Announcement parameter in the Publish Announcement Operation, presumably leaving this to the implementation.
The proposal for user data operations (#233) introduces the Avro serialization format. We should discuss whether it is useful to also define Avro data types for individual announcements, and to specify at the DSNP spec level that individual announcements be serialized in this format.
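To make the discussion concrete, here is a rough sketch of what an Avro definition for a single announcement could look like, together with a hand-rolled encoder for the handful of Avro primitives it needs. The schema name, field names, and field types below are illustrative assumptions for this issue, not something the spec currently defines, and a real implementation would most likely use an existing Avro library rather than this minimal encoder.

```python
# Hypothetical Avro schema for one Broadcast-style announcement.
# Names and types are assumptions made for illustration only.
BROADCAST_SCHEMA = {
    "type": "record",
    "name": "Broadcast",
    "fields": [
        {"name": "announcementType", "type": "int"},
        {"name": "fromId", "type": "long"},
        {"name": "contentHash", "type": "bytes"},
        {"name": "url", "type": "string"},
    ],
}


def encode_long(value: int) -> bytes:
    """Avro int/long: zigzag encoding, then a base-128 varint (low bits first)."""
    zigzag = (value << 1) ^ (value >> 63)
    out = bytearray()
    while zigzag & ~0x7F:
        out.append((zigzag & 0x7F) | 0x80)
        zigzag >>= 7
    out.append(zigzag)
    return bytes(out)


def encode_bytes(data: bytes) -> bytes:
    """Avro bytes: length as a long, followed by the raw bytes."""
    return encode_long(len(data)) + data


def encode_string(text: str) -> bytes:
    """Avro string: length-prefixed UTF-8."""
    return encode_bytes(text.encode("utf-8"))


ENCODERS = {
    "int": encode_long,
    "long": encode_long,
    "bytes": encode_bytes,
    "string": encode_string,
}


def encode_record(schema: dict, record: dict) -> bytes:
    """Avro record: field values concatenated in schema order, with no framing."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]]) for field in schema["fields"]
    )


# Example values are made up for the sketch.
announcement = {
    "announcementType": 2,
    "fromId": 1,
    "contentHash": b"\x01\x02",
    "url": "a",
}
payload = encode_record(BROADCAST_SCHEMA, announcement)
```

Note that an Avro record carries no per-message schema or index overhead in its binary form, which is part of why it may suit a single announcement better than a Parquet file with a Bloom filter.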