Azure / azure-sdk

This is the Azure SDK parent repository and mostly contains documentation around guidelines and policies as well as the releases for the various languages supported by the Azure SDK.
http://azure.github.io/azure-sdk
MIT License
487 stars 297 forks

Board Discussion: Multi-version support, Avro support #969

Closed kurtzeborn closed 4 years ago

kurtzeborn commented 4 years ago

Storage is the first track 2 library to cover this ground, but all libraries will need to do this soon. We want to discuss the best path forward to support multiple versions of the service API in the client and appropriate ways to fail in each language when the version requested does not support the SDK entrypoint being used.

Second topic (same meeting please): We want to discuss how the new storage feature support for QuickQuery and ChangeFeed can be implemented within the track 2 guidelines while supporting a service API implementation which returns Avro content. QuickQuery is coming in STG73. ChangeFeed is not versioned and support is being added now.

KrzysztofCwalina commented 4 years ago

@seanmcc-msft, I have heard you prototyped/explored alternative ways of doing the version checks, including EnsureVersion helpers that could be called within service method bodies. Do you still have branches with the prototypes? I think such prototypes would be good to look at when we evaluate the pros and cons of the options.

cc: @tg-msft
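To make the EnsureVersion idea concrete, here is a minimal Python sketch of a helper called from within a service method body. It is not taken from the prototypes mentioned above. The class names, the helper name, and the version strings are all hypothetical and chosen only for illustration. It assumes service versions are date-formatted strings, so plain string comparison orders them correctly.

```python
class UnsupportedServiceVersionError(ValueError):
    """Raised when an operation needs a newer service API version (hypothetical type)."""


class BlobClientSketch:
    """Hypothetical client; not the real Azure Storage API surface."""

    def __init__(self, api_version="2019-02-02"):
        # Illustrative version string. ISO-style dates compare correctly as text.
        self.api_version = api_version

    def _ensure_version(self, minimum, operation):
        # Fail on the client, before any request is sent, if the configured
        # service version predates the operation being called.
        if self.api_version < minimum:
            raise UnsupportedServiceVersionError(
                f"{operation} requires service version {minimum} or later; "
                f"client is configured for {self.api_version}"
            )

    def download(self):
        # Available in every version this sketch knows about, so no check.
        return "download ok"

    def query(self, expression):
        # Assumed minimum version, used only to demonstrate the check.
        self._ensure_version("2019-12-12", "query")
        return f"query ok: {expression}"
```

A client pinned to the older version can still call `download()`, while `query()` raises immediately with a message naming the required version. Whether this check belongs in each method body, in a pipeline policy, or in shared core code is exactly the design question under discussion.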

tg-msft commented 4 years ago

@seanmcc-msft - Could you also point out the operation level checks for snapshots that were problematic in Track 1? It would be helpful to get some context around that as well.

tg-msft commented 4 years ago

@JonathanGiles, @johanste, @bterlson, @JeffreyRichter - Could you please take a quick look at Avro support for Java/Python/JS/Go before this discussion? It would be super helpful to understand options across all languages.

johanste commented 4 years ago

For Python, there are several packages available. The "official" packages are avro and avro-python3. The former claims support for Python 2.7 and 3.5+ in its package metadata, but the description says "bindings for python 2" in some places. The latter only supports Python 3.4+. Based on a cursory glance, they are both pure Python packages. Judging purely by release history, both projects seem to be maintained.

Neither states any external dependencies, and neither seems to have binary dependencies or to depend on other libraries being installed on the machine (e.g. libavro-dev).

There are additional packages that claim to be faster (e.g. fastavro) - but that does require a binary extension.

adrianhall commented 4 years ago

Has AVRO support gone through API review / Office of the CTO review? It seems to me that this is a reasonable expectation so that services that have similar requirements all converge on the same solution, which would make it easier to contemplate from an SDK perspective.

bterlson commented 4 years ago

For JS, there seem to be two reasonable options:

avsc

· npm: https://www.npmjs.com/package/avsc
· github: https://github.com/mtth/avsc
· downloads: 25k/week
· last release: 1 month ago
· dependencies: 0

avro-js

· npm: https://www.npmjs.com/package/avro-js
· github: https://github.com/apache/avro (monorepo of many language bindings)
· downloads: 0.5k/week
· last release: 5 months ago
· dependencies: 1 (underscore 🙁)

Remarks

avsc is vastly more popular and seems actively maintained. avro-js has the benefit of coming from a well-known organization that we have a history of working well with, and that organization maintains bindings for many different languages. I would still lean toward avsc, since it's more broadly used.

JeffreyRichter commented 4 years ago

For Go:

· LinkedIn has an Avro package, https://github.com/linkedin/goavro, last updated October 2019, 577 stars, 131 forks. Apache license (but LinkedIn is MS now, so not sure if/how this impacts things). This package depends on Google Snappy for (de)compression support.

· https://github.com/go-avro/avro was last updated on 12/2017. Apache 2 license. 38 stars, 18 forks. I’d avoid this package due to lack of interest/maintenance.

This is really about it; I’m surprised to see the lack of Go support for Avro. If we need to do this, then I think the LinkedIn package (with Google Snappy) is our best bet.

JonathanGiles commented 4 years ago

The only option I could find for Java is Apache Avro. It seems widely used (even the Spring tutorials all make use of it) and is updated relatively frequently. It has dependencies on Jackson (which is fine, although it lags behind the version we use) and Apache commons-compress (which I would rather we avoid, though avoiding it is unrealistic). We would ideally see a release of Avro with an updated Jackson dependency. We might want to file an issue and contribute a patch that does this.

tg-msft commented 4 years ago

I implemented a proof of concept Changefeed client at https://gist.github.com/tg-msft/d45b5e0351f0d668d2fb578df625b0fb using only System.Text.Json and Azure.Storage.Blobs. It can parse enough Avro to support Changefeed scenarios in about 200 lines of sloppy prototype code.
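For a sense of why a small hand-written parser is feasible, the following Python sketch decodes two Avro primitives as described in the Avro specification: a long (a variable-length, zig-zag encoded integer) and a string (a long byte count followed by UTF-8 data). It is not the code from the gist above. The function names are hypothetical, and a real Changefeed reader would also need the container-file header, block framing, and record schemas.

```python
import io


def read_long(stream):
    """Decode one Avro long: 7 bits per byte, low bits first, zig-zag encoded."""
    accum = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a variable-length integer")
        byte = chunk[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the final byte
            break
        shift += 7
    # Undo zig-zag: 0, 1, 2, 3 map back to 0, -1, 1, -2.
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Decode Avro bytes: a long length followed by that many raw bytes."""
    length = read_long(stream)
    if length < 0:
        raise ValueError("negative length in Avro bytes value")
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("stream ended inside a bytes value")
    return data


def read_string(stream):
    """Decode an Avro string: Avro bytes interpreted as UTF-8."""
    return read_bytes(stream).decode("utf-8")
```

As a usage example, `read_long(io.BytesIO(b"\x80\x01"))` returns 64 and `read_string(io.BytesIO(b"\x06foo"))` returns "foo", matching the encodings given as examples in the Avro specification.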

adrianhall commented 4 years ago

Recording: https://msit.microsoftstream.com/video/38251aaa-a391-4ad5-b3e8-f3d6fe62deaa

Follow-up:

1. Peter Marino - will return to Storage and scope down to the POST/POST/PATCH/DELETE.

2. Anna Tant - will tell Peter Marino which APIs fail to fail.

DECISION: Storage will go back and scope it to "valuable and correct" instead of relying on the policy / Azure Core.

Follow-up:

1. Peter Marino - figure out if we can get out of changefeed AVRO support.

2. Peter Marino - write small minimal parsers based on quick query parser.