linkedin / goavro


Getting avro message metadata #145

Open ryosagisu opened 5 years ago

ryosagisu commented 5 years ago

In my system I need to get the schema from an Avro message, and I noticed we can get it by calling (*OCFReader).Metadata(). But creating a new OCFReader also builds a new codec, which is quite expensive. Ref: https://github.com/linkedin/goavro/blob/master/ocf.go#L114-L171

So I propose adding a function that returns the Avro metadata without creating a new OCFReader. This is what I use now:

func readAvroHeader(ctx context.Context, ior io.Reader) (map[string][]byte, error) {
    span, ctx := tracer.StartSpanFromContext(ctx)
    defer span.Finish()

    //
    // magic bytes
    //
    magic := make([]byte, 4)
    _, err := io.ReadFull(ior, magic)
    if err != nil {
        return nil, fmt.Errorf("cannot read OCF header magic bytes: %s", err)
    }

    if !bytes.Equal(magic, ocfMagicBytes) {
        return nil, fmt.Errorf("cannot read OCF header with invalid magic bytes: %#q", magic)
    }

    //
    // metadata
    //
    metadata, err := metadataBinaryReader(ctx, ior)
    if err != nil {
        return nil, fmt.Errorf("cannot read OCF header metadata: %s", err)
    }
    return metadata, nil
}

What do you guys think?

karrick commented 5 years ago

I'm in the middle of experimenting with this on branch https://github.com/linkedin/goavro/tree/145

karrick commented 5 years ago

While the above branch passes all the included tests, there is some overlap and redundancy between some of the data types, and I'd like to refactor a bit more before I merge it back in.

I'd be happy to have some feedback if you're interested in providing it.

ryosagisu commented 5 years ago

Sure, I've seen your code and it's much cleaner than what I've done. I'll try it out later when I have some time. Thanks.

ryosagisu commented 2 years ago

Hello, is any help needed to merge branch https://github.com/linkedin/goavro/tree/145?