elodina / go-avro

Apache Avro for Golang
http://elodina.github.io/go-avro/
Apache License 2.0

NewDataFileReader returns all nil values on fields and ends with "Block read is unfinished" #91

Open RoarkeRandall opened 7 years ago

RoarkeRandall commented 7 years ago

Here's my code (pretty much copy pasted from the example)...

        reader, err := avro.NewDataFileReader(fileName, avro.NewSpecificDatumReader())
        if err != nil {
            fmt.Println(err)
            return
        }
        for {
            obj := &PADirectJustListedItem{}
            ok, err := reader.Next(obj)
            if !ok {
                // Next reported no record: print the error, if any, then stop.
                if err != nil {
                    fmt.Println(err)
                    return
                }
                break
            }
            fmt.Printf("%#v\n", obj)
        }

output:

go run main.go
&main.PADirectJustListedItem{snapshotdate:(int64)(nil), propertyid:(int32)(nil), accountid:(int32)(nil), bedrooms:(int)(nil), bathrooms:(string)(nil), finishedsqft:(int)(nil), lotsizesqft:(int)(nil), city:(string)(nil), state:(string)(nil), postalcode:(string)(nil), propetyaddress:(string)(nil), image1id:(int64)(nil), image2id:(int64)(nil), image3id:(int64)(nil), manualimageid:(int64)(nil), sellingpricedollarcnt:(int32)(nil), realestatebrokerid:(int32)(nil), daysonzillow:(int32)(nil), multiplelistingservicecode:(string)(nil), postingid:(int32)(nil), postingdateinitial:(int64)(nil), auditdatecreated:(int64)(nil)}
&main.PADirectJustListedItem{snapshotdate:(int64)(nil), propertyid:(int32)(nil), accountid:(int32)(nil), bedrooms:(int)(nil), bathrooms:(string)(nil), finishedsqft:(int)(nil), lotsizesqft:(int)(nil), city:(string)(nil), state:(string)(nil), postalcode:(string)(nil), propetyaddress:(string)(nil), image1id:(int64)(nil), image2id:(int64)(nil), image3id:(int64)(nil), manualimageid:(int64)(nil), sellingpricedollarcnt:(int32)(nil), realestatebrokerid:(int32)(nil), daysonzillow:(int32)(nil), multiplelistingservicecode:(string)(nil), postingid:(int32)(nil), postingdateinitial:(int64)(nil), auditdatecreated:(int64)(nil)}
Block read is unfinished

I have a couple of files to test with. The avro tools convert all of them to JSON without any problem:

java -jar avro-tools-1.8.2.jar tojson part-m-00000.avro > 00001.json

RoarkeRandall commented 7 years ago

Here's our schema. I think the problem relates to how null is being handled.

{
    "type":"record",
    "name":"QueryResult",
    "doc":"Sqoop import of QueryResult",
    "fields":[
        {"name":"snapshotdate","type":["null","long"],"default":null,"columnName":"snapshotdate","sqlType":"91"},
        {"name":"propertyid","type":["null","int"],"default":null,"columnName":"propertyid","sqlType":"4"},
        {"name":"accountid","type":["null","int"],"default":null,"columnName":"accountid","sqlType":"4"},
        {"name":"bedrooms","type":["null","int"],"default":null,"columnName":"bedrooms","sqlType":"5"},
        {"name":"bathrooms","type":["null","string"],"default":null,"columnName":"bathrooms","sqlType":"3"},
        {"name":"finishedsqft","type":["null","int"],"default":null,"columnName":"finishedsqft","sqlType":"4"},
        {"name":"lotsizesqft","type":["null","int"],"default":null,"columnName":"lotsizesqft","sqlType":"4"},
        {"name":"city","type":["null","string"],"default":null,"columnName":"city","sqlType":"12"},
        {"name":"state","type":["null","string"],"default":null,"columnName":"state","sqlType":"1"},
        {"name":"postalcode","type":["null","string"],"default":null,"columnName":"postalcode","sqlType":"1"},
        {"name":"propetyaddress","type":["null","string"],"default":null,"columnName":"propetyaddress","sqlType":"12"},
        {"name":"image1id","type":["null","long"],"default":null,"columnName":"image1id","sqlType":"-5"},
        {"name":"image2id","type":["null","long"],"default":null,"columnName":"image2id","sqlType":"-5"},
        {"name":"image3id","type":["null","long"],"default":null,"columnName":"image3id","sqlType":"-5"},
        {"name":"manualimageid","type":["null","long"],"default":null,"columnName":"manualimageid","sqlType":"-5"},
        {"name":"sellingpricedollarcnt","type":["null","int"],"default":null,"columnName":"sellingpricedollarcnt","sqlType":"4"},
        {"name":"realestatebrokerid","type":["null","int"],"default":null,"columnName":"realestatebrokerid","sqlType":"4"},
        {"name":"daysonzillow","type":["null","int"],"default":null,"columnName":"daysonzillow","sqlType":"4"},
        {"name":"multiplelistingservicecode","type":["null","string"],"default":null,"columnName":"multiplelistingservicecode","sqlType":"12"},
        {"name":"postingid","type":["null","int"],"default":null,"columnName":"postingid","sqlType":"4"},
        {"name":"postingdateinitial","type":["null","long"],"default":null,"columnName":"postingdateinitial","sqlType":"93"},
        {"name":"auditdatecreated","type":["null","long"],"default":null,"columnName":"auditdatecreated","sqlType":"93"}
    ],
    "tableName":"QueryResult"
}
serejja commented 7 years ago

Hi @RoarkeRandall, this repo is unfortunately abandoned: the maintainer has left and didn't leave push/merge rights to anyone. There's a maintained fork here: https://github.com/go-avro/avro

crast commented 6 years ago

@RoarkeRandall do you have them as public (exported) struct fields? From your output I suspect you don't. The go-avro reader cannot decode into private struct fields; it's a language/library limitation.

In either case, as mentioned above there is a maintained fork, and this branch might interest you: https://github.com/go-avro/avro/pull/7
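
The language side of that limitation can be seen with the standard reflect package alone: Go reflection refuses to assign to unexported struct fields. The sketch below does not call go-avro. The two struct types and the settableFields helper are hypothetical, cut down from the struct in the issue, and it assumes a decoder that fills structs through reflection.

```go
package main

import (
	"fmt"
	"reflect"
)

// privateItem mirrors the struct from the issue: lowercase (unexported) fields.
type privateItem struct {
	propertyid int32
	city       string
}

// PublicItem has the same fields, exported so reflection can write to them.
type PublicItem struct {
	Propertyid int32
	City       string
}

// settableFields reports, for each field of the struct that ptr points to,
// whether reflection is allowed to assign to it.
func settableFields(ptr interface{}) map[string]bool {
	v := reflect.ValueOf(ptr).Elem()
	out := make(map[string]bool, v.NumField())
	for i := 0; i < v.NumField(); i++ {
		out[v.Type().Field(i).Name] = v.Field(i).CanSet()
	}
	return out
}

func main() {
	fmt.Println(settableFields(&privateItem{})) // map[city:false propertyid:false]
	fmt.Println(settableFields(&PublicItem{}))  // map[City:true Propertyid:true]
}
```

Capitalizing the field names in the target struct is the usual fix. How exported Go field names are matched to the lowercase Avro field names is library-specific, so check the documentation of the fork you use.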

RoarkeRandall commented 6 years ago

Sorry, I should have closed this a while ago. Yes, that is the case.