Hi,
I was trying to parse the AVRO-serialized data from the E5 engagement in order to store it in our university database. For CADETS E5 everything worked fine, but for THEIA E5 I ran into problems with the AVRO files themselves. Some THEIA E5 *.gz files seem to contain sporadic unreadable characters that prevent common AVRO readers from converting the binaries to JSON. In addition, the JSON object on line 67 of ta1-theia-1-e5-official-1.json.gz contains the key-value pair "path":"@/tmp/.X11-unix/X0�", which AVRO readers cannot parse because of the last character.
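To narrow down which records are affected, something like the following could help. It is only a minimal diagnostic sketch, assuming the *.json.gz files hold one JSON object per line; the function name find_bad_lines and the limit parameter are made up for illustration and are not part of any AVRO tooling. It flags lines that are not valid UTF-8, lines that already contain the Unicode replacement character (U+FFFD), and lines that are not valid JSON.

```python
import gzip
import json


def find_bad_lines(path, limit=10):
    """Return up to `limit` (line_number, reason) pairs for suspicious lines.

    Assumes a gzip-compressed file with one JSON object per line.
    """
    bad = []
    with gzip.open(path, "rb") as fh:
        for lineno, raw in enumerate(fh, start=1):
            try:
                text = raw.decode("utf-8")  # strict decoding, fails on bad bytes
                if "\ufffd" in text:
                    # The bytes are valid UTF-8 but encode the replacement
                    # character, i.e. the damage happened before the file
                    # was written.
                    bad.append((lineno, "contains U+FFFD"))
                else:
                    json.loads(text)
            except UnicodeDecodeError as exc:
                bad.append((lineno, f"invalid UTF-8 at byte {exc.start}"))
            except json.JSONDecodeError as exc:
                bad.append((lineno, f"invalid JSON: {exc.msg}"))
            if len(bad) >= limit:
                break
    return bad
```

Calling find_bad_lines("ta1-theia-1-e5-official-1.json.gz") would then list the first few line numbers that need a closer look. This only locates the problem; it does not repair the records.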
Has anyone else encountered this problem or found a solution?
Thanks for any response and/or help!