mathworks-ref-arch / matlab-avro

MATLAB interface for Apache Avro files.

snappy compression support #5

Open mlandry1 opened 3 years ago

mlandry1 commented 3 years ago

Hi.

Great project. I am using MATLAB R2020a. I am new to this, so please excuse me if this is an obvious question.

Is it possible to read a snappy-compressed Avro file, and if so, how?

When I try to read such a file, I get the following error:

>> myReader = matlabavro.DataFileReader('snappy_compressed_file.avro')
Error using matlabavro.DataFileReader (line 31)
Java exception occurred:
org.apache.avro.AvroRuntimeException: Unrecognized codec: snappy
    at org.apache.avro.file.CodecFactory.fromString(CodecFactory.java:144)
    at org.apache.avro.file.DataFileStream.resolveCodec(DataFileStream.java:145)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:131)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:106)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:93)

I also get this warning when I run the unit tests:

>> runtests
Running BasicAvro
Warning: Snappy codec not found. Install the snappy compression library to use this option.
Proceeding with deflate compression level 6. 
> In matlabavro.DataFileWriter.set.compressionType (line 59)
  In BasicAvro/testScalar (line 54)

Thanks for your help!
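For anyone hitting the same exception: the stack trace shows the codec being resolved while the reader is initialized, and in the Avro object container format the codec name is stored in the `avro.codec` entry of the file header. The following is a minimal Python sketch, assuming the standard container layout (4-byte magic, a metadata map, then a sync marker), that reports which codec a file declares without needing the Avro Java library. The function names (`read_avro_codec`, `codec_from_bytes`, `codec_of_file`) are illustrative and are not part of matlab-avro.

```python
import io

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    result = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro header")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear marks the last byte
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (also used for map keys)."""
    length = read_long(stream)
    if length < 0:
        raise ValueError("negative length in Avro header")
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_codec(stream):
    """Return the codec name declared in the header of an open binary stream."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the metadata map
            break
        if count < 0:  # negative count: a byte size follows, which we skip
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    # A missing avro.codec entry means the "null" (uncompressed) codec.
    return metadata.get("avro.codec", b"null").decode("utf-8")


def codec_from_bytes(data):
    """Convenience wrapper for data that is already in memory."""
    return read_avro_codec(io.BytesIO(data))


def codec_of_file(path):
    """Convenience wrapper for a file on disk."""
    with open(path, "rb") as handle:
        return read_avro_codec(handle)
```

Calling `codec_of_file('snappy_compressed_file.avro')` on the file from the report should return the declared codec name as a string. This only confirms what the file asks for; whether the Java side can actually decode it still depends on the codec library being available to MATLAB's JVM.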

vveerapp commented 3 years ago

Hi

You should be able to use this compression if you have the snappy codec installed. Can you confirm whether you have the snappy compression library installed? Thanks

mlandry1 commented 3 years ago

I would say I don't. I have the Python library, but nothing installed externally for MATLAB. Which library do I have to install? Is it this one? Could you reference it in the readme.md?

Fun fact: although I can't open my snappy-compressed Avro files with this code in MATLAB, I can open snappy-compressed Parquet files with MATLAB's native commands without problems.

Thanks!

vveerapp commented 3 years ago

Yes, please try that library and let me know. Thanks

AntonSemechko commented 2 years ago

runtests produces multiple warnings of the type:

Warning: Snappy codec not found. Install the snappy compression library to use this option.
Proceeding with deflate compression level 6.
> In matlabavro.DataFileWriter.set.compressionType (line 59)
  In BasicAvro/testMixedTypeStruct (line 467)

Please update your documentation to include instructions for the installation of the Snappy codec.

DaveForstot-MathWorks commented 2 years ago

Hi,

The Maven project has been updated to pull in the Snappy codec. Can you pull or download the latest version and verify that it now works correctly?

Thanks!