lintool / warcbase

Warcbase is an open-source platform for managing and analyzing web archives
http://warcbase.org/

NoSuchFieldException in org.warcbase.data.HBaseTableManager #255

Open dedocibula opened 7 years ago

dedocibula commented 7 years ago

When ingesting WARC/ARC files into HBase via the appassembler-generated IngestFiles script, HBaseTableManager throws a NoSuchFieldException because its constructor tries to access the non-existent field maxKeyValueSize on the HTable object via reflection. As of hbase-client 1.2.0-cdh5.7.1 this field has been removed and direct use of HTable is deprecated; the current workflow is driven through a connection object.
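To illustrate the failure mode described above, here is a minimal, self-contained sketch of reflective field access that breaks when a library removes a private field. `OldClient` and `NewClient` are hypothetical stand-ins, not HBase classes, and `readMaxKeyValueSize` is an illustrative helper rather than anything in warcbase; only the field name maxKeyValueSize comes from this report.

```java
import java.lang.reflect.Field;
import java.util.OptionalInt;

class ReflectiveFieldProbe {
    // Hypothetical stand-in for a client class that still has the field.
    static class OldClient {
        private int maxKeyValueSize = 10485760;
    }

    // Hypothetical stand-in for a newer client class where the field is gone.
    static class NewClient {
    }

    // Reads the private field by name. Returns an empty OptionalInt instead of
    // letting NoSuchFieldException propagate when the field does not exist.
    static OptionalInt readMaxKeyValueSize(Object client) {
        try {
            Field field = client.getClass().getDeclaredField("maxKeyValueSize");
            field.setAccessible(true);
            return OptionalInt.of(field.getInt(client));
        } catch (NoSuchFieldException e) {
            // This is the exception the issue reports: the field was removed.
            return OptionalInt.empty();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Field exists but is not accessible", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("old client: " + readMaxKeyValueSize(new OldClient()));
        System.out.println("new client: " + readMaxKeyValueSize(new NewClient()));
    }
}
```

Guarding the lookup like this only avoids the crash. It does not restore the ability to change the limit, which is why the setting has to move to the configuration instead.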

In hbase-client 1.2.0 the maximum KeyValue size is read from the general HBaseConfiguration (in ConnectionManager$HConnectionImplementation), so the property must be overridden before the HBase connection is created.
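The ordering constraint can be sketched without an HBase dependency. In the model below, `FakeConnection` is a hypothetical stand-in that copies the limit out of the configuration in its constructor, which is the behaviour this report attributes to the real connection. The key `hbase.client.keyvalue.maxsize` is, to my knowledge, the standard client property for this limit, and the 10 MB default is likewise an assumption; check both against the HBase version in use.

```java
import java.util.Properties;

class ConfigSnapshotDemo {
    // Assumed to be the HBase client property for the KeyValue size limit.
    static final String MAX_KV_KEY = "hbase.client.keyvalue.maxsize";
    // Assumed default (10 MB); used only by this model.
    static final int DEFAULT_MAX_KV = 10 * 1024 * 1024;

    // Hypothetical stand-in: reads the limit once, at construction time.
    static final class FakeConnection {
        private final int maxKeyValueSize;

        FakeConnection(Properties conf) {
            String raw = conf.getProperty(MAX_KV_KEY, String.valueOf(DEFAULT_MAX_KV));
            this.maxKeyValueSize = Integer.parseInt(raw);
        }

        int maxKeyValueSize() {
            return maxKeyValueSize;
        }
    }

    // Override first, then connect: the connection sees the new limit.
    static int limitWhenSetBefore(int limit) {
        Properties conf = new Properties();
        conf.setProperty(MAX_KV_KEY, String.valueOf(limit));
        return new FakeConnection(conf).maxKeyValueSize();
    }

    // Connect first, then override: the connection keeps the old limit.
    static int limitWhenSetAfter(int limit) {
        Properties conf = new Properties();
        FakeConnection connection = new FakeConnection(conf);
        conf.setProperty(MAX_KV_KEY, String.valueOf(limit));
        return connection.maxKeyValueSize();
    }

    public static void main(String[] args) {
        System.out.println("set before connect: " + limitWhenSetBefore(0));
        System.out.println("set after connect:  " + limitWhenSetAfter(0));
    }
}
```

In real code the equivalent step would presumably be setting the property on the Configuration object before handing it to the connection factory, rather than mutating an HTable afterwards.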

Additionally, the appassembler scripts should probably be generated as part of the package step for the whole repository. Failing that, http://lintool.github.io/warcbase-docs/Ingesting-Content-into-HBase/ should be updated to tell users that mvn appassembler:assemble needs to be run first.