NationalSecurityAgency / datawave

DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
https://code.nsa.gov/datawave
Apache License 2.0

Update Guava #514

Open milleruntime opened 5 years ago

milleruntime commented 5 years ago

The version of Guava (15.0) that DataWave builds against won't run against Accumulo 2.0, which uses a newer version (28.0-jre).

milleruntime commented 5 years ago

Here is the exception I saw trying to connect to Accumulo:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.accumulo.core.util.HostAndPort.fromString(HostAndPort.java:171)
    at org.apache.accumulo.core.util.AddressUtil.parseAddress(AddressUtil.java:33)
    at org.apache.accumulo.core.util.ServerServices.getAddress(ServerServices.java:51)
    at org.apache.accumulo.core.clientImpl.ServerClient.getConnection(ServerClient.java:170)
    at org.apache.accumulo.core.clientImpl.ServerClient.getConnection(ServerClient.java:150)
    at org.apache.accumulo.core.clientImpl.ServerClient.getConnection(ServerClient.java:145)
    at org.apache.accumulo.core.clientImpl.ServerClient.getConnection(ServerClient.java:135)
    at org.apache.accumulo.core.clientImpl.ServerClient.executeRawVoid(ServerClient.java:114)
    at org.apache.accumulo.core.clientImpl.ServerClient.executeVoid(ServerClient.java:71)
    at org.apache.accumulo.core.clientImpl.ConnectorImpl.<init>(ConnectorImpl.java:66)
    at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:228)
    at datawave.ingest.data.config.ingest.AccumuloHelper.getConnector(AccumuloHelper.java:87)
    at datawave.ingest.mapreduce.job.IngestJob.configureTables(IngestJob.java:740)
    at datawave.ingest.mapreduce.job.IngestJob.run(IngestJob.java:297)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at datawave.ingest.mapreduce.job.IngestJob.main(IngestJob.java:209)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
[DW-FATAL] - error creating tables
Aborting install-ingest.sh
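
For anyone reading the trace: the descriptor `(ZLjava/lang/String;Ljava/lang/Object;)V` decodes to `checkArgument(boolean, String, Object)`, a non-varargs overload. As far as I know, Guava 15.0 only has the varargs form `(boolean, String, Object...)`, which would explain the `NoSuchMethodError` when Accumulo code compiled against a newer Guava runs with the older jar. Below is a minimal sketch, using only the JDK, for checking which overloads the Guava on a given classpath actually provides. The class name `GuavaOverloadCheck` and its helpers are hypothetical and not part of DataWave or Accumulo.

```java
import java.security.CodeSource;

final class GuavaOverloadCheck {

    /** Returns true if the class is loadable and has a public method with exactly these parameter types. */
    static boolean hasMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            Class.forName(className).getMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Describes where a class was loaded from, or a placeholder when that is unknown. */
    static String loadedFrom(String className) {
        try {
            CodeSource source = Class.forName(className).getProtectionDomain().getCodeSource();
            return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String preconditions = "com.google.common.base.Preconditions";
        // (ZLjava/lang/String;Ljava/lang/Object;)V means (boolean, String, Object) returning void.
        boolean present = hasMethod(preconditions, "checkArgument",
                boolean.class, String.class, Object.class);
        System.out.println("Preconditions loaded from: " + loadedFrom(preconditions));
        System.out.println("checkArgument(boolean, String, Object) present: " + present);
    }
}
```

Running this with the same classpath as the failing job should show which Guava jar wins and whether it has the overload Accumulo expects.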

milleruntime commented 5 years ago

It looks like upgrading to Guava 22 is pretty straightforward. Going past 22 is harder, though: Futures.immediateCheckedFuture() was dropped in later versions, and I wasn't sure how to handle that.
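
One possible direction, offered as a sketch rather than a tested fix: return a plain future and translate failures into the checked exception at the call site, which is roughly what `checkedGet()` did. The example below uses only the JDK (`CompletableFuture.completedFuture` standing in for an "immediate" future); on the Guava side, `Futures.immediateFuture` and `Futures.getChecked` look like the closest counterparts, but I have not tried them against the DataWave code. `CheckedGetSketch` and `checkedGet` are hypothetical names.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

final class CheckedGetSketch {

    /** Waits for the future and maps any failure to the caller's checked exception type. */
    static <V, X extends Exception> V checkedGet(Future<V> future, Function<Throwable, X> mapper) throws X {
        try {
            return future.get();
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw mapper.apply(e);
        } catch (ExecutionException e) {
            // Unwrap so the mapper sees the original failure.
            throw mapper.apply(e.getCause());
        }
    }

    public static void main(String[] args) throws IOException {
        // Success case: an already-completed future.
        Future<String> ok = CompletableFuture.completedFuture("value");
        System.out.println(checkedGet(ok, cause -> new IOException("lookup failed", cause)));

        // Failure case: an already-failed future surfaces as the checked type.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("boom"));
        try {
            checkedGet(failed, cause -> new IOException("lookup failed", cause));
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage() + " / " + e.getCause().getMessage());
        }
    }
}
```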

milleruntime commented 4 years ago

I got a branch for Guava 22 to build here. I am seeing concurrency issues with ScannerSession when running the tests. I am not sure whether it's a change in behavior in the Guava code we use or whether my changes exposed deeper issues, but the failures do seem tied to these changes. For instance, this error pops up intermittently across multiple tests. This one was from AnyFieldQueryTest:

java.lang.RuntimeException: java.lang.IllegalStateException: Expected the service AnyFieldScanner [TERMINATED] to be RUNNING, but was TERMINATED
    at datawave.query.jexl.lookups.FieldNameLookup.lookup(FieldNameLookup.java:153)
    at datawave.query.jexl.visitors.ParallelIndexExpansion$IndexLookupCallable.call(ParallelIndexExpansion.java:625)
    at datawave.query.jexl.visitors.ParallelIndexExpansion$IndexLookupCallable.call(ParallelIndexExpansion.java:586)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Expected the service AnyFieldScanner [TERMINATED] to be RUNNING, but was TERMINATED
    at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:347)
    at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:281)
    at com.google.common.util.concurrent.AbstractExecutionThreadService.awaitRunning(AbstractExecutionThreadService.java:223)
    at datawave.query.tables.ScannerSession.hasNext(ScannerSession.java:283)
    at com.google.common.collect.MultitransformedIterator.hasNext(MultitransformedIterator.java:53)
    at datawave.query.jexl.lookups.FieldNameLookup.lookup(FieldNameLookup.java:116)
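
Reading the trace, ScannerSession.hasNext calls awaitRunning, which throws because the service has already reached TERMINATED. One way that could happen (an assumption, not confirmed here) is the service finishing its work before the consumer first asks for results. The sketch below models only that ordering with a hypothetical `MiniSession` written against the plain JDK; it is not DataWave's ScannerSession or Guava's AbstractService, and the "tolerant" variant is just one possible way to handle a finished service that still holds buffered results.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class TerminatedBeforeReadSketch {

    enum State { NEW, RUNNING, TERMINATED }

    /** Hypothetical stand-in for a scanner-like service that buffers results. */
    static final class MiniSession {
        private final BlockingQueue<String> results = new LinkedBlockingQueue<>();
        private volatile State state = State.NEW;

        /** Produces all rows, then terminates, as a fast scan might before anyone reads. */
        void runToCompletion(String... rows) {
            state = State.RUNNING;
            for (String row : rows) {
                results.add(row);
            }
            state = State.TERMINATED;
        }

        /** Strict check: fails once the service has finished, even if results remain. */
        boolean hasNextStrict() {
            if (state != State.RUNNING) {
                throw new IllegalStateException("Expected RUNNING, but was " + state);
            }
            return !results.isEmpty();
        }

        /** Tolerant check: only rejects a never-started service; buffered results stay readable. */
        boolean hasNextTolerant() {
            if (state == State.NEW) {
                throw new IllegalStateException("Service was never started");
            }
            return !results.isEmpty();
        }

        String next() {
            return results.poll();
        }
    }

    public static void main(String[] args) {
        MiniSession session = new MiniSession();
        session.runToCompletion("row1", "row2");
        try {
            session.hasNextStrict();
        } catch (IllegalStateException e) {
            System.out.println("strict check failed: " + e.getMessage());
        }
        while (session.hasNextTolerant()) {
            System.out.println(session.next());
        }
    }
}
```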