nielsbasjes / yauaa

Yet Another UserAgent Analyzer
https://yauaa.basjes.nl
Apache License 2.0

Commons-logger breaks Drill UDF #204

Closed cgivre closed 4 years ago

cgivre commented 4 years ago

Describe the bug Versions of yauaa after 5.13 break the Apache Drill UDFs. The issue is that commons-logging is a banned dependency in Drill. I attempted to exclude it in the pom.xml as shown below.

    <dependency>
      <groupId>nl.basjes.parse.useragent</groupId>
      <artifactId>yauaa</artifactId>
      <version>5.15</version>
      <exclusions>
        <exclusion>
          <groupId>commons-logging</groupId>
          <artifactId>commons-logging</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

However, this results in the errors below in the unit tests. Version 5.13 is the last version that works without issues. I also attempted to redirect the logs directly by following the instructions here [1], but that didn't seem to work either. Any suggestions?

org.apache.drill.exec.rpc.RpcException: org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: Code generation error - likely code error.

Fragment: 0:0

Please, refer to logs for more information.

[Error Id: 8cc8410d-3b91-43a2-96c6-ed601943ea47 on 192.168.1.26:31013]

    at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:60)
    at org.apache.drill.exec.client.DrillClient$ListHoldingResultsListener.getResults(DrillClient.java:881)
    at org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:583)
    at org.apache.drill.test.QueryBuilder.results(QueryBuilder.java:331)
    at org.apache.drill.test.ClusterFixture$FixtureTestServices.testRunAndReturn(ClusterFixture.java:620)
    at org.apache.drill.test.DrillTestWrapper.testRunAndReturn(DrillTestWrapper.java:938)
    at org.apache.drill.test.DrillTestWrapper.compareUnorderedResults(DrillTestWrapper.java:533)
    at org.apache.drill.test.DrillTestWrapper.run(DrillTestWrapper.java:172)
    at org.apache.drill.test.TestBuilder.go(TestBuilder.java:145)
    at org.apache.drill.exec.udfs.TestUserAgentFunctions.testParseUserAgentString(TestUserAgentFunctions.java:75)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: Code generation error - likely code error.

Fragment: 0:0

Please, refer to logs for more information.

[Error Id: 8cc8410d-3b91-43a2-96c6-ed601943ea47 on 192.168.1.26:31013]
    at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:125)
    at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
    at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
    at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
    at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
    ... 1 more
Caused by: org.apache.drill.exec.exception.ClassTransformationException: java.util.concurrent.ExecutionException: org.apache.drill.exec.exception.ClassTransformationException: org.codehaus.commons.compiler.CompileException: Line 85, Column 10: Assignment conversion not possible from type "nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect" to type "nl.basjes.parse.useragent.UserAgentAnalyzerDirect"
    at org.apache.drill.exec.compile.CodeCompiler.createInstances(CodeCompiler.java:197)
    at org.apache.drill.exec.compile.CodeCompiler.createInstance(CodeCompiler.java:163)
    at org.apache.drill.exec.ops.BaseFragmentContext.getImplementationClass(BaseFragmentContext.java:60)
    at org.apache.drill.exec.physical.impl.project.ProjectionMaterializer.generateProjector(ProjectionMaterializer.java:140)
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput(ProjectRecordBatch.java:293)
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:271)
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:92)
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:87)
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:170)
    at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:237)
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:111)
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:59)
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:87)
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:170)
    at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:237)
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103)
    at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:83)
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
    at .......(:0)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
    at .......(:0)
Caused by: java.util.concurrent.ExecutionException: org.apache.drill.exec.exception.ClassTransformationException: org.codehaus.commons.compiler.CompileException: Line 85, Column 10: Assignment conversion not possible from type "nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect" to type "nl.basjes.parse.useragent.UserAgentAnalyzerDirect"
    at org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:502)
    at org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:461)
    at org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
    at org.apache.drill.shaded.guava.com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:142)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2453)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2417)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2299)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2212)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache.get(LocalCache.java:4147)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4151)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:5140)
    at org.apache.drill.exec.compile.CodeCompiler.createInstances(CodeCompiler.java:186)
    ... 25 more
Caused by: org.apache.drill.exec.exception.ClassTransformationException: org.codehaus.commons.compiler.CompileException: Line 85, Column 10: Assignment conversion not possible from type "nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect" to type "nl.basjes.parse.useragent.UserAgentAnalyzerDirect"
    at org.apache.drill.exec.compile.ClassBuilder.getImplementationClass(ClassBuilder.java:117)
    at org.apache.drill.exec.compile.CodeCompiler$CodeGenCompiler.compile(CodeCompiler.java:73)
    at org.apache.drill.exec.compile.CodeCompiler.makeClass(CodeCompiler.java:229)
    at org.apache.drill.exec.compile.CodeCompiler.access$300(CodeCompiler.java:41)
    at org.apache.drill.exec.compile.CodeCompiler$Loader.load(CodeCompiler.java:212)
    at org.apache.drill.exec.compile.CodeCompiler$Loader.load(CodeCompiler.java:209)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3708)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2416)
    ... 31 more
Caused by: java.lang.Exception: Line 85, Column 10: Assignment conversion not possible from type "nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect" to type "nl.basjes.parse.useragent.UserAgentAnalyzerDirect"
    at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12211)
    at org.codehaus.janino.UnitCompiler.assignmentConversion(UnitCompiler.java:11062)
    at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3790)
    at org.codehaus.janino.UnitCompiler.access$6100(UnitCompiler.java:215)
    at org.codehaus.janino.UnitCompiler$13.visitAssignment(UnitCompiler.java:3754)
    at org.codehaus.janino.UnitCompiler$13.visitAssignment(UnitCompiler.java:3734)
    at org.codehaus.janino.Java$Assignment.accept(Java.java:4477)
    at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3734)
    at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2360)
    at org.codehaus.janino.UnitCompiler.access$1800(UnitCompiler.java:215)
    at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1494)
    at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1487)
    at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2874)

Component where the bug happens [ ] Core analyzer [X] UDF : Drill [ ] Other

To Reproduce Steps or code fragment to reproduce the behavior:

  1. Update the pom.xml file to use the latest version and attempt to build.
  2. '...'

Expected behavior The UDF should work.

nielsbasjes commented 4 years ago

Hi Charles, Thanks for bringing this up. I had a close look at the stacktraces you provided and I also had a look at the dependencies of yauaa.

Running mvn dependency:tree -Dincludes=commons-logging reported nothing for the core yauaa library. The only matches were these transitive dependencies coming from drill-java-exec (1.17.0):

[INFO] ---------------< nl.basjes.parse.useragent:yauaa-drill >----------------
[INFO] Building Yauaa : UDF : Apache Drill : Function 5.16-SNAPSHOT     [11/26]
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ yauaa-drill ---
[INFO] nl.basjes.parse.useragent:yauaa-drill:jar:5.16-SNAPSHOT
[INFO] \- org.apache.drill.exec:drill-java-exec:jar:1.17.0:provided
[INFO]    \- org.apache.hadoop:hadoop-common:jar:3.2.1:provided
[INFO]       \- commons-logging:commons-logging:jar:1.1.3:provided
[INFO] 
[INFO] ------------< nl.basjes.parse.useragent:yauaa-drill-tests >-------------
[INFO] Building Yauaa : UDF : Apache Drill : Tests 5.16-SNAPSHOT        [12/26]
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ yauaa-drill-tests ---
[INFO] nl.basjes.parse.useragent:yauaa-drill-tests:jar:5.16-SNAPSHOT
[INFO] \- org.apache.drill.exec:drill-java-exec:jar:1.17.0:test
[INFO]    \- org.apache.hadoop:hadoop-common:jar:3.2.1:test
[INFO]       \- commons-logging:commons-logging:jar:1.1.3:test
[INFO] 

Looking more closely at the stacktrace you provided, I noticed that the actual error there is completely different: Assignment conversion not possible from type "nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect" to type "nl.basjes.parse.useragent.UserAgentAnalyzerDirect"

That message is accurate in itself, since AbstractUserAgentAnalyzerDirect is the superclass of UserAgentAnalyzerDirect, and a superclass value cannot be assigned to a subclass variable without a cast.

The way the Builder pattern for Yauaa is constructed was changed between 5.13 and 5.14 so that users of other languages (like Scala) can use it much more easily.

The builder now uses generics so that subclasses do not need to re-implement every builder method with an 'empty' dummy.

So it looks very likely that this change is (or is triggering) the real problem. My current hunch is that the Drill code compilation subsystem does not like this builder setup.
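
To illustrate the kind of builder setup meant here, below is a minimal sketch of a self-typed generic builder. All class and method names are hypothetical and this is not Yauaa's actual source. It shows that with full generic type information build() yields the concrete type, while through a raw (erased) builder it only yields the abstract superclass, which cannot be assigned to the subclass without a cast. Whether this is what actually happens inside Drill's code compilation is only the hunch stated above; the sketch does not confirm it.

```java
// Hypothetical names throughout; a sketch, not Yauaa's real classes.
public class BuilderSketch {

    /** Stands in for an abstract base analyzer. */
    static abstract class AbstractAnalyzer {
        boolean verbose;
        abstract String describe();
    }

    /** Stands in for the concrete analyzer users normally hold. */
    static class ConcreteAnalyzer extends AbstractAnalyzer {
        @Override
        String describe() { return "ConcreteAnalyzer(verbose=" + verbose + ")"; }
    }

    /**
     * Self-typed builder: A is the analyzer type being built and B is the
     * concrete builder type, so setters defined once here keep returning B.
     */
    static abstract class AbstractBuilder<A extends AbstractAnalyzer,
                                          B extends AbstractBuilder<A, B>> {
        protected final A analyzer;

        protected AbstractBuilder(A analyzer) { this.analyzer = analyzer; }

        @SuppressWarnings("unchecked")
        B verbose(boolean value) {
            analyzer.verbose = value;
            return (B) this;
        }

        A build() { return analyzer; }
    }

    /** The subclass builder needs no dummy overrides of the setters. */
    static class ConcreteBuilder
            extends AbstractBuilder<ConcreteAnalyzer, ConcreteBuilder> {
        ConcreteBuilder() { super(new ConcreteAnalyzer()); }
    }

    public static void main(String[] args) {
        // With generics intact, build() returns ConcreteAnalyzer.
        ConcreteAnalyzer typed = new ConcreteBuilder().verbose(true).build();
        System.out.println(typed.describe());

        // Through a raw builder, build() is erased to its bound, AbstractAnalyzer.
        @SuppressWarnings("rawtypes")
        AbstractBuilder raw = new ConcreteBuilder();
        AbstractAnalyzer erased = raw.build();
        // ConcreteAnalyzer broken = raw.build();   // would not compile without a cast
        ConcreteAnalyzer viaCast = (ConcreteAnalyzer) raw.build();
        System.out.println(erased == viaCast);
    }
}
```

In this sketch an explicit cast is enough to get the concrete type back from the erased call; that is only one possible way around such an error.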

I'll take a closer look to see if I can find what is going wrong.

nielsbasjes commented 4 years ago

I notice in the Drill code that you are using the UserAgentAnalyzerDirect (which does not do caching) instead of the UserAgentAnalyzer (which does caching). Is there a reason for that? Perhaps Drill is already doing caching because this function is deterministic (i.e. always the same result)?
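
For readers unfamiliar with the distinction, here is a rough sketch of what "direct" versus "caching" means for a deterministic parse function. The names (CachingParser, the lambda standing in for the direct analyzer) are hypothetical and this is not how Yauaa implements its cache; it only shows why repeated input can skip the expensive work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names; a sketch of the idea, not Yauaa's implementation.
public class CachingSketch {

    /** Wraps a deterministic "direct" parser with a small LRU cache. */
    static class CachingParser implements Function<String, String> {
        private final Function<String, String> direct;
        private final Map<String, String> cache;
        int misses = 0; // number of times the direct parser was invoked

        CachingParser(Function<String, String> direct, int maxEntries) {
            this.direct = direct;
            // An access-ordered LinkedHashMap evicts the least recently used entry.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        @Override
        public synchronized String apply(String userAgent) {
            String cached = cache.get(userAgent);
            if (cached == null) {
                misses++;
                cached = direct.apply(userAgent);
                cache.put(userAgent, cached);
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        // Stand-in for an expensive, deterministic user agent parse.
        Function<String, String> direct =
                ua -> ua.contains("Firefox") ? "Firefox" : "Unknown";

        CachingParser cached = new CachingParser(direct, 2);
        cached.apply("Mozilla/5.0 Firefox/77.0");
        cached.apply("Mozilla/5.0 Firefox/77.0"); // served from the cache
        System.out.println("misses = " + cached.misses);
    }
}
```

If the calling engine already memoizes deterministic functions, a second cache like this adds memory use without saving work, which is the trade-off the question above is about.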

nielsbasjes commented 4 years ago

I have found a workaround. I'm currently releasing a new Yauaa version. After that I'll put up a pull request for Drill.

nielsbasjes commented 4 years ago

@cgivre I just opened a pull request for you to review https://github.com/apache/drill/pull/2044

nielsbasjes commented 4 years ago

Closing this as the pull request seems to be an accepted solution.