nathanmarz / cascalog

Data processing on Hadoop without the hassle.

"Unable to resolve symbol: compare-fn" when using :trap with c/limit #288

Open erasmas opened 9 years ago

erasmas commented 9 years ago

I found an odd issue in Cascalog 2.0. When I use :trap together with an operation from cascalog.logic.ops (here c/limit), I get the following exception:

java.lang.Exception: java.lang.RuntimeException: Unable to resolve symbol: compare-fn in this context, compiling:(NO_SOURCE_PATH:191:20)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Unable to resolve symbol: compare-fn in this context, compiling:(NO_SOURCE_PATH:191:20)
    at clojure.lang.Compiler.analyze(Compiler.java:6464)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3719)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6646)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3719)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6646)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$IfExpr$Parser.parse(Compiler.java:2708)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6644)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$VectorExpr.parse(Compiler.java:3126)
    at clojure.lang.Compiler.analyze(Compiler.java:6447)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5782)
    at clojure.lang.Compiler$LetExpr$Parser.parse(Compiler.java:6100)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6644)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6632)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5782)
    at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5217)
    at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3846)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6642)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6632)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3719)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6646)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6632)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5782)
    at clojure.lang.Compiler$LetExpr$Parser.parse(Compiler.java:6100)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6644)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.analyze(Compiler.java:6406)
    at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5782)
    at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5217)
    at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3846)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6642)
    at clojure.lang.Compiler.analyze(Compiler.java:6445)
    at clojure.lang.Compiler.eval(Compiler.java:6700)
    at clojure.lang.Compiler.eval(Compiler.java:6666)
    at clojure.core$eval.invoke(core.clj:2927)
    at cascalog.logic.fn$fn__477$fn__485.invoke(fn.clj:169)
    at cascalog.logic.fn$fn__477.invoke(fn.clj:168)
    at clojure.lang.MultiFn.invoke(MultiFn.java:231)
    at cascalog.logic.fn$deserialize.invoke(fn.clj:135)
    at cascalog.Util.deserializeFn(Util.java:86)
    at cascalog.ClojureParallelAgg.prepare(ClojureParallelAgg.java:39)
    at cascalog.ClojureCombinerBase.prepare(ClojureCombinerBase.java:83)
    at cascalog.ClojureBufferCombiner.prepare(ClojureBufferCombiner.java:50)
    at cascading.flow.stream.OperatorStage.prepare(OperatorStage.java:284)
    at cascading.flow.stream.StreamGraph.prepare(StreamGraph.java:167)
    at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:113)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: Unable to resolve symbol: compare-fn in this context
    at clojure.lang.Util.runtimeException(Util.java:221)
    at clojure.lang.Compiler.resolveIn(Compiler.java:6940)
    at clojure.lang.Compiler.resolve(Compiler.java:6884)
    at clojure.lang.Compiler.analyzeSymbol(Compiler.java:6845)
    at clojure.lang.Compiler.analyze(Compiler.java:6427)
    ... 69 more

The following code reproduces the issue:

;; Assumes Cascalog 2.0 is on the classpath.
(use 'cascalog.api)
(require 'cascalog.logic.ops)

(let [word-counts [["david" "apple" 3]
                   ["david" "banana" 5]
                   ["david" "cherry" 4]
                   ["bob" "apple" 100]
                   ["bob" "bulgaria" 10]
                   ["bob" "cambodia" 23]
                   ["bob" "dominica" 12]
                   ["george" "apple" 20]
                   ["george" "france" 7]]]
  (??- (<- [?person ?word ?total]
           (word-counts ?person ?word-all ?total-all)
           ;; sort by total, descending, and keep the top row
           (:sort ?total-all)
           (:reverse true)
           ((cascalog.logic.ops/limit 1) :< ?word-all ?total-all :> ?word ?total)
           ;; adding the trap is what triggers the exception
           (:trap (hfs-textline "/tmp/test")))))

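I'll try to understand how to fix it. Right now it looks like a function serialization issue: the trace fails inside cascalog.logic.fn/deserialize (fn.clj:135), called from ClojureParallelAgg.prepare, where the deserialized function is passed to eval and the symbol compare-fn cannot be resolved.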
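To illustrate the kind of failure I suspect, here is a minimal sketch in plain Clojure. It is not Cascalog's serialization code; broken-form and fixed-form are hypothetical names, and the assumption is only that a function's source form gets re-evaluated somewhere its closed-over locals are no longer in scope.

```clojure
;; Hypothetical illustration only, not taken from Cascalog.
;; A function form that refers to a symbol which was a local
;; (e.g. a let binding) at the place where the function was created.
(def broken-form '(fn [a b] (compare-fn a b)))

;; Evaluating the bare form fails at compile time, because compare-fn
;; is neither a local nor a var in the current namespace.
(def broken-message
  (try
    (eval broken-form)
    nil
    (catch Exception e
      (.getMessage e))))

(println broken-message)

;; If the closed-over binding travels with the form, eval succeeds.
(def fixed-form '(let [compare-fn compare]
                   (fn [a b] (compare-fn a b))))

(println ((eval fixed-form) 1 2))
```

If this guess is right, the fix would be to make sure the locals captured by the limit aggregator are serialized along with its source when :trap is present.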