twitter / scalding

A Scala API for Cascading
http://twitter.com/scalding
Apache License 2.0

JobTest and Bijections don't seem to play nice #334

Open mikegagnon opened 11 years ago

mikegagnon commented 11 years ago

This code causes a ClassCastException: https://github.com/mikegagnon/scalding-commons/compare/develop...commutative_writeIncrement

> sbt
> test-only *Version*
[info] Compiling 1 Scala source to /Users/mikeg/workspace/scalding-commons/target/scala-2.9.2/test-classes...
[warn] there were 1 deprecation warnings; re-run with -deprecation for details
[warn] one warning found
13/02/28 14:51:36 INFO property.AppProps: using app.id: 70639C532E75CD10FF1BD1B58FD8BA0C
13/02/28 14:51:36 INFO util.Version: Concurrent, Inc - Cascading 2.0.2
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...] starting
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...]  source: MemoryTap["TextDelimited[['key', 'value']]"]["0.35060793756371833"]"]
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...]  sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.2909223830014911"]"]
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...]  parallel execution is enabled: true
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...]  starting jobs: 1
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...]  allocating threads: 1
13/02/28 14:51:36 INFO flow.FlowStep: [com.twitter.scalding.c...] starting step: local
13/02/28 14:51:36 ERROR stream.TrapHandler: caught Throwable, no trap available, rethrowing
cascading.pipe.OperatorException: [class com.twitter.scal...][com.twitter.scalding.RichPipe.each(RichPipe.scala:353)] operator Each failed executing operation
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:94)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:38)
    at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
    at cascading.flow.stream.SourceStage.call(SourceStage.java:53)
    at cascading.flow.stream.SourceStage.call(SourceStage.java:38)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
    at com.twitter.bijection.NumericBijections$$anonfun$32.apply(NumericBijections.scala:82)
    at com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
    at com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
    at com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
    at com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:9)
    at com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:8)
    at com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
    at com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
    at com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
    at com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:101)
    at com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:100)
    at com.twitter.scalding.MapFunction.operate(Operations.scala:70)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)
    ... 9 more
13/02/28 14:51:36 ERROR stream.SourceStage: caught throwable
cascading.pipe.OperatorException: [class com.twitter.scal...][com.twitter.scalding.RichPipe.each(RichPipe.scala:353)] operator Each failed executing operation
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:94)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:38)
    at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
    at cascading.flow.stream.SourceStage.call(SourceStage.java:53)
    at cascading.flow.stream.SourceStage.call(SourceStage.java:38)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
    at com.twitter.bijection.NumericBijections$$anonfun$32.apply(NumericBijections.scala:82)
    at com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
    at com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
    at com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
    at com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:9)
    at com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:8)
    at com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
    at com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
    at com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
    at com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:101)
    at com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:100)
    at com.twitter.scalding.MapFunction.operate(Operations.scala:70)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)
    ... 9 more
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...] stopping all jobs
13/02/28 14:51:36 INFO flow.FlowStep: [com.twitter.scalding.c...] stopping: local
13/02/28 14:51:36 INFO flow.Flow: [com.twitter.scalding.c...] stopped all jobs
[error] x A VersionedKeyValSourceSpec should
[error]   x not experience ClassCastException: java.lang.Integer cannot be cast to [B
[error]     local step failed (FlowStepJob.java:191)
[error]     cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:191)
[error]     cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137)
[error]     cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122)
[error]     cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42)
[error]     java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
[error]     java.util.concurrent.FutureTask.run(FutureTask.java:166)
[error]     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
[error]     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
[error]     java.lang.Thread.run(Thread.java:722)
[error]     [class com.twitter.scal...][com.twitter.scalding.RichPipe.each(RichPipe.scala:353)] operator Each failed executing operation (FunctionEachStage.java:94)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:94)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:38)
[error]     cascading.flow.stream.SourceStage.map(SourceStage.java:102)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:53)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:38)
[error]     java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
[error]     java.util.concurrent.FutureTask.run(FutureTask.java:166)
[error]     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
[error]     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
[error]     java.lang.Thread.run(Thread.java:722)
[error]     java.lang.Integer cannot be cast to [B (NumericBijections.scala:82)
[error]     com.twitter.bijection.NumericBijections$$anonfun$32.apply(NumericBijections.scala:82)
[error]     com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
[error]     com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
[error]     com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
[error]     com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:9)
[error]     com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:8)
[error]     com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
[error]     com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
[error]     com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
[error]     com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:101)
[error]     com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:100)
[error]     com.twitter.scalding.MapFunction.operate(Operations.scala:70)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:38)
[error]     cascading.flow.stream.SourceStage.map(SourceStage.java:102)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:53)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:38)
[error]     java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
[error]     java.util.concurrent.FutureTask.run(FutureTask.java:166)
[error]     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
[error]     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
[error]     java.lang.Thread.run(Thread.java:722)
[error]     java.lang.Integer cannot be cast to [B (NumericBijections.scala:82)
[error]     com.twitter.bijection.NumericBijections$$anonfun$32.apply(NumericBijections.scala:82)
[error]     com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
[error]     com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
[error]     com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
[error]     com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:9)
[error]     com.twitter.bijection.GeneratedTupleBijections$$anonfun$tuple2$2.apply(GeneratedTupleBijections.scala:8)
[error]     com.twitter.bijection.Bijection$$anon$2$$anon$3.apply(Bijection.scala:95)
[error]     com.twitter.bijection.Bijection$class.invert(Bijection.scala:31)
[error]     com.twitter.bijection.Bijection$$anon$2.invert(Bijection.scala:92)
[error]     com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:101)
[error]     com.twitter.scalding.commons.source.VersionedKeyValSource$$anonfun$transformForRead$3.apply(VersionedKeyValSource.scala:100)
[error]     com.twitter.scalding.MapFunction.operate(Operations.scala:70)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:86)
[error]     cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:38)
[error]     cascading.flow.stream.SourceStage.map(SourceStage.java:102)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:53)
[error]     cascading.flow.stream.SourceStage.call(SourceStage.java:38)
[error]     java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
[error]     java.util.concurrent.FutureTask.run(FutureTask.java:166)
[error]     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
[error]     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
[error]     java.lang.Thread.run(Thread.java:722)
[error] Error: Total 2, Failed 1, Errors 1, Passed 0, Skipped 0
[error] Error during tests:
[error]     com.twitter.scalding.commons.source.VersionedKeyValSourceSpec
[error] {file:/Users/mikeg/workspace/scalding-commons/}default-99d506/test:test-only: Tests unsuccessful
[error] Total time: 10 s, completed Feb 28, 2013 2:51:36 PM

sritchie commented 11 years ago

@azymnis, it looks like the mocked data is being injected underneath the call to transformForRead instead of above it. If you mock a source, I think it makes sense NOT to apply transformForRead to the mocked data. What do you think?
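
To illustrate the failure mode described above, here is a minimal sketch in plain Scala. It does not use the real bijection or scalding-commons APIs: `IntBytes`, `MockOrderSketch`, and the simplified `transformForRead` are hypothetical stand-ins, and the only assumption is that the read-side transform casts an untyped tuple value to `Array[Byte]` before inverting it, which is consistent with the `java.lang.Integer cannot be cast to [B` error in the trace.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for an Int <-> Array[Byte] bijection.
// This is NOT the com.twitter.bijection API, only an illustration.
object IntBytes {
  def apply(i: Int): Array[Byte] = ByteBuffer.allocate(4).putInt(i).array()
  def invert(b: Array[Byte]): Int = ByteBuffer.wrap(b).getInt
}

object MockOrderSketch {
  // Tuple values are untyped, so the read-side transform has to cast
  // before it can invert the serialized bytes. (Simplified, hypothetical.)
  def transformForRead(raw: Any): Int =
    IntBytes.invert(raw.asInstanceOf[Array[Byte]])

  // What the source holds on disk: serialized bytes.
  val stored: Any = IntBytes(42)

  // What a test supplies as mock input: an already-decoded value.
  val mocked: Any = 42

  // Normal read path: bytes go in, the transform decodes them.
  def readStored: Int = transformForRead(stored)

  // Mock injected underneath the transform: the decoded Int is cast
  // to Array[Byte], which throws ClassCastException.
  def readMockedBeneath: Either[String, Int] =
    try Right(transformForRead(mocked))
    catch { case e: ClassCastException => Left(e.getMessage) }

  // Mock injected above the transform: the transform is skipped and
  // the test data is used as-is.
  def readMockedAbove: Int = mocked.asInstanceOf[Int]
}
```

In this sketch, `readMockedBeneath` returns a `Left` carrying the cast error, while `readStored` and `readMockedAbove` both succeed. Whether skipping the transform for mocked sources is the right fix for the real source is the open question in this thread.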

johnynek commented 11 years ago

This was actually a bug in the VersionedKeyValSource spec that was fixed in 0.1.4.

I had to upgrade the other day.

Can you check this again?