modelop / hadrian

Implementations of the Portable Format for Analytics (PFA)
Apache License 2.0
130 stars 49 forks source link

Hadrian and Titus do not behave the same #16

Closed bmwilly closed 8 years ago

bmwilly commented 8 years ago

I came across this bug when using la.dot, where one input is an array of array of ints and the other is an array of doubles. Hadrian complains that java.lang.Integer cannot be cast to java.lang.Double but Titus evaluates it successfully. As Hadrian and Titus are implementations of the same PFA specification, I believe they should behave the same.

The files necessary for a minimal reproducible example in this gist. To run it yourself (assuming you have pip installed), simply

git clone https://gist.github.com/8716b19618fee138f984289853d4a48b.git
cd 8716b19618fee138f984289853d4a48b
sh pfa.sh

which should output

Evaluating with Hadrian
Exception in thread "Hadrian-engine-0" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
    at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:119)
    at com.opendatagroup.hadrian.lib.la.package$Dot$$anonfun$applyVector$3$$anonfun$apply$7.apply(la.scala:514)
    at scala.collection.Iterator$class.exists(Iterator.scala:753)
    at scala.collection.AbstractIterator.exists(Iterator.scala:1157)
    at scala.collection.IterableLike$class.exists(IterableLike.scala:77)
    at scala.collection.AbstractIterable.exists(Iterable.scala:54)
    at com.opendatagroup.hadrian.lib.la.package$Dot$$anonfun$applyVector$3.apply(la.scala:514)
    at com.opendatagroup.hadrian.lib.la.package$Dot$$anonfun$applyVector$3.apply(la.scala:514)
    at scala.collection.Iterator$class.exists(Iterator.scala:753)
    at scala.collection.AbstractIterator.exists(Iterator.scala:1157)
    at scala.collection.IterableLike$class.exists(IterableLike.scala:77)
    at scala.collection.AbstractIterable.exists(Iterable.scala:54)
    at com.opendatagroup.hadrian.lib.la.package$Dot.applyVector(la.scala:514)
    at PFA_Engine_1$2.apply(Unknown Source)
    at PFA_Engine_1.action(Unknown Source)
    at PFA_Engine_1.action(Unknown Source)
    at com.opendatagroup.hadrian.standalone.Main$$anonfun$main$1$EngineRunnable$1.action(standalone.scala:196)
    at com.opendatagroup.hadrian.standalone.Main$$anonfun$main$1$EngineRunnable$1$$anonfun$19.apply(standalone.scala:221)
    at com.opendatagroup.hadrian.standalone.Main$$anonfun$main$1$EngineRunnable$1$$anonfun$19.apply(standalone.scala:221)
    at com.opendatagroup.hadrian.standalone.Main$$anonfun$main$1$EngineRunnable$1.run(standalone.scala:232)
    at java.lang.Thread.run(Thread.java:745)

Evaluating with Titus
[1.0, 0.0]
collinbennett commented 8 years ago

The Hadrian behavior is the correct one. The is no version of la.dot that takes an array of ints and doubles so this call should fail with the above exception.

There are a few cases that we have encountered where Python performs dynamic casting in Titus which allows a function to accept input that it strictly shouldn't. The only fix would be to go into Titus and force those conditions to break, which we do not currently want to do.

solongordon commented 8 years ago

Thanks for the response Collin. (I work with bmwilly.) I was wondering if you could expand on why you don't want to force these kinds of things to break in Titus.

Our current approach is to use Titus for model development and Hadrian for production scoring. But it sounds like we'll have to abandon that approach if we can't expect the same results from both engines. Are we misunderstanding the intended usage for Titus? Do you recommend against mixing and matching the two?

jpivarski commented 8 years ago

Collin's right that we've had problems matching Python's dynamism with Scala's strict typing, but in this case, I think Hadrian's wrong.

The first indication is the fact that we're seeing an exception that doesn't derive from PFAException. It's a ClassCastException at runtime. If passing an array of ints into la.dot were truly illegal, then there should be some PFASemanticException saying that's a type error. That's how PFA is supposed to protect users from runtime errors.

Passing an array of ints into la.dot isn't a type error because PFA arrays are covariant: array(int) is a subtype of array(double) because int is a subtype of double: PFA specification page 38. When Python dynamically converts an int into numpy.double in its implementation of la.dot, it's following the specification, doing something that was allowed by the type-check.

When Scala refuses to convert a (boxed) java.lang.Integer into a java.lang.Double, it's doing the wrong thing, something that the type-check didn't protect against. There's an asDouble function in jvmcompiler.scala to fight against JVM boxing and get everything into the right numeric type. It was applied to other cases where an array of double is needed; this one somehow got missed.

In general, you should be able to use Titus for testing and Hadrian for production, except when there's a bug.