RumbleDB / rumble

⛈️ RumbleDB 1.22.0 "Pyrenean oak" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
http://rumbledb.org/

Stage failure #18

Closed by ghislainfourny 5 years ago

ghislainfourny commented 6 years ago

This query (from the JSONiq tutorial) fails. It is worth investigating, but it may be complex to solve: it may come down to the nested for, which we can discuss.

let $stores := [
  { "store number" : 1, "state" : "MA" },
  { "store number" : 2, "state" : "MA" },
  { "store number" : 3, "state" : "CA" },
  { "store number" : 4, "state" : "CA" }
]
let $sales := [
  { "product" : "broiler", "store number" : 1, "quantity" : 20 },
  { "product" : "toaster", "store number" : 2, "quantity" : 100 },
  { "product" : "toaster", "store number" : 2, "quantity" : 50 },
  { "product" : "toaster", "store number" : 3, "quantity" : 50 },
  { "product" : "blender", "store number" : 3, "quantity" : 100 },
  { "product" : "blender", "store number" : 3, "quantity" : 150 },
  { "product" : "socks", "store number" : 1, "quantity" : 500 },
  { "product" : "socks", "store number" : 2, "quantity" : 10 },
  { "product" : "shirt", "store number" : 3, "quantity" : 10 }
]
let $join :=
  for $store in $stores[], $sale in $sales[]
  where $store."store number" eq $sale."store number"
  return {
    "nb" : $store."store number",
    "state" : $store.state,
    "sold" : $sale.product
  }
return [$join]
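For reference, the nested for with its where clause amounts to a nested-loop join. Below is a rough Java sketch of what the query is expected to compute, not how the engine evaluates it. The class name, the `join` helper, and the array-based row layout are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; none of these names come from the project.
public class NestedForJoinSketch {
    // Each store row is { storeNumber, state }.
    static final Object[][] STORES = {
        {1, "MA"}, {2, "MA"}, {3, "CA"}, {4, "CA"}
    };

    // Each sale row is { product, storeNumber, quantity }.
    static final Object[][] SALES = {
        {"broiler", 1, 20}, {"toaster", 2, 100}, {"toaster", 2, 50},
        {"toaster", 3, 50}, {"blender", 3, 100}, {"blender", 3, 150},
        {"socks", 1, 500}, {"socks", 2, 10}, {"shirt", 3, 10}
    };

    // Returns one "nb,state,sold" string per matching (store, sale) pair.
    static List<String> join(Object[][] stores, Object[][] sales) {
        List<String> result = new ArrayList<>();
        for (Object[] store : stores) {            // for $store in $stores[]
            for (Object[] sale : sales) {          // $sale in $sales[]
                if (store[0].equals(sale[1])) {    // where ... eq ...
                    result.add(store[0] + "," + store[1] + "," + sale[0]);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String row : join(STORES, SALES)) {
            System.out.println(row);
        }
    }
}
```

Store 4 has no sales, so it contributes no rows to the result.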

wscsprint3r commented 6 years ago

The same thing happens when running the NestsVars.iq test file on a real cluster. Nesting might be broken completely.

[Screenshot from 2018-02-05 18-03-46]

wscsprint3r commented 6 years ago

org.apache.spark.SparkException: Job aborted due to stage failure: Failed to serialize task 3, not attempting to retry it.
Exception during serialization: java.io.NotSerializableException: sparksoniq.jsoniq.item.IntegerItem
Serialization stack:
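To illustrate the kind of error in this trace: Java serialization throws `java.io.NotSerializableException` for any object whose class does not implement `java.io.Serializable`. Below is a minimal sketch of that behaviour outside Spark. The two item classes are hypothetical stand-ins, not the actual `sparksoniq.jsoniq.item.IntegerItem`, and this does not claim to be the fix that was applied in the project.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-alone sketch; the classes below are illustrative only.
public class SerializationSketch {
    // Stand-in for an item class that does NOT implement Serializable.
    static class PlainIntegerItem {
        final int value;
        PlainIntegerItem(int value) { this.value = value; }
    }

    // Same shape, but marked Serializable.
    static class SerializableIntegerItem implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;
        SerializableIntegerItem(int value) { this.value = value; }
    }

    // Tries to write the object with plain Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Same exception type as in the stack trace above.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PlainIntegerItem(1)));
        System.out.println(canSerialize(new SerializableIntegerItem(1)));
    }
}
```

If an item of the first kind is captured by a task that has to be shipped to an executor, a failure like the one above would be the expected result.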

CanBerker commented 5 years ago

Query now succeeds and returns:

[
  { "nb" : 1, "state" : "MA", "sold" : "broiler" },
  { "nb" : 1, "state" : "MA", "sold" : "socks" },
  { "nb" : 2, "state" : "MA", "sold" : "toaster" },
  { "nb" : 2, "state" : "MA", "sold" : "toaster" },
  { "nb" : 2, "state" : "MA", "sold" : "socks" },
  { "nb" : 3, "state" : "CA", "sold" : "toaster" },
  { "nb" : 3, "state" : "CA", "sold" : "blender" },
  { "nb" : 3, "state" : "CA", "sold" : "blender" },
  { "nb" : 3, "state" : "CA", "sold" : "shirt" }
]

ghislainfourny commented 5 years ago

Marvellous! So good to see so many issues solved. You did a great job with the local FLWORs.