RumbleDB / rumble

⛈️ RumbleDB 1.22.0 "Pyrenean oak" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
http://rumbledb.org/

Refactor Iterator Materialization #98

Closed CanBerker closed 3 years ago

CanBerker commented 6 years ago

Originally posted by @ghislainfourny in https://github.com/Sparksoniq/sparksoniq/pull/85/files/c1e80ea7bcd7ee19f4ad0f7a9d67e93972746e1a

CanBerker commented 5 years ago

@ghislainfourny, this issue was created at your request to refactor iterator materialization in the top-level iterator class. The comment above gives the full details. In the current implementation, this function lives in RuntimeIterator:

https://github.com/Sparksoniq/sparksoniq/blob/12210e01129cfd6e3fad03abd71f5c18e11ac1b9/src/main/java/sparksoniq/jsoniq/runtime/iterator/RuntimeIterator.java#L117

What kind of change did you have in mind? Could you please elaborate?

ghislainfourny commented 5 years ago

I think what I had in mind was adding a public method to the Iterator class that materializes an iterator:

List materialize(DynamicContext context)

This would simply make this functionality official and clean.

This function should ideally return an error if the iterator is already open (for this we may need to add an isOpen() method and a private field that keeps track of the "openness" of the iterator).
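A minimal sketch of what this proposal could look like, for illustration only. The names SketchIterator, ListBackedIterator, and the empty DynamicContext stand-in are hypothetical and are not the project's actual classes. The generic return type List<T> and the choice of IllegalStateException as the error are also assumptions; the thread only specifies the method name, its DynamicContext parameter, the isOpen() method, and a private "openness" field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the project's dynamic context; left empty for this sketch.
class DynamicContext {}

// Hypothetical minimal iterator base class (not the real RuntimeIterator).
abstract class SketchIterator<T> {
    // Private field tracking whether the iterator is currently open.
    private boolean open = false;

    public boolean isOpen() {
        return open;
    }

    public void open(DynamicContext context) {
        if (open) {
            throw new IllegalStateException("Iterator is already open");
        }
        open = true;
    }

    public void close() {
        open = false;
    }

    public abstract boolean hasNext();

    public abstract T next();

    // Opens the iterator, drains it into a list, and closes it again.
    // Refuses to run on an iterator that is already open.
    public List<T> materialize(DynamicContext context) {
        if (isOpen()) {
            throw new IllegalStateException("Cannot materialize an open iterator");
        }
        List<T> result = new ArrayList<>();
        open(context);
        try {
            while (hasNext()) {
                result.add(next());
            }
        } finally {
            close(); // always leave the iterator closed, even if next() fails
        }
        return result;
    }
}

// Hypothetical concrete iterator over an in-memory list, used to exercise the sketch.
class ListBackedIterator<T> extends SketchIterator<T> {
    private final List<T> items;
    private Iterator<T> cursor;

    ListBackedIterator(List<T> items) {
        this.items = items;
    }

    @Override
    public void open(DynamicContext context) {
        super.open(context);
        cursor = items.iterator();
    }

    @Override
    public void close() {
        super.close();
        cursor = null;
    }

    @Override
    public boolean hasNext() {
        return cursor != null && cursor.hasNext();
    }

    @Override
    public T next() {
        return cursor.next();
    }
}
```

With this shape, calling materialize(new DynamicContext()) on a closed ListBackedIterator returns all of its items and leaves it closed, while calling it on an iterator that was opened manually fails fast instead of returning a partial result.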

ghislainfourny commented 3 years ago

I think this is now all addressed by the various materialization functions (materializeAtMostOneItem, materializeOneItemOrNull, etc.).