RumbleDB / rumble

⛈️ RumbleDB 1.22.0 "Pyrenean oak" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
http://rumbledb.org/

Problem on Windows reading multiple files at once. #1243

Open misodle opened 1 year ago

misodle commented 1 year ago

I am trying to read and process multiple JSON files in a directory. If I reference just 1 file, I can echo it back OK.

This code, which references 1 file, works fine:

let $doc := json-file("file:///C:/repos/Content1Utils/resultset/test/result-set-1682362680670.json")
let $result := {
    "gtinList":
        for $row in $doc
        for $i in $row.items[]
        return { "gtin" : $i.item.gtin }
}
return $result

I run it with this command:

java -jar -Xmx1g rumbledb-1.21.0-standalone.jar --materialization-cap 1000 run ExampleQuery11.txt

This example is from the RumbleDB website, which says it can read a directory:

for $my-json in json-file("/absolute/directory/file-*.json")
where $my-json.property eq "some value"
return $my-json

My code:

for $my-json in json-file("file:///C:/repos/Content1Utils/resultset/test/result*.json")
where $my-json.items[].objId eq "712652354"
return $my-json

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1218)
    at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1423)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:601)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:761)

This is on Windows 10.

I have tried many variations and can't seem to read more than 1 file at a time.
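
The stack trace fails inside Hadoop's native Windows I/O layer while the directory is being listed, so it may help to rule out the wildcard itself first. The sketch below is only a sanity check under some assumptions: it uses plain java.nio (no Hadoop or RumbleDB code), the class and method names GlobCheck and matchingFiles are made up for illustration, and java.nio glob rules are not guaranteed to be identical to the ones RumbleDB applies. It lists the files in a directory whose names match a pattern such as result*.json.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of RumbleDB: checks which files a glob matches.
public class GlobCheck {

    // Returns the entries of `dir` whose file names match `glob`, sorted by path.
    public static List<Path> matchingFiles(Path dir, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        // Files.newDirectoryStream applies the glob to each entry's file name.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                matches.add(entry);
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Defaults are the directory and pattern from the failing query above.
        Path dir = Paths.get(args.length > 0 ? args[0] : "C:/repos/Content1Utils/resultset/test");
        String glob = args.length > 1 ? args[1] : "result*.json";

        List<Path> matches = matchingFiles(dir, glob);
        System.out.println(matches.size() + " file(s) match " + glob + " in " + dir);
        for (Path p : matches) {
            System.out.println("  " + p);
        }
    }
}
```

If this lists the expected files, the pattern and the directory permissions are probably fine, and the failure is more likely specific to the Hadoop file listing step shown in the stack trace.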