Closed jarutis closed 9 years ago
Hi Jonas,
Cubert does not currently support Hive-style partitioned folder organization. If you wish to read all data, you can try:
load "/path/to/avro/daily/year=*/month=*/day=*/country=*" using AVRO;
Hope that helps.
Best, -Maneesh
On Sun, Dec 7, 2014 at 7:30 AM, Jonas Jarutis notifications@github.com wrote:
Hi,
I've tried loading Avro files with the following structure:
/path/to/avro/daily/year=2014/month=12/day=05/country=de/de-r-00000.avro
Using the following script:
JOB "job1"
    REDUCERS 50;
    MAP {
        input = LOAD "/path/to/avro" USING AVRO;
    }
    ...
END
But I get the following error:
[Dependency Analyzer] Program inputs: [/path/to/avro]
Cannot compile cubert script. Exiting.
java.lang.RuntimeException: java.io.IOException: there are no files in /path/to/avro
    at com.linkedin.cubert.analyzer.physical.DependencyAnalyzer.exitProgram(DependencyAnalyzer.java:277)
    at com.linkedin.cubert.analyzer.physical.PhysicalPlanWalker.walk(PhysicalPlanWalker.java:75)
    at com.linkedin.cubert.analyzer.physical.DependencyAnalyzer.rewrite(DependencyAnalyzer.java:91)
    at com.linkedin.cubert.ScriptExecutor.rewrite(ScriptExecutor.java:319)
    at com.linkedin.cubert.ScriptExecutor.main(ScriptExecutor.java:481)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: there are no files in /path/to/avro
    at com.linkedin.cubert.utils.AvroUtils.getSchema(AvroUtils.java:71)
    at com.linkedin.cubert.io.avro.AvroStorage.getPostCondition(AvroStorage.java:109)
    at com.linkedin.cubert.analyzer.physical.DependencyAnalyzer.getPostCondition(DependencyAnalyzer.java:309)
    at com.linkedin.cubert.analyzer.physical.DependencyAnalyzer.exitProgram(DependencyAnalyzer.java:262)
    ... 9 more
Everything works perfectly fine if I load the de-r-00000.avro file directly, but not if I point to the directory with partitions.
— Reply to this email directly or view it on GitHub https://github.com/linkedin/Cubert/issues/4.
This works, thanks.
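For anyone who finds this later, here is a rough local-filesystem sketch of why a wildcard over each partition level helps. It is not Cubert code. It assumes the loader only looks at files sitting directly inside the path it is given, which is consistent with the "there are no files" error above but is not confirmed from the Cubert source. The helper names (build_layout, files_directly_in, files_matching_partitions, demo) are made up for illustration, and Python's glob stands in for Hadoop's path globbing.

```python
import glob
import os
import tempfile


def build_layout(root):
    """Create a tiny Hive-style partitioned tree holding one empty placeholder file."""
    leaf = os.path.join(root, "daily", "year=2014", "month=12", "day=05", "country=de")
    os.makedirs(leaf)
    path = os.path.join(leaf, "de-r-00000.avro")
    open(path, "w").close()
    return path


def files_directly_in(directory):
    """Return regular files that sit directly inside `directory` (no recursion)."""
    entries = glob.glob(os.path.join(directory, "*"))
    return sorted(p for p in entries if os.path.isfile(p))


def files_matching_partitions(root):
    """Return files found by putting a wildcard at every partition level."""
    pattern = os.path.join(
        root, "daily", "year=*", "month=*", "day=*", "country=*", "*.avro"
    )
    return sorted(glob.glob(pattern))


def demo():
    """Build the layout in a temp dir and compare the two lookups."""
    with tempfile.TemporaryDirectory() as root:
        expected = build_layout(root)
        return files_directly_in(root), files_matching_partitions(root), expected


if __name__ == "__main__":
    direct, matched, _ = demo()
    print("files directly in root:", len(direct))
    print("files matched through partition wildcards:", len(matched))
```

The top-level directory contains only subdirectories, so a non-recursive lookup finds nothing, while a wildcard per partition level reaches the leaf files. To narrow the read to a subset, the same idea should apply by replacing a wildcard with a concrete value such as year=2014, though that has not been tried against Cubert here.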