Open haggy opened 5 years ago
Hi haggy, thanks for raising this issue. This use case is exactly the type of functionality eel looks to provide. You would be very welcome to have a look at a fix, thanks.
Also, we'll look to get a new version out for Scala 2.12 soon.
Thanks
@garyfrost I've started looking into this bug. Just an FYI, the build is currently failing due to a single `CsvSink` test, "support overwrite":
"[998e7733-fbba-4361-96fa-f92596031e5f].csv" was not equal to "[d7238986-3d82-463b-a716-0de0e115bbce].csv"
ScalaTestFailureLocation: io.eels.component.csv.CsvSinkTest$$anonfun$1$$anonfun$apply$mcV$sp$5 at (CsvSinkTest.scala:114)
Expected :"[d7238986-3d82-463b-a716-0de0e115bbce].csv"
Actual :"[998e7733-fbba-4361-96fa-f92596031e5f].csv"
org.scalatest.exceptions.TestFailedException: "[998e7733-fbba-4361-96fa-f92596031e5f].csv" was not equal to "[d7238986-3d82-463b-a716-0de0e115bbce].csv"
at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)
at io.eels.component.csv.CsvSinkTest$$anonfun$1$$anonfun$apply$mcV$sp$5.apply(CsvSinkTest.scala:114)
at io.eels.component.csv.CsvSinkTest$$anonfun$1$$anonfun$apply$mcV$sp$5.apply(CsvSinkTest.scala:79)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.WordSpecLike$$anon$1.apply(WordSpecLike.scala:1078)
at org.scalatest.TestSuite$class.withFixture(TestSuite.scala:196)
at org.scalatest.WordSpec.withFixture(WordSpec.scala:1881)
at org.scalatest.WordSpecLike$class.invokeWithFixture$1(WordSpecLike.scala:1075)
at org.scalatest.WordSpecLike$$anonfun$runTest$1.apply(WordSpecLike.scala:1088)
at org.scalatest.WordSpecLike$$anonfun$runTest$1.apply(WordSpecLike.scala:1088)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
at org.scalatest.WordSpecLike$class.runTest(WordSpecLike.scala:1088)
at io.eels.component.csv.CsvSinkTest.org$scalatest$BeforeAndAfter$$super$runTest(CsvSinkTest.scala:15)
at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:203)
at io.eels.component.csv.CsvSinkTest.runTest(CsvSinkTest.scala:15)
It's more of a heads-up. I'm not going to dig into that, as it's a pre-existing issue and I'm not familiar with that piece of the code.
I'm trying to use `JsonSource` and write it as Parquet using `ParquetSink`. The issue I'm running into is that I get a `ClassCastException` whenever the Parquet writer encounters a `Row` created from JSON with an array of objects. From what I see, it's because the writer is trying to cast a Scala `Map` to a `Seq[Any]` in the `StructWriter`.

Here is where the exception is thrown: https://github.com/51zero/eel-sdk/blob/v1.2.4/eel-core/src/main/scala/io/eels/component/parquet/RecordConsumerWriter.scala#L105
Here is the stack trace:
Here is a fully runnable example (will need hadoop deps if you don't already have them):
`core-site.xml` file for local Hadoop dev (I put this in my resources folder in a `hadoop` subdirectory):

I do realize that this is on version `1.2.4` of eels, but there is no later version for Scala 2.12 on Maven. Can you see any reason why I'm getting this writer error? If it's an actual bug, I'd be happy to take a stab at fixing it.
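
To illustrate the kind of failure described above without pulling in eel or Hadoop, here is a minimal sketch in plain Scala. It is not the code from `RecordConsumerWriter`. The object and method names (`MapVsSeqCast`, `unsafeValues`, `safeValues`) are hypothetical, and it assumes that a JSON array of objects ends up as a sequence of Scala `Map`s, as the report suggests.

```scala
// Hypothetical, eel-free illustration; none of these names come from the eel codebase.
object MapVsSeqCast {

  // Assumption: a parsed JSON array of objects is represented as a Seq of Maps.
  val parsed: Any = Seq(
    Map("name" -> "a", "id" -> 1),
    Map("name" -> "b", "id" -> 2)
  )

  // Unchecked cast, similar in spirit to what the issue describes:
  // every struct value is assumed to be a Seq[Any].
  def unsafeValues(struct: Any): Seq[Any] = struct.asInstanceOf[Seq[Any]]

  // Defensive alternative: accept either representation of a struct.
  // A real fix would probably need to order map values by the schema's
  // field order; this sketch simply takes the map's own iteration order.
  def safeValues(struct: Any): Seq[Any] = struct match {
    case m: Map[_, _] => m.values.toList
    case s: Seq[_]    => s
    case other =>
      throw new IllegalArgumentException(
        s"Unsupported struct value: ${other.getClass.getName}")
  }

  def main(args: Array[String]): Unit = {
    val elements = parsed.asInstanceOf[Seq[Any]]
    val first = elements.head // this element is a Map, not a Seq

    // The unchecked cast throws ClassCastException for the Map element.
    val failed =
      try { unsafeValues(first); false }
      catch { case _: ClassCastException => true }

    println(s"unchecked cast failed: $failed") // prints "unchecked cast failed: true"
    println(s"pattern match values: ${safeValues(first)}")
  }
}
```

Matching on the runtime type rather than casting is only one possible direction. Whether the writer should accept a `Map` at that point, or the JSON source should produce a different structure, is a design decision for the maintainers.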