apesaDS closed 3 weeks ago
This will be partially fixed by https://github.com/scylladb/scylla-migrator/tree/spark-3.1.1. It adds https://github.com/scylladb/spark-dynamodb/commit/978c0cd7e49f99b976d0192db0a027d573f83991, which has inferSchema=false, but that option would also need to be exposed in the migrator for the above to work.
Thx, what would it take to expose this in the migrator?
@apesaDS @tarzanek What is the current status here? This seems to be taking quite some time. Do we have an ETA? Thanks, guys!
@apesaDS @tarzanek Do we have any progress/updates here? Thanks!
cc: @DoronArazii
The plan is to fix it with the next Spark 3 version, which lifts this limit. For now, increasing the limit is the workaround.
Btw, the quick (and dirty) patch in the spark-dynamodb module is:
diff --git a/src/main/scala/com/audienceproject/spark/dynamodb/datasource/DynamoDataSourceReader.scala b/src/main/scala/com/audienceproject/spark/dynamodb/datasource/DynamoDataSourceReader.scala
index fdf6e27..6052595 100644
--- a/src/main/scala/com/audienceproject/spark/dynamodb/datasource/DynamoDataSourceReader.scala
+++ b/src/main/scala/com/audienceproject/spark/dynamodb/datasource/DynamoDataSourceReader.scala
@@ -94,7 +94,7 @@ class DynamoDataSourceReader(parallelism: Int,
})
val typeSeq = typeMapping.map({ case (name, sparkType) => StructField(name, sparkType) }).toSeq
- if (typeSeq.size > 100) throw new RuntimeException("Schema inference not possible, too many attributes in table.")
+ if (typeSeq.size > 150) throw new RuntimeException("Schema inference not possible, too many attributes in table.")
StructType(typeSeq)
}
FWIW, after applying the above I was able to reproduce the issue with converting BigDecimal to Decimal, but I don't have a fix for that yet.
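For illustration only, the hard-coded limit in the patch above could be turned into a parameter so the workaround does not require editing the constant again. This is a minimal sketch, not the actual spark-dynamodb code: `SchemaLimitSketch`, `checkAttributeCount`, and `maxAttributes` are hypothetical names, and plain attribute names stand in for the `StructField` sequence so the example runs without Spark on the classpath.

```scala
// Hypothetical sketch: names below are illustrative and do not exist in spark-dynamodb.
object SchemaLimitSketch {
  // Default mirrors the hard-coded value in the original check.
  val DefaultMaxAttributes = 100

  /** Returns the attribute names unchanged if they fit within the limit,
    * otherwise throws the same kind of exception as the original check. */
  def checkAttributeCount(attributeNames: Seq[String],
                          maxAttributes: Int = DefaultMaxAttributes): Seq[String] = {
    if (attributeNames.size > maxAttributes)
      throw new RuntimeException(
        s"Schema inference not possible, too many attributes in table " +
          s"(${attributeNames.size} > $maxAttributes).")
    attributeNames
  }

  def main(args: Array[String]): Unit = {
    // A table with 120 attributes, i.e. above the default limit of 100.
    val attrs = (1 to 120).map(i => s"attr$i")
    val rejected = scala.util.Try(checkAttributeCount(attrs)).isFailure
    val accepted = scala.util.Try(checkAttributeCount(attrs, maxAttributes = 150)).isSuccess
    println(s"default limit rejects 120 attributes: $rejected")
    println(s"limit of 150 accepts 120 attributes: $accepted")
  }
}
```

Raising the limit this way only gets past the schema inference check; it does not address the BigDecimal conversion problem mentioned above.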
I believe this issue has been fixed since we don’t infer the table schema anymore. Please let me know if you are still blocked.
When migrating from DynamoDB to Scylla Alternator, the Migrator fails to create the table in Scylla if the DynamoDB table has more than 100 attributes.

In DynamoDataSourceReader.scala, line 97:

if (typeSeq.size > 100) throw new RuntimeException("Schema inference not possible, too many attributes in table.")

I changed the condition to typeSeq.size > 150. It gets past this exception but still fails to create the table I need in Scylla. It seems to fail in NameTools.scala, in this case statement:

case Some(KeyspaceSuggestions(keyspaces)) => s"Couldn't find table $table in $keyspace - Found similar keyspaces with that table:\n${keyspaces.map(k => s"$k.$table").mkString("\n")}"