Hello, I am trying to ingest a Spark DataFrame that contains a column of type ArrayType(FloatType,false), but the load_feature_definitions_from_schema call throws the following exception:
Py4JJavaError: An error occurred while calling o2339.loadFeatureDefinitionsFromSchema.
: software.amazon.sagemaker.featurestore.sparksdk.exceptions.ValidationError: Found unsupported data type from schema 'ArrayType(FloatType,false)' which cannot be converted to a corresponding feature type.
at software.amazon.sagemaker.featurestore.sparksdk.FeatureStoreManager.$anonfun$loadFeatureDefinitionsFromSchema$1(FeatureStoreManager.scala:143)
at scala.collection.IndexedSeqOptimized.foldLeft(IndexedSeqOptimized.scala:56)
at scala.collection.IndexedSeqOptimized.foldLeft$(IndexedSeqOptimized.scala:64)
at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:194)
at software.amazon.sagemaker.featurestore.sparksdk.FeatureStoreManager.loadFeatureDefinitionsFromSchema(FeatureStoreManager.scala:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:750)
SageMaker Feature Store supports collection data types for some feature group configurations. The column in my feature group is defined like this.
Is there a better Spark data type to use than ArrayType(FloatType), or is this simply not supported by the current implementation of this library?
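For reference, one fallback I could imagine is to serialize the array into a string so that the column reaches the library as a plain string type. This is only an assumption on my side, not something the library documents, and it would presumably give up the native collection type in the feature group. The helper names and the sample row below are hypothetical; in Spark the same idea could probably be applied with a UDF or with pyspark.sql.functions.to_json on the array column.

```python
import json
from typing import List


def encode_float_array(values: List[float]) -> str:
    """Serialize a list of floats to a JSON string (hypothetical helper)."""
    return json.dumps([float(v) for v in values])


def decode_float_array(encoded: str) -> List[float]:
    """Inverse of encode_float_array: parse the JSON string back into floats."""
    return [float(v) for v in json.loads(encoded)]


# Hypothetical record with an array-valued feature.
row = {"record_id": "r1", "embedding": [0.5, 1.25, -2.0]}

# Replace the array with its string form before ingestion.
encoded_row = {**row, "embedding": encode_float_array(row["embedding"])}

# Consumers would decode the string after reading the feature back.
restored = decode_float_array(encoded_row["embedding"])
```

The obvious downside is that every consumer has to know about the encoding, which is why I would prefer a natively supported type if one exists.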