Open RunshengSong opened 4 years ago
Hi @RunshengSong, to use ProtobufResponseRowDeserializer with pyspark-sdk, note that the constructor accepts a StructType instead of a string: https://github.com/aws/sagemaker-spark/blob/master/sagemaker-pyspark-sdk/src/sagemaker_pyspark/transformation/deserializers/deserializers.py#L30
You will need to build a StructType (https://spark.apache.org/docs/1.1.1/api/python/pyspark.sql.StructType-class.html) that contains the feature column field and pass it to the ProtobufResponseRowDeserializer constructor.
Hi @ChuyangDeng, thanks for the reply. I understand that I need to pass a StructType as the schema to ProtobufResponseRowDeserializer, which is already the case in the code I provided above.
However, the problem I was asking about is the protobufKeys attribute. When I don't pass this parameter, I get an NPE when I display the DataFrame of prediction output.
What is the correct type for the protobufKeys attribute?
Thanks again.
Describe the problem
I have the following PySpark code, which tries to construct a SageMakerEstimator for a Random Cut Forest image:
When I run this code using PySpark, I get the following error:
The problem is in the ProtobufResponseRowDeserializer. According to the Scala source code for this class, it should accept a Seq. What is the correct counterpart in PySpark? Apparently it doesn't accept a list of strings.
I tried searching the sagemaker-spark-sdk and couldn't find any reference there.