kaiko-ai / typedspark

Column-wise type annotations for pyspark DataFrames
Apache License 2.0

I don't understand how to create schema if the schema has nested fields #528

Closed Grottersha123 closed 1 week ago

Grottersha123 commented 1 week ago

I have a schema like this, and I couldn't find any examples for it in the documentation.

from pyspark.sql.types import (
    ArrayType, BooleanType, IntegerType, StringType, StructField, StructType,
)

schema = StructType([
    StructField('name', StringType(), True),
    # Nested struct: {id, name}
    StructField('type', StructType([
        StructField('id', IntegerType(), True),
        StructField('name', StringType(), True),
    ]), True),
    # Array of structs, each of which contains further nested structs
    StructField('employmentTeamRelations', ArrayType(
        StructType([
            StructField('id', IntegerType(), True),
            StructField('employment', StructType([
                StructField('id', StringType(), True),
                StructField('login', StringType(), True),
                StructField('employeeCard', StructType([
                    StructField('id', IntegerType(), True),
                ]), True),
                StructField('linearCostCenter', StructType([
                    StructField('code', StringType(), True),
                ]), True),
            ]), True),
            StructField('role', StructType([
                StructField('id', IntegerType(), True),
                StructField('name', StringType(), True),
            ]), True),
            StructField('profile', StructType([
                StructField('id', IntegerType(), True),
                StructField('name', StringType(), True),
            ]), True),
            StructField('startDate', StringType(), True),
            StructField('endDate', StringType(), True),
            StructField('actual', BooleanType(), True),
            StructField('budgetSource', StructType([
                StructField('id', IntegerType(), True),
                StructField('name', StringType(), True),
            ]), True),
            StructField('teamCostCenter', StructType([
                StructField('id', StringType(), True),
            ]), True),
        ]),
    ), True),
])

nanne-aben commented 1 week ago

Hi @Grottersha123! I think this is what you're looking for, right?

Grottersha123 commented 1 week ago

Yes, thank you very much!

nanne-aben commented 1 week ago

You're welcome!