datacontract / datacontract-cli

CLI to manage your datacontract.yaml files
https://cli.datacontract.com

Add Spark StructType exporter #277

Closed pierre-monnet closed 4 days ago

pierre-monnet commented 1 week ago

Add the ability to export Spark StructType.

simonharrer commented 1 week ago

I forgot to update the CHANGELOG. Can you rebase your changelog entry so that it appears in the unreleased section again, please?

simonharrer commented 1 week ago

Thanks for the PR. Looks nice. My question: shouldn't the output be model-dependent? That is, one would call the export once per model to get a StructType for that model. Or is it typical to put multiple StructTypes into a parent dict? I'm raising this because I'm thinking about integrating our CLI tool with other tools, and with the current output one would always have to extract a specific StructType again for further processing.

pierre-monnet commented 6 days ago

> Thanks for the PR. Looks nice. My question: shouldn't the output be model-dependent? That is, one would call the export once per model to get a StructType for that model. Or is it typical to put multiple StructTypes into a parent dict? I'm raising this because I'm thinking about integrating our CLI tool with other tools, and with the current output one would always have to extract a specific StructType again for further processing.

I changed the SparkExporter to return the generated Python code as a string. I kept the function that returns a dict for programmatic usage.
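
For readers following along, here is a minimal sketch of the idea described above: one function that returns a dict keyed by model name, and one that renders the same content as a string of Python code. This is not the code from the PR. All names (`TYPE_MAP`, `field_to_code`, `model_to_code`, `export_models`, `export_as_string`), the type mapping, and the shape of the `models` input are assumptions made for illustration, and the dict here holds code strings rather than real StructType objects so that the sketch runs without pyspark installed.

```python
# Hypothetical sketch; none of these names come from datacontract-cli.

# Assumed mapping from contract field types to PySpark type constructors.
TYPE_MAP = {
    "string": "StringType()",
    "integer": "IntegerType()",
    "long": "LongType()",
    "double": "DoubleType()",
    "boolean": "BooleanType()",
    "timestamp": "TimestampType()",
}


def field_to_code(name, field):
    """Render one field as the source text of a StructField(...) call."""
    # Unknown or missing types fall back to StringType() in this sketch.
    spark_type = TYPE_MAP.get(field.get("type", "string"), "StringType()")
    # Assumption: a field is nullable unless it is marked as required.
    nullable = not field.get("required", False)
    return f'StructField("{name}", {spark_type}, {nullable})'


def model_to_code(model_name, model):
    """Render one model as a `<model_name> = StructType([...])` assignment."""
    lines = [f"    {field_to_code(n, f)}," for n, f in model["fields"].items()]
    return f"{model_name} = StructType([\n" + "\n".join(lines) + "\n])"


def export_models(models):
    """Return a dict of model name -> generated code, for programmatic use."""
    return {name: model_to_code(name, m) for name, m in models.items()}


def export_as_string(models):
    """Join the per-model snippets into one printable block of Python code."""
    header = "from pyspark.sql.types import *  # generated code needs pyspark"
    return "\n\n".join([header, *export_models(models).values()])


# Made-up example input with a single model.
models = {
    "orders": {
        "fields": {
            "order_id": {"type": "string", "required": True},
            "amount": {"type": "double"},
        }
    }
}
print(export_as_string(models))
```

Keeping the dict-returning function separate means a caller that needs only one model can look it up by name instead of parsing the printed string.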

simonharrer commented 4 days ago

Fine for me. We should, however, document that we basically generate Python code here that uses PySpark.

pierre-monnet commented 4 days ago

> Fine for me. We should, however, document that we basically generate Python code here that uses PySpark.

I added this detail to the README.

simonharrer commented 4 days ago

Thanks! :-)