OpenGPTX / docs

Documentation for platform users

Spark / DeltaLake Support for KF #5

Closed rauerhans closed 1 year ago

rauerhans commented 2 years ago

To work with the KF deployment, we require a data management solution. As discussed, this should be a combination of Spark / DeltaLake on top of S3.

Requirements:

rauerhans commented 2 years ago

Tim commented: Spark / DeltaLake works interactively with 1 executor pod.

With 2 or more executor pods, the executors cannot connect to each other yet…

IRSA testing is needed as the next step.
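For readers following along, here is a minimal sketch of the kind of Spark settings a Delta Lake on S3 setup with IRSA typically involves. The thread does not say which settings this deployment actually uses, so everything below is an assumption: the helper names `delta_on_s3_conf` and `apply_conf` are hypothetical, the credentials provider class is the AWS SDK v1 one commonly used with `hadoop-aws`, and the Delta Lake and `hadoop-aws` jars are assumed to be on the classpath already. The executor count defaults to 1, the only configuration reported as working above.

```python
from typing import Dict


def delta_on_s3_conf(executor_instances: int = 1) -> Dict[str, str]:
    """Build Spark settings for Delta Lake on S3 with IRSA credentials.

    Hypothetical helper: the values are common defaults, not the ones
    confirmed for this deployment.
    """
    if executor_instances < 1:
        raise ValueError("executor_instances must be at least 1")
    return {
        # Enable the Delta Lake SQL extension and catalog.
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": (
            "org.apache.spark.sql.delta.catalog.DeltaCatalog"
        ),
        # Use the web identity token that IRSA mounts into the pod
        # (assumes the AWS SDK v1 class shipped alongside hadoop-aws).
        "spark.hadoop.fs.s3a.aws.credentials.provider": (
            "com.amazonaws.auth.WebIdentityTokenCredentialsProvider"
        ),
        # Number of executor pods to request.
        "spark.executor.instances": str(executor_instances),
    }


def apply_conf(builder, conf: Dict[str, str]):
    """Apply each setting to a builder such as SparkSession.builder."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


if __name__ == "__main__":
    # Print the settings so they can be reviewed or copied into a notebook.
    for key, value in sorted(delta_on_s3_conf().items()):
        print(f"{key}={value}")
```

In a notebook with `pyspark` installed, the result would be used as `apply_conf(SparkSession.builder, delta_on_s3_conf()).getOrCreate()`. This sketch does not address the executor-to-executor connectivity problem reported above.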

rauerhans commented 2 years ago

Tim commented: IRSA testing/debugging/finding a solution was done yesterday

All the details can be found here: https://discordapp.com/channels/880830238723047424/893111923778928640/941372492893790219

Automation is needed on the Plural side. Opened a GitHub issue: https://github.com/pluralsh/plural-artifacts/issues/159