kedro-org / kedro-starters

Templates for your Kedro projects.
Apache License 2.0

kedro ipython immediately fails for starter=spaceflights-pyspark-viz #210

Closed gpierard closed 8 months ago

gpierard commented 8 months ago

Description

I installed the dependencies from requirements.txt and get this error immediately after running kedro ipython: kedro.io.core.DatasetError: Class 'spark.SparkDataset' not found, is this a typo?

Is there any additional config needed for starter=spaceflights-pyspark-viz?

I tried this on Windows with Kedro 0.19.1 and 0.19.2, and both give the same error. kedro-datasets is 1.5.1.

datajoely commented 8 months ago

I answered this on Stack Overflow as well, but it would be easiest to track here.

If you explicitly run pip install kedro-datasets[spark.SparkDataset], does it work? This should be part of the requirements.txt.

merelcht commented 8 months ago

Hi @gpierard, thanks for flagging this issue. Could you provide some more info so it's easier for the team to find out what's going on?

Thanks!

astrojuanlu commented 8 months ago

IIUC from https://github.com/kedro-org/kedro/issues/3545#issue-2095153118, Kedro 0.17.6 and Python 3.8.5, am I right @gpierard?

Any chance you can upgrade to a newer version? 0.17.6 is very old.

gpierard commented 8 months ago

Thanks for your answers. Sorry, I should have mentioned that I tried this on Windows with Kedro 0.19.1 and 0.19.2, and both give the same error. kedro-datasets is 1.5.1, and kedro-datasets[spark.SparkDataset] is installed as well. This issue is fully separate from the others mentioned.

merelcht commented 8 months ago

@gpierard The datasets were only renamed in kedro-datasets 1.7.0, so any earlier version (including your 1.5.1) would still be looking for spark.SparkDataSet, with a capital S in "Set". Can you try the latest version, 2.0.0?
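
For anyone hitting the same error, here is a rough sketch of the version check described above. It assumes, as stated in this thread, that the DataSet to Dataset rename landed in kedro-datasets 1.7.0. The helper names (parse_version, expected_spark_type, installed_kedro_datasets_version) are made up for illustration and are not part of Kedro.

```python
from importlib import metadata

# Assumption taken from this thread: the rename happened in kedro-datasets 1.7.0.
RENAME_VERSION = (1, 7, 0)


def parse_version(version: str) -> tuple:
    """Turn '1.5.1' into (1, 5, 1); trailing suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '2.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def expected_spark_type(version: str) -> str:
    """Return the catalog 'type' spelling that the given kedro-datasets version should resolve."""
    if parse_version(version) >= RENAME_VERSION:
        return "spark.SparkDataset"
    return "spark.SparkDataSet"


def installed_kedro_datasets_version():
    """Return the installed kedro-datasets version, or None if it is not installed."""
    try:
        return metadata.version("kedro-datasets")
    except metadata.PackageNotFoundError:
        return None


installed = installed_kedro_datasets_version()
if installed is None:
    print("kedro-datasets is not installed in this environment")
else:
    print(f"kedro-datasets {installed} expects type: {expected_spark_type(installed)}")
```

With kedro-datasets 1.5.1 this reports spark.SparkDataSet, which does not match the spark.SparkDataset spelling in the error message above, so the lookup fails until kedro-datasets is upgraded.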

gpierard commented 8 months ago

I confirm that kedro-datasets 2.0.0 solves the issue and that the Spark session is correctly configured. Thanks.