dask / helm-chart

Helm charts for Dask
https://helm.dask.org/

Error: No such file or directory as /home/jovyan/<filename>.csv #75

Closed ghost closed 4 years ago

ghost commented 4 years ago

I'm using the JupyterLab instance provided by the Dask Helm chart to experiment with dataframes. I uploaded a spreadsheet I downloaded from Kaggle (NYC Parking Tickets) and renamed it Parking_2017.csv.

When I do the following everything is fine:

import dask.dataframe as dd
from dask.diagnostics import ProgressBar 
from matplotlib import pyplot as plt     
df = dd.read_csv('Parking_2017.csv')    
df

However, when I run the following, I get an error saying there is no such file as /home/jovyan/Parking_2017.csv, even though the file is in that directory.

missing_values = df.isnull().sum()
missing_count = ((missing_values / df.index.size) * 100)
with ProgressBar():
    missing_count_pct = missing_count.compute()
missing_count_pct

Also, is there a way I can use Jupyter Notebook instead of Jupyter Lab in the helm chart?

jacobtomlinson commented 4 years ago

/home/jovyan exists only on the notebook pod. The Dask workers run in separate pods and cannot access it. You need to load data from a network-accessible location that every worker can reach.
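To illustrate the point above: dd.read_csv is lazy, so the first cell succeeds on the notebook pod, but .compute() asks the workers to open the file, and they do not have it. The sketch below is one way to guard against this, not part of the Helm chart or of Dask itself. The helper names is_remote_path and read_csv_for_cluster, the REMOTE_SCHEMES set, and the bucket name in the usage note are all hypothetical. It assumes the data has been uploaded to storage that the workers can reach (for example an object store) and that the matching filesystem package, such as s3fs for s3:// URLs, is installed on the notebook and the workers.

```python
from urllib.parse import urlparse

# Assumption: URL schemes that every worker pod can reach over the network.
# Adjust this set to match the storage your cluster actually uses.
REMOTE_SCHEMES = {"s3", "gs", "gcs", "http", "https", "hdfs", "abfs", "az"}


def is_remote_path(path: str) -> bool:
    """Return True if the path has a network URL scheme, not a local file path."""
    return urlparse(path).scheme.lower() in REMOTE_SCHEMES


def read_csv_for_cluster(path: str, **kwargs):
    """Build a lazy Dask dataframe, refusing paths that only exist on one pod."""
    if not is_remote_path(path):
        raise ValueError(
            f"{path!r} looks like a local path; the Dask workers cannot see "
            "files that exist only on the notebook pod."
        )
    # Imported here so the path check can be used without Dask installed.
    import dask.dataframe as dd

    return dd.read_csv(path, **kwargs)
```

With the CSV uploaded to a hypothetical bucket, the original analysis would start from df = read_csv_for_cluster("s3://my-bucket/Parking_2017.csv") and the rest of the notebook stays the same. Credentials, if needed, can be passed through the storage_options keyword of dd.read_csv.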

For accessing the classic notebook UI you can change /lab to /tree in your browser URL.

ghost commented 4 years ago

Thank you for your help.

ghost commented 4 years ago

I apologize, but I would like to re-open this issue. I need help mounting a local directory into the Helm deployment so that the notebook can see it. Do you have any examples?