fsspec / gdrivefs

Google drive implementation of fsspec
BSD 2-Clause "Simplified" License

Doesn't seem to work with distributed. #8

Open rabernat opened 4 years ago

rabernat commented 4 years ago

I tried some basic operations with a dask_kubernetes cluster on ocean.pangeo.io, with no luck.

I created a cluster and connected to it, created a gdrivefs filesystem, and then tried to read / write via xarray. I immediately get a KilledWorker error.

Sorry for not providing a reproducible example. The only example I know how to make is probably too complicated. I figured you would know how to do a proper test of distributed instead of whatever hack I come up with.

martindurant commented 4 years ago

I haven't tried, but I assume you would need to copy the pickled token file to all workers.
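A minimal sketch of that idea, assuming the token is a single file and that each worker should store it at the same path as the client. The path you pass in is hypothetical (where gdrivefs keeps its cached token is not stated in this thread), and `write_token` and `distribute_token` are illustrative helper names. The only dask.distributed call used is `Client.run`, which calls a function on every worker.

```python
import os


def write_token(data, path):
    """Write the token bytes to `path` on whichever machine runs this."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return os.path.getsize(path)


def distribute_token(client, local_path, remote_path=None):
    """Copy a local token file to every worker via client.run.

    `client` is expected to be a dask.distributed.Client.
    `remote_path` defaults to the same path as on the client (assumption).
    """
    with open(local_path, "rb") as f:
        data = f.read()
    # Client.run executes write_token once on each worker and returns
    # a dict mapping worker address -> return value (bytes written here).
    return client.run(write_token, data, remote_path or local_path)
```

Something like `distribute_token(c, "/path/to/token.pickle")` (placeholder path) would then be called once after connecting and before creating the filesystem. Workers that join later would not have the file, so this would need repeating after the cluster scales up.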

On November 16, 2019 12:10:30 AM EST, Ryan Abernathey notifications@github.com wrote:


martindurant commented 4 years ago

The following worked for me:

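import fsspec
from gdrivefs import core

# Register the Google Drive filesystem under the "gdrive" protocol.
# (Recent fsspec versions expose the registry as read-only, so use the helper.)
fsspec.register_implementation('gdrive', core.GoogleDriveFileSystem)

import dask.bag as db
from dask.distributed import Client

# Start a local distributed cluster and connect to it.
c = Client()

# Read every matching .md file; token='cache' reuses previously cached credentials.
b = db.read_text('gdrive://*.md', storage_options={'token': 'cache'})
b.compute()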

(In my case this matched two files and produced their text as output, as expected.)
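One plausible reason this local test succeeds is that workers started by a bare `Client()` run on the same machine as the client and can read the same cached token, which would not hold on a Kubernetes cluster. Here is a hedged sketch for checking that, assuming the token lives at a known file path. The path is hypothetical, and `workers_missing_file` is an illustrative helper, not part of gdrivefs or dask.

```python
import os


def workers_missing_file(client, path):
    """Return the sorted addresses of workers on which `path` does not exist.

    `client` is expected to be a dask.distributed.Client; Client.run calls
    the function on every worker and returns {worker_address: result}.
    """
    present = client.run(os.path.exists, path)
    return sorted(addr for addr, ok in present.items() if not ok)
```

An empty list from `workers_missing_file(c, "/path/to/token.pickle")` (placeholder path) would suggest every worker can see the token file. A non-empty list names the workers that would need a copy.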