Open stephenholtz opened 4 years ago
Hi @stephenholtz, the `ExternalTable` class and the storage functionality are designed for extensibility to other storage systems, including GCS. Adding GCS support would require only a small amount of development. #439 is currently addressed with tools specific to Globus and perhaps does not even need explicit DataJoint support.
This issue will track the implementation of GCS support for external storage.
@dimitri-yatsenko great! If I'm the only one working with Google Cloud buckets then I'll happily work on this -- starting next week I'll have some time to commit, so to speak.
Hi @stephenholtz, awesome! @chrisroat in Karl Deisseroth's lab was also looking into GCS support. He may have made progress.
I've been heavy into the underlying algorithms, as we changed our pipelines around somewhat. I haven't worked on this, sorry.
(sorry for the close/open)
I am currently using gcsfuse and a script to ensure mounting happens properly to get most of the functionality I wanted out of this system, but I remain interested in adding these features.
I am not sure how any additions I make would mesh with plans for adding an external storage plugin interface (https://github.com/datajoint/datajoint-python/issues/762), and I don't want to create extra work for you all restructuring whatever I come up with. Thoughts?
Waiting to see the shape of a plugin architecture is my current preference, but if this would be valuable to anyone besides me, please let me know.
Currently `s3` allows manipulation of files in Amazon S3 stores, and the `ExternalTable` class has conditionals for local versus AWS locations. Something similar for Google Cloud Storage ought to be possible, in particular because Google also offers a reasonable Python 3 API. I'm not sure if any work for https://github.com/datajoint/datajoint-python/issues/439 will change how this is handled currently.
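For discussion, here is a rough sketch of what a protocol conditional with a GCS branch might look like. This is not DataJoint's actual code: `FileBackend`, `GCSBackend`, `make_backend`, the `"gcs"` protocol name, and the `bucket`/`location` spec keys are all hypothetical names chosen for illustration. The GCS branch assumes the `google-cloud-storage` package is installed and that default credentials are configured; it is imported lazily so the local branch works without it.

```python
import os


class FileBackend:
    """Hypothetical local-filesystem backend (stand-in for a 'file' store)."""

    def __init__(self, location):
        self.location = location

    def _path(self, name):
        return os.path.join(self.location, name)

    def put(self, name, data):
        path = self._path(name)
        # Create intermediate folders so nested names like "a/b.bin" work.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def get(self, name):
        with open(self._path(name), "rb") as f:
            return f.read()

    def exists(self, name):
        return os.path.isfile(self._path(name))


class GCSBackend:
    """Hypothetical Google Cloud Storage backend with the same interface."""

    def __init__(self, bucket, location=""):
        # Lazy import: only needed when a GCS store is actually requested.
        from google.cloud import storage

        self.bucket = storage.Client().bucket(bucket)
        self.location = location.strip("/")

    def _blob(self, name):
        key = f"{self.location}/{name}" if self.location else name
        return self.bucket.blob(key)

    def put(self, name, data):
        self._blob(name).upload_from_string(data)

    def get(self, name):
        return self._blob(name).download_as_bytes()

    def exists(self, name):
        return self._blob(name).exists()


def make_backend(spec):
    """Pick a backend from a store spec (the spec keys are assumptions)."""
    protocol = spec["protocol"]
    if protocol == "file":
        return FileBackend(spec["location"])
    if protocol == "gcs":
        return GCSBackend(spec["bucket"], spec.get("location", ""))
    raise ValueError(f"Unsupported protocol: {protocol!r}")
```

With something like this, callers would only ever use `put`, `get`, and `exists`, so a later plugin interface could presumably swap in its own backends without touching them.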