tanmaykm / Blobs.jl

Facilitate distributed out of core computation over blobs of data.

Question about abstract matrix blobs #4

Open datnamer opened 8 years ago

datnamer commented 8 years ago

Hello,

I stumbled upon this package and was intrigued by its implementation of the abstract matrix interface.

What are the implications of this? Does it mean that a blob backed by an arbitrary out-of-core or distributed data source can be passed to any linear algebra, vectorized, or iterative function that accepts an abstract matrix (e.g. GLM)?

Or, do these routines have to be specifically specialized for these blob arrays?

Thanks!

tanmaykm commented 8 years ago

The example abstract matrix implementation is quite limited at this moment. It should be possible to pass this to simple functions. Some of the routines you mentioned will likely need to be specialized, often to get good performance on blob arrays. Plugging this into something like DistributedArrays/ComputeFramework could let us make use of the existing functionality there.
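
To make the performance point concrete, here is a minimal sketch, not the Blobs.jl API. `ChunkedMatrix`, `rowwise_total` and `chunkwise_total` are hypothetical names made up for illustration, and the in-memory `chunks` vector only stands in for data that would really live on disk or on another node. It assumes that implementing `size` and scalar `getindex` is enough for generic `AbstractMatrix` code to run, and it counts how often a chunk would have to be fetched.

```julia
# Hypothetical illustration only; none of these names come from Blobs.jl.
# A matrix whose columns are grouped into chunks that are "loaded" on demand.
mutable struct ChunkedMatrix{T} <: AbstractMatrix{T}
    chunks::Vector{Matrix{T}}   # stand-in for out-of-core storage
    nrows::Int
    ncols::Int
    cols_per_chunk::Int
    loaded::Int                 # index of the chunk currently "in memory" (0 = none)
    loads::Int                  # number of simulated chunk fetches so far
end

# Split an ordinary matrix into column chunks (assumed layout for this sketch).
function ChunkedMatrix(A::Matrix{T}, cols_per_chunk::Int) where {T}
    ncols = size(A, 2)
    chunks = [A[:, j:min(j + cols_per_chunk - 1, ncols)] for j in 1:cols_per_chunk:ncols]
    return ChunkedMatrix{T}(chunks, size(A, 1), ncols, cols_per_chunk, 0, 0)
end

# The two methods that generic AbstractMatrix code relies on.
Base.size(M::ChunkedMatrix) = (M.nrows, M.ncols)

function Base.getindex(M::ChunkedMatrix, i::Int, j::Int)
    @boundscheck checkbounds(M, i, j)
    c = div(j - 1, M.cols_per_chunk) + 1
    if c != M.loaded            # simulate fetching a different chunk
        M.loaded = c
        M.loads += 1
    end
    return M.chunks[c][i, j - (c - 1) * M.cols_per_chunk]
end

# Generic code: knows nothing about chunks and walks the matrix row by row.
function rowwise_total(X::AbstractMatrix)
    s = zero(eltype(X))
    for i in 1:size(X, 1), j in 1:size(X, 2)
        s += X[i, j]
    end
    return s
end

# Specialized code: visits each chunk exactly once.
function chunkwise_total(M::ChunkedMatrix)
    s = zero(eltype(M))
    for (k, c) in enumerate(M.chunks)
        M.loaded = k
        M.loads += 1
        s += sum(c)
    end
    return s
end

A = reshape(collect(1.0:12.0), 3, 4)

M1 = ChunkedMatrix(A, 2)
t1 = rowwise_total(M1)      # same answer as on a plain Matrix, 6 chunk fetches

M2 = ChunkedMatrix(A, 2)
t2 = chunkwise_total(M2)    # same answer, 2 chunk fetches
```

Both calls return the same total, but the generic row-wise loop switches chunks on every row, while the specialized version touches each chunk once. With real disk or network access behind each fetch, that difference is what specialization would buy.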

datnamer commented 8 years ago

Interesting, thanks. So is it just a matter of performance?

Also, would this work for something like a single-node, on-disk SQLite or PostgreSQL database?

tanmaykm commented 8 years ago

@datnamer the recently added ProcessGlobalBlob makes it easy to use this on a single node, and as a generic cache it can probably be layered over a SQL engine as well.

datnamer commented 8 years ago

Thanks, I'll check it out.