Open danlg opened 3 years ago
Yes. We should definitely improve documentation about this.
To answer your question:
Plasma store is built to support zero-copy reads. It stores objects in shared memory and uses Apache Arrow to support zero-copy serialization among multiple processes.
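As a rough illustration of what "zero-copy read from shared memory" means, here is a minimal sketch using only Python's standard library (`multiprocessing.shared_memory`, Python 3.8+). This is not the Plasma or Ray API; the helper names `put` and `get_view` are hypothetical and chosen only for this example. A reader attaches to a named segment and gets a view onto the same bytes rather than a private copy.

```python
from multiprocessing import shared_memory


def put(payload: bytes) -> shared_memory.SharedMemory:
    """Writer side (hypothetical helper): copy the payload once into a new segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def get_view(name: str, size: int):
    """Reader side (hypothetical helper): attach by name and return a view, no copy.

    The segment may be rounded up to a page size on some platforms, so the
    caller passes the payload size explicitly.
    """
    shm = shared_memory.SharedMemory(name=name)
    return shm, shm.buf[:size]


if __name__ == "__main__":
    writer = put(b"hello plasma")
    reader, view = get_view(writer.name, 12)
    print(bytes(view))   # bytes() copies, but only here, for printing
    view.release()       # views must be released before close()
    reader.close()
    writer.close()
    writer.unlink()      # free the segment
```

The segment name plays the role of an object ID here: any process on the same machine that knows the name can attach to it. Shared memory of this kind is local to one host.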
Hello @rkooo567, do you have any news on this one?
@danlg Not much since I am a little oversubscribed lately. cc @suquark do you have some time to do this?
Hi,
Adding to this, the current Ray docs state:
"Plasma is an in-memory object store that is being developed as part of Apache Arrow"
This is misleading, as Ray "forked" Plasma from Arrow for reasons discussed here: https://github.com/ray-project/ray/pull/7901
Even more confusing is that Arrow's Plasma will be deprecated in v10.0.0: https://arrow.apache.org/docs/python/plasma.html
I'd be happy to contribute a docs PR with the correction.
But in case we'd like to add a "historical note", I'll need someone to clarify the historical details (@suquark?), e.g.:
Who authored the original Arrow Plasma store? (There is a clue here: https://lists.apache.org/thread/nw232k2lzmg9kcl8ts475m9ybl34j81p)
Thanks, Harel
What is your question?
From the documentation (https://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/ and https://arrow.apache.org/docs/python/plasma.html#starting-the-plasma-store), it is not clear whether Plasma storage is distributed across several nodes.
I assume that /tmp/plasma in the example is on a local filesystem, not an NFS mount shared by several hosts.
Please advise.
Also, it would be interesting to describe how Plasma compares to Memcached.