richardartoul / nola

MIT License

File store that is agnostic #55

Closed gedw99 closed 1 year ago

gedw99 commented 1 year ago

https://github.com/hack-pad/hackpadfs

It would allow actors to run in browsers or on servers (Wazero etc., or Go). Fits NOLA’s perspective …

IndexedDB in a browser can be garbage collected at any time by the OS, so you can’t use it expecting it to always be there.

It can also call out to S3 buckets

See examples :)

With a NATS KV server feeding it, it can always catch up from the local OS, the intranet (NATS leaf) or the internet (NATS Cloud).

just exploring options :)

what ya think ?

richardartoul commented 1 year ago

So, interestingly, I think the abstraction I want baked into NOLA is a blob storage abstraction, not a filesystem abstraction. I was planning on adding that but have not found time yet. It could still be backed by memory or IndexedDB as well.

I like that because I think it’s almost equally useful, but it’s a much narrower interface to implement.
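To make "narrower interface" concrete, here is a minimal sketch of what a blob store abstraction could look like in Go, with an in-memory backing. The names `BlobStore`, `MemStore` and `ErrNotFound` are hypothetical and chosen for illustration; none of this is NOLA's actual API.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNotFound is returned when a key has no blob (hypothetical name).
var ErrNotFound = errors.New("blob not found")

// BlobStore is a hypothetical narrow interface: whole blobs, addressed by key.
// There are no directories, seeks, or partial writes to implement.
type BlobStore interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// MemStore is an in-memory BlobStore that is safe for concurrent use.
type MemStore struct {
	mu    sync.RWMutex
	blobs map[string][]byte
}

// NewMemStore returns an empty in-memory store.
func NewMemStore() *MemStore {
	return &MemStore{blobs: make(map[string][]byte)}
}

// Put stores a copy of data so later changes to the caller's slice
// do not alter the stored blob.
func (m *MemStore) Put(key string, data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.blobs[key] = append([]byte(nil), data...)
	return nil
}

// Get returns a copy of the blob, or ErrNotFound.
func (m *MemStore) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.blobs[key]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), data...), nil
}

// Delete removes the blob; deleting a missing key is not an error.
func (m *MemStore) Delete(key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.blobs, key)
	return nil
}

// Compile-time check that MemStore satisfies BlobStore.
var _ BlobStore = (*MemStore)(nil)
```

An IndexedDB-backed or S3-backed store would only need to provide the same three methods, which is the appeal over a full filesystem interface.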

What do you think?

gedw99 commented 1 year ago

am on mobile so forgive the meat stick typos !!

IndexedDB in the browser is shaky. The OS can always garbage collect the actual IndexedDB. But that’s life.

If the source of truth for the blobs is S3, it’s fine, because the garbage-collected browser IndexedDB will just refill based on a control plane subscription: basically a KV store on the server of what blobs an Actor should have. How do we know that? Because we recorded the CUD (create, update, delete) ops at the control plane level into a very durable place. Where that place is is up for debate …

On native I bet there is a Go one. Hackpadfs has the API; we just need to code an implementation for native. Not too hard. I would prefer to make it cross-platform, not just Linux.

Reminds me that in another issue I was crapping on about file capability security: maybe a User or Operator wants to limit the max size of an Actor’s IndexedDB after all, just like in a browser.

Then an adapter for S3. Hackpad already has it for both wasm and wasi (native).
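A rough sketch of the refill idea, assuming the control plane keeps an ordered log of CUD ops per Actor. Replaying the log gives the set of blobs an Actor should hold, and comparing that against the (possibly wiped) local cache gives the keys to refetch from S3. The `Op` shape and the function names are assumptions for illustration only.

```go
package main

import "sort"

// OpKind is one of the CUD operations recorded by the control plane.
type OpKind int

const (
	Create OpKind = iota
	Update
	Delete
)

// Op is a hypothetical log entry: which Actor, which blob, what happened.
type Op struct {
	Kind    OpKind
	ActorID string
	BlobKey string
	Hash    string // content hash after the op; unused for Delete
}

// Expected replays the log in order and returns, for one Actor,
// the blob keys it should hold mapped to their latest hash.
func Expected(log []Op, actorID string) map[string]string {
	want := make(map[string]string)
	for _, op := range log {
		if op.ActorID != actorID {
			continue
		}
		switch op.Kind {
		case Create, Update:
			want[op.BlobKey] = op.Hash
		case Delete:
			delete(want, op.BlobKey)
		}
	}
	return want
}

// Missing lists, in sorted order, the keys whose local hash is absent or
// stale. This is what a wiped or outdated local cache has to refetch.
func Missing(want, local map[string]string) []string {
	var keys []string
	for k, h := range want {
		if local[k] != h {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}
```

After the browser evicts its storage, `local` is simply empty, so `Missing` returns every key the Actor is supposed to have.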
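One way such a cap could work is a per-Actor store that tracks bytes used and rejects writes that would exceed the limit. This is a sketch under that assumption; `QuotaStore` and `ErrQuotaExceeded` are hypothetical names, and the type is not safe for concurrent use as written.

```go
package main

import "errors"

// ErrQuotaExceeded is returned when a write would exceed the cap.
var ErrQuotaExceeded = errors.New("actor storage quota exceeded")

// QuotaStore is a hypothetical per-Actor blob store with a byte cap.
type QuotaStore struct {
	maxBytes int
	used     int
	blobs    map[string][]byte
}

// NewQuotaStore returns an empty store limited to maxBytes of blob data.
func NewQuotaStore(maxBytes int) *QuotaStore {
	return &QuotaStore{maxBytes: maxBytes, blobs: make(map[string][]byte)}
}

// Put stores the blob unless doing so would exceed the cap.
func (q *QuotaStore) Put(key string, data []byte) error {
	// Replacing a key frees its old bytes before the new ones are counted.
	newUsed := q.used - len(q.blobs[key]) + len(data)
	if newUsed > q.maxBytes {
		return ErrQuotaExceeded
	}
	q.blobs[key] = append([]byte(nil), data...)
	q.used = newUsed
	return nil
}

// Delete removes the blob and releases its bytes.
func (q *QuotaStore) Delete(key string) {
	q.used -= len(q.blobs[key])
	delete(q.blobs, key)
}

// Used reports the number of blob bytes currently stored.
func (q *QuotaStore) Used() int {
	return q.used
}
```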

Then we have synchronisation: hashing, so that your calls to S3 are cached via IndexedDB. If the Actor has the freshest one, return it from IndexedDB; otherwise get it from S3.

Mutations go to S3 first. Or do we write to IndexedDB optimistically? Not sure. Large blobs won’t fit :) Maybe decide based on size.

It’s like a CQRS setup with S3 being the write side and IndexedDB being the read side.

NATS is sort of made for these setups imho. You can push data into it and consumer (IndexedDB) instances get the updates. MinIO has notifications via NATS too, so if your MinIO data lane is altered your Actors will get that update. Very important, as not everything is in the Actors’ world: there are all the legacy systems also mutating your MinIO data lane.
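The hash check could be sketched roughly as below. It assumes the latest hash for a key can be obtained cheaply, modelled here as a hypothetical `Head` call on the origin. In the setup described in this thread that hash would more likely arrive via the control plane subscription. `Origin`, `MapOrigin` and `CachedReader` are illustrative names, and the map stands in for both S3 and IndexedDB.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

// ErrNotFound is returned when the origin has no blob for a key.
var ErrNotFound = errors.New("blob not found")

// HashOf returns the hex-encoded SHA-256 of a blob.
func HashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// Origin stands in for S3: it reports the current hash and fetches blobs.
type Origin interface {
	Head(key string) (string, error)
	Fetch(key string) ([]byte, error)
}

// MapOrigin is an in-memory Origin used only to exercise the sketch.
type MapOrigin struct {
	Blobs   map[string][]byte
	Fetches int // number of full-blob fetches served
}

// Head returns the hash of the current blob without counting as a fetch.
func (o *MapOrigin) Head(key string) (string, error) {
	data, ok := o.Blobs[key]
	if !ok {
		return "", ErrNotFound
	}
	return HashOf(data), nil
}

// Fetch returns the full blob and records that a fetch happened.
func (o *MapOrigin) Fetch(key string) ([]byte, error) {
	data, ok := o.Blobs[key]
	if !ok {
		return nil, ErrNotFound
	}
	o.Fetches++
	return data, nil
}

type cached struct {
	hash string
	data []byte
}

// CachedReader serves reads from a local cache while its hash is current.
type CachedReader struct {
	origin Origin
	cache  map[string]cached
}

// NewCachedReader returns a reader with an empty local cache.
func NewCachedReader(o Origin) *CachedReader {
	return &CachedReader{origin: o, cache: make(map[string]cached)}
}

// Get returns the cached blob if its hash matches the origin's latest
// hash; otherwise it fetches the blob and refreshes the cache.
func (c *CachedReader) Get(key string) ([]byte, error) {
	latest, err := c.origin.Head(key)
	if err != nil {
		return nil, err
	}
	if e, ok := c.cache[key]; ok && e.hash == latest {
		return e.data, nil
	}
	data, err := c.origin.Fetch(key)
	if err != nil {
		return nil, err
	}
	c.cache[key] = cached{hash: HashOf(data), data: data}
	return data, nil
}
```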
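One possible answer to the open question, sketched under the assumption that the origin (S3) is always written first and the local cache is filled only for blobs at or below a size threshold. `Sink`, `MapSink`, `Writer` and `MaxLocal` are hypothetical names. The optimistic variant would write locally first and sync to the origin later.

```go
package main

// Sink is anything that accepts a whole-blob write (hypothetical).
type Sink interface {
	Put(key string, data []byte) error
}

// MapSink is an in-memory Sink used only to exercise the sketch.
type MapSink map[string][]byte

// Put stores a copy of the blob under key.
func (m MapSink) Put(key string, data []byte) error {
	m[key] = append([]byte(nil), data...)
	return nil
}

// Writer sends every write to Origin first. Blobs of at most MaxLocal
// bytes are then also written to Local as a cache fill.
type Writer struct {
	Origin   Sink
	Local    Sink
	MaxLocal int
}

// Put writes the blob according to the size-based policy.
func (w Writer) Put(key string, data []byte) error {
	// The origin is the source of truth, so a failed origin write
	// fails the whole operation and nothing is cached.
	if err := w.Origin.Put(key, data); err != nil {
		return err
	}
	if len(data) <= w.MaxLocal {
		// Best-effort cache fill: a local failure does not undo the
		// durable write, so the error is deliberately ignored here.
		_ = w.Local.Put(key, data)
	}
	return nil
}
```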

Hell, I was even thinking of doing binary diffs so that changes to S3 are delta diffed, with only the delta flowing through NATS to the consumer IndexedDB. I was working on that as a POC the other day. It worked, but I have not done much more on it.
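The POC itself is not shown in this thread, so the following is only an illustration of the general idea, using the simplest possible scheme: compare old and new blobs in fixed-size blocks and ship only the blocks that changed. All names are hypothetical. This is not an rsync-style rolling diff, so an insertion near the start of a blob shifts every later block and makes them all count as changed.

```go
package main

import "bytes"

// Chunk is a run of replacement bytes at a given offset in the new blob.
type Chunk struct {
	Offset int
	Data   []byte
}

// Delta describes how to turn an old blob into a new one.
type Delta struct {
	NewLen int
	Chunks []Chunk
}

// Diff compares the blobs block by block and records only blocks of
// updated that differ from, or lie beyond, the same range in old.
func Diff(old, updated []byte, blockSize int) Delta {
	if blockSize <= 0 {
		blockSize = 1 // guard against a non-advancing loop
	}
	d := Delta{NewLen: len(updated)}
	for off := 0; off < len(updated); off += blockSize {
		end := off + blockSize
		if end > len(updated) {
			end = len(updated)
		}
		// A block is unchanged only if old fully covers the same
		// range and the bytes are equal.
		if end <= len(old) && bytes.Equal(old[off:end], updated[off:end]) {
			continue
		}
		d.Chunks = append(d.Chunks, Chunk{
			Offset: off,
			Data:   append([]byte(nil), updated[off:end]...),
		})
	}
	return d
}

// Apply rebuilds the new blob from the old blob plus the delta.
func Apply(old []byte, d Delta) []byte {
	out := make([]byte, d.NewLen)
	copy(out, old) // copies at most d.NewLen bytes, truncating if old is longer
	for _, c := range d.Chunks {
		copy(out[c.Offset:], c.Data)
	}
	return out
}
```

The consumer only needs its cached copy of the old blob and the `Delta` to reconstruct the new version, so the full blob never has to cross the wire when a few blocks change.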

I know I am using the NATS hammer because everything looks like a nut. It’s my weakness.

I already have blobs flowing through NATS now, with consumers getting the blobs delivered to them. I think I benched 20 Gb a few days ago. Because it’s store-and-forward under the hood, network flakiness does not affect the outcome.

This leads me on to the bus: have a bus that can work in memory, in a worker, across same-machine processes, and across clouds and edges using NATS.
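A minimal sketch of what a transport-agnostic bus could look like, assuming a small publish/subscribe interface with the in-memory case as one implementation. A NATS-backed implementation would satisfy the same interface but is not shown. `Bus`, `MemBus` and `Handler` are hypothetical names, and subjects are matched exactly, with no wildcards.

```go
package main

import "sync"

// Handler receives messages published on a subject.
type Handler func(subject string, payload []byte)

// Bus is a hypothetical transport-agnostic interface. In-memory,
// cross-process and NATS-backed buses would all implement it.
type Bus interface {
	Publish(subject string, payload []byte) error
	Subscribe(subject string, h Handler) error
}

// MemBus is an in-process Bus with synchronous delivery.
type MemBus struct {
	mu   sync.RWMutex
	subs map[string][]Handler
}

// NewMemBus returns a bus with no subscribers.
func NewMemBus() *MemBus {
	return &MemBus{subs: make(map[string][]Handler)}
}

// Subscribe registers h for messages on exactly this subject.
func (b *MemBus) Subscribe(subject string, h Handler) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[subject] = append(b.subs[subject], h)
	return nil
}

// Publish delivers payload to every handler subscribed to subject.
func (b *MemBus) Publish(subject string, payload []byte) error {
	// Snapshot the handlers under the lock, then call them outside it
	// so a handler may itself publish or subscribe without deadlocking.
	b.mu.RLock()
	handlers := append([]Handler(nil), b.subs[subject]...)
	b.mu.RUnlock()
	for _, h := range handlers {
		h(subject, payload)
	}
	return nil
}

// Compile-time check that MemBus satisfies Bus.
var _ Bus = (*MemBus)(nil)
```

Actors would depend only on `Bus`, so swapping the in-memory transport for a networked one would not change their code.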

richardartoul commented 1 year ago

Closing this for now since there is nothing actionable and I’m not planning on doing this anytime soon.