astroseger opened this issue 5 years ago
I just want to clarify: I think the `.snet` directory should be per project, just like `.git`. Projects will be fixed to a particular published service spec, so we don't want different projects interacting with the same cache. This `.snet` directory would still be used by the SDK and CLI, but wouldn't be in the home directory.
The reason for this is that if I build an application using SNet, I should be able to bundle it up and all dependencies should be inside the project directory. I don't want to have to extract compiled protobuf/grpc files from a home directory, and I especially don't want to have to filter and find only the files related to my current project (if I'm working on multiple applications).
Related issue is here: singnet/snet-cli#173, which makes it clear it should be per project.
Essentially the same as issue singnet/snet-cli#214. Choosing the right strategy will also solve singnet/snet-cli#219, as the functionality of invoking a service using just the service-id and org-id can be provided by the SDK itself.
My preference is the second option proposed by Vitaly, which essentially has `snet-cli` built around the SDK:

    snet-client-common
            ^
            |
    snet-client-sdk
            ^
            |
        snet-cli
Let's align on the approach and we can figure out the implementation. Eventually I would like all our client tools to be built around the SDK so that we eat our own dog food. Thoughts @vforvalerio87 @pennachin @ferrouswheel @astroseger @arturgontijo @vsbogd @ksridharbabuus?
I think the final solution should be discussed together with the API which the `sdk` provides to the client. There should be some difference between specific and generic client implementations. Obviously, client code for a specific service (generated by the `sdk`) can use the service API directly, while `snet-cli` code should make generic calls. I think we need to describe and discuss the `sdk` API first, and then discuss how to implement `snet-cli` using it.
My thoughts on other items raised by @astroseger in the issue description:
Items 1, 3, and 5 sound reasonable to me.
> Mechanism for dynamical caching everything

I think the `sdk` should NOT restrict the client in how caching information is stored. Instead I would provide some generic interface which can be implemented by the client to use the cache storage of their choice. To simplify client writing I would also provide a file storage implementation, which can in turn be reused in `snet-cli`. I would make the storage path a parameter of this implementation, to allow the client to choose the file path where the cache is stored.
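To make the idea concrete, here is an illustrative sketch of such a generic cache interface with a file-backed default implementation. All class and method names here are hypothetical, not the actual SDK API:

```python
# Hypothetical sketch of the generic cache interface described above;
# names are illustrative, not the real SDK API.
import json
import os
import tempfile


class CacheStorage:
    """Interface a client implements to plug in a cache backend of choice."""

    def save(self, key, value):
        raise NotImplementedError

    def load(self, key):
        raise NotImplementedError


class FileCacheStorage(CacheStorage):
    """File-backed default; the storage path is a constructor parameter so
    the client decides where the cache lives (e.g. a project's .snet dir)."""

    def __init__(self, base_dir):
        self.base_dir = base_dir
        os.makedirs(base_dir, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.base_dir, key + ".json")

    def save(self, key, value):
        with open(self._path(key), "w") as f:
            json.dump(value, f)

    def load(self, key):
        try:
            with open(self._path(key)) as f:
                return json.load(f)
        except FileNotFoundError:
            return None


# Usage: snet-cli or any other client picks its own directory.
cache = FileCacheStorage(tempfile.mkdtemp())
cache.save("channels", {"channel_id": 42})
```

With this shape, `snet-cli` and the SDK can share the file implementation while a client is free to substitute any other backend.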
> Small libraries: `utils_agi2cogs.py`, some functions from `utils.py`

It is difficult to say right now, but it sounds like a good candidate for the `snet-client-common` package.
@raamb
I fully agree with @vsbogd that we should discuss the API which the SDK provides.
I agree that it would be nice to rewrite snet-cli using the SDK. But with the current API it is "not practical".
I still think the general strategy should be as described in the first post of this issue! We should extract existing functions and functionality from snet-cli. I can help with it, after we define the API.
> I would make storage path to be a parameter of this implementation to allow client choosing file path to store cache.

As long as this has a sensible default (e.g. a file cache in a project's `.snet` directory) and isn't yet another configuration parameter to think about.
Customisation should take a back seat to "Everything Just Works" with minimal cognitive overhead for a developer trying to use SNet.
> Customisation should take a back seat to "Everything Just Works" with minimal cognitive overhead for a developer trying to use SNet.
Sure, I just would like to emphasize that the `sdk` should be more flexible than the current `snet-cli` implementation.
For example, I don't think that sharing the cache between `snet-cli` and the `sdk` is the best option, as it raises questions about version compatibility, for instance. As a developer I would prefer having two different directories, so as not to worry about whether the `sdk` and `snet-cli` versions I use are compatible. I don't insist they should be different by default, but I think we definitely need to provide such an option.
> 2. I agree that it would be nice to rewrite snet-cli using SDK. But with current API it is "not practical".
I see your point on the practicality of doing this.
> 3. I still think that general strategy should be as described in the first post of this issue! We should extract existed functions and functionality from snet-cli. I can help with it, after we define API.
Sure, as you suggest let's work on pulling out common functionality into a common dependency for both the client and the SDK. 1, 3, 4, and 5 are clear candidates for this.
With regards to the API that the SDK will support: I doubt that any of this is new, and pardon if this is at a very high level.
I think in the long run having the CLI built on top of the SDK is a good idea, but I also agree that's not the best short term path. We should plan for that in the medium term.
I also agree with this:
> For example I don't think that sharing cache between snet-cli and sdk is the best option, as it raises question about version compatibility for instance. As developer I would prefer having two different directories to not bother whether sdk and snet-cli versions I use are compatible. I don't insist they should be different by default, but I think we definitely need to provide such option.
About the "sdk cache" issue:
I think, first and foremost, the SDK should be completely stateless by default. You can't assume the user who runs it will have a reliable filesystem at their disposal (Fargate cluster, AWS Lambda, whatever).
That being said, I proposed having a "pluggable cache" system where you specify which persistence engine you want to use. This could be something the SDK understands, like "filesystem" (based on pickle in the Python iteration of the SDK, for example), or "cli" (meaning "I have a CLI installed on my system, please leverage that for caching", which is the same thing the AWS SDKs do if you have an AWS CLI installed and configured on your system, though you can still specify whatever authentication options you prefer and override that). Alternatively, you provide a class with conventional methods that do serialization and deserialization (example: `persistence=DatabaseStorage` or something, which would be a class with, by convention, a "save" and a "load" method that would serialize and deserialize the cache for you).
This way the developer who uses the SNET CLI and has it configured could use that in their (Python) client application, while the user who runs their service on whatever distributed Docker cluster could persist to an external database / NFS mount / DynamoDB / whatever, so that if they destroy / re-create containers, they can all still access and leverage the same cache for the whole cluster.
There's an issue open that specifically regards this which is #7
By the way, I think we should have the same approach with authentication.
Meaning: you could just specify "cli" as an authentication method (as an alternative to "private_key" or "mnemonic" or whatever), at which point the SDK would look at your CLI configuration and use whatever authentication method you specified there as the default identity.
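A rough sketch of what that fallback could look like. The config path `~/.snet/config` and the `identity.default` section/key names are assumptions about the CLI layout, not a documented interface:

```python
# Illustrative sketch only of the "cli" authentication fallback idea.
# The config path and section/key names are assumptions, not the actual
# snet-cli layout.
import configparser
import os


def resolve_identity(auth_method, private_key=None,
                     cli_config_path="~/.snet/config"):
    """Return an (identity_type, secret) pair for the chosen auth method."""
    if auth_method == "private_key":
        return ("private_key", private_key)
    if auth_method == "cli":
        # Reuse whatever default identity the installed CLI is configured with.
        parser = configparser.ConfigParser()
        parser.read(os.path.expanduser(cli_config_path))
        section = parser["identity.default"]  # assumed section name
        return (section.get("identity_type"), section.get("private_key"))
    raise ValueError("unknown auth method: %s" % auth_method)
```

The point is simply that "cli" becomes one more value of the same parameter, not a separate configuration mechanism.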
Let's get started with the refactor effort as proposed by Sergey and look to build a common dependency layer across the SDK and CLI.
@vsbogd @vforvalerio87 I think you're slightly overcomplicating things regarding the cache. I want to underline that it is rather cheap to find channels (~5 sec) and it is rather cheap to download metadata and recompile protobuf (~5 sec). We obviously shouldn't do it at each call, or even at each application start, but there is no reason to make something complicated here... We already have an implementation of dynamical caching in snet-cli, and the directory can be made optional if you want...
I will extract the following functionality from `snet-cli`:

- `org_id/service_id` (with dynamical caching)

As a first step I will simply isolate these functions inside snet-cli without putting them in a separate repo. @raamb Do you agree?
(I've removed item 3 from my previous post because it is irrelevant now... )
> As a first step I will simply isolate these functions inside snet-cli without putting them in separate repo. @raamb Do you agree?
Agree with this. We can start pulling out code.
> @vsbogd @vforvalerio87 I think your slightly over complicate things about cache. I want to underline that it is rather cheap to find channels (~5 sec) and it is rather cheap to download metadata and recompile protobuf (~5 sec). We obviously shouldn't do it at each call, or even at each application start, but there is no reason to make something complicated here.... We already have implementation of dynamical caching in snet-cli and directory can be made optional if you want...
The point here is that the caching strategy should be pluggable. The first version would be the one you describe, dynamically downloading to a project-specific directory, but the solution should be able to support other options as required. The point @vforvalerio87 made about Lambda is quite valid, as we have seen interest in some Telegram groups in building out services on a serverless framework. We should be able to support any persistent store, which is what the pluggable approach affords.
Same... literally everybody I spoke to asked how to set it up with Lambda, and nobody will have the snet CLI installed on the server where the service runs. Having the SDK share configuration / cache with the CLI is just a convenience for developers who test locally on their computer.
I agree about extracting all the code you mentioned so it can be used in both places.
@vforvalerio87 @raamb Sorry guys, but I still don't quite get why the existing solution with dynamical caching is not enough for you... You simply specify the directory where the cache is stored. It works totally fine in Docker (the cache will be rebuilt on the fly after a restart)....
In Docker, yes. In Lambda the cache would be lost at every call (same with Azure Functions or Google Cloud Functions).
Cool. But I assume that the filesystem is a common way to represent information in Linux, isn't it?
If there is no means to have any persistent storage in Lambda, then the only solution is to have a "static" cache. So you simply create a container with all information (including cached channels and compiled protobufs) inside. If this information becomes outdated it will be automatically replaced on each call, but there is no magical solution for that... Is there?
In any case we already support it...
Ah yes... Am I right that you are saying that for persistent storage in Lambda it is better to use some kind of DB rather than a filesystem?
As I was suggesting above and in issue #7, there is a very simple solution: the user provides a class with a pre-defined interface which takes care of serializing and deserializing the application state.

Example:

```python
from snet import Snet
from state_cache import MySQLCache
snet = Snet(persistence_engine=MySQLCache, ...)
```

`MySQLCache` is a custom class provided by the user which must expose at least the following methods: `save` and `load`. `save` takes a key (example: "logs") and a dict; `load` takes a key (example: "logs") and returns a dict.

So then in the snet_sdk, when I want to cache information, I'll do for example `self.persistence_engine.save("logs", logs)` or `self.persistence_engine.load("logs")`.

In `__init__` for the class I'll check if the class provided by the user implements the methods I require with the signature I expect. If the user specifies nothing, we use the file system.
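A minimal sketch of that duck-typing check, with assumed names (a dict stands in for the real filesystem/pickle default), not the real SDK:

```python
# Sketch (assumed names) of the duck-typing check described above: verify
# the user-supplied persistence engine exposes save/load, falling back to
# a filesystem-style engine when nothing is provided.
class FileSystemEngine:
    """Default engine used when the user specifies nothing; a dict stands
    in here for pickle files on disk."""

    def __init__(self):
        self._store = {}

    def save(self, key, value):
        self._store[key] = value

    def load(self, key):
        return self._store.get(key)


class Snet:
    def __init__(self, persistence_engine=None):
        # The user passes a class; instantiate it, or fall back to the default.
        engine = persistence_engine() if persistence_engine else FileSystemEngine()
        # Conventional interface check: callable save and load methods.
        for name in ("save", "load"):
            if not callable(getattr(engine, name, None)):
                raise TypeError(
                    "persistence engine must implement %s(key, ...)" % name)
        self.persistence_engine = engine
```

Failing fast in `__init__` keeps the error close to the misconfiguration instead of surfacing it on the first cache access.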
With this we can also offer the user the possibility to avoid asking the channel state from the daemon completely, making calls faster. You store the state locally, and even if you have multiple machines using the same private key it's not an issue because they use the same storage for things like channel state, so if client1 makes a call and the state is updated, client2 can take the information from the same storage without asking the daemon all the time. It's 1 grpc call every service call instead of 2.
If something happens on the daemon side, for example the service provider makes a claim, we should have a background process (async) in the SDK that gets the logs and the channel information from the blockchain every time a new block is mined.
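A rough sketch of such a background refresher using asyncio. `get_latest_block` and `fetch_channel_state` are invented placeholders for real web3/daemon calls, not actual SDK functions:

```python
# Sketch (invented names) of the async background refresher described
# above: poll for new blocks and refresh cached channel state whenever
# one is mined.
import asyncio


async def watch_channels(get_latest_block, fetch_channel_state, cache,
                         poll_interval=1.0):
    """Run until cancelled, refreshing the cache on every new block."""
    last_block = None
    while True:
        block = await get_latest_block()
        if block != last_block:
            # A new block was mined: re-read channel state from the chain.
            cache["channel_state"] = await fetch_channel_state()
            last_block = block
        await asyncio.sleep(poll_interval)
```

A client would start this with `asyncio.ensure_future(...)` alongside its normal calls and cancel the task on shutdown.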
@vforvalerio87
> With this we can also offer the user the possibility to avoid asking the channel state from the daemon completely, making calls faster. You store the state locally, and even if you have multiple machines using the same private key it's not an issue because they use the same storage for things like channel state, so if client1 makes a call and the state is updated, client2 can take the information from the same storage without asking the daemon all the time. It's 1 grpc call every service call instead of 2.
Unless you use some locking mechanism, that is how you get race conditions ;-)
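To spell out the race being warned about: two clients doing an unsynchronized read-modify-write of shared channel state (here, a nonce) can both sign with the same value. Serializing the critical section with a lock avoids it. All names below are illustrative, not the real SDK:

```python
# Sketch of the shared-state race: without a lock, two threads could read
# the same nonce before either writes the increment back, producing a
# duplicate payment signature. The lock serializes read-modify-write.
import threading

shared_state = {"nonce": 0}  # stand-in for channel state in shared storage
lock = threading.Lock()
used_nonces = []


def make_call():
    with lock:  # remove this lock and duplicates become possible
        nonce = shared_state["nonce"]
        shared_state["nonce"] = nonce + 1
        used_nonces.append(nonce)
    # ... build and send the signed request using `nonce` ...


threads = [threading.Thread(target=make_call) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock, every one of the 50 calls got a distinct nonce.
assert sorted(used_nonces) == list(range(50))
```

For multiple machines sharing an external store, the same reasoning applies, but the lock has to live in the store itself (e.g. a row lock or compare-and-swap) rather than in process memory.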
The SDK lacks some features which already exist in `snet-cli` (dynamical caching of everything (channels/service metadata/compiled protobuf), time based strategies for gas price, support of different types of identities, etc). Instead of re-implementing all these features in the SDK, we should isolate them in a separate library and use them in the SDK and in `snet-cli` together.

The general strategy should be as follows: if we need some functionality in the SDK and this functionality is already present in `snet-cli`, we should take it from `snet-cli`. If we don't like how it is implemented in `snet-cli`, we re-implement it, but we also replace the old functionality in `snet-cli`.

@vforvalerio87 @raamb do you agree with this approach?

Functionality which is already present in snet-cli, but lacking in the SDK:

- `identity.py` (@vforvalerio87 if you don't like how it is implemented, it should be re-implemented in such a way that snet-cli could also use it)
- `utils_agi2cogs.py`, some functions from `utils.py`
- `mpe_client_command.py`