dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.06k stars 2.03k forks source link

Storage providers contract is hard to test and tightly coupled with Orleans runtime. #2115

Closed centur closed 2 years ago

centur commented 8 years ago

This is an issue as a follow up of two things - our attempt to implement own "safe" blob provider (lil bit incompartible with an existing one) and the discussion with @ReubenBond about design principles and IStorageProvider interface's public contract. So may not be always coherent :).

The problem is that IStorageProvider takes hard dependency on GrainReference class which, based on it's implementation and visibility considered pretty internal and coupled to Orleans Runtime.

public interface IStorageProvider : IProvider
  {
    Logger Log { get; }
    Task ReadStateAsync(string grainType, GrainReference grainReference, IGrainState grainState);
    Task WriteStateAsync(string grainType, GrainReference grainReference, IGrainState grainState);
    Task ClearStateAsync(string grainType, GrainReference grainReference, IGrainState grainState);
  }

I can see 2 problems with this interface

  1. Logger is a real class, not an abstraction ( but that's fine, it's 'pluggable' so one can fake it easy)
  2. GrainReference is tightly coupled. This is a bigger bummer.

When we tried to unit test StorageProvider in isolated environment (with real Azure backed storage but outside of Orleans Runtime) - we can't construct GrainReference intance. Moreover - to test it properly, we need to construct GrainReference with a certain properties - so we can test interactions of our Storage Provider implementation with this particular instance.

There is no other valid way to construct it, via from public static GrainReference FromKeyString(string key) - but this is a chicken and egg problem - we need a valid key or at least an insight on how to make it (internal implementation details from instance method GrainReference.ToKeyString()

From the perspective of StorageProvider - these details are unnecessary, and Storage provider needs only a way to get some metadata about the Grain - Grain real type, unique ID that is stable for this grain type and maybe something else.

Wouldn't it be better to change (this is a breaking change) this type to an interface which provides access to this information about the grain instance ? E.g.

public interface IGrainIdentity
{
  string GetGrainType()
  string GetGrainUniqueId()
  string GetGrainFooDetails()
  string GetGrainBazDetailsFor(Context)
}

These parts are from the discussion (don't want to re-type it)

this is kind of a violation of OOP - if it's part of the public contract of an interface - it must be independently constructed. I can't even mock it for my purposes - no parameterless ctor and making of FakeGrainRef:GrainReference leaves me to chicken and egg problem - protected ctor accepts only another GrainReference or Serialization context stuff... I can get one via static GrainReference.FromKeyString but this is again - chicken and egg - I can't use it because it relies on internal implementation - how can I write the state to test public ReadStateAsync if I can't fake it's location...

Reuben Bond >Orleans could provide utils for mocking that type?

Making it an interface would be a valid path, but interface methods may be a bit quirky...

Reuben Bond> rather avoid that if possible Reuben Bond> I think it would be better to just give it a protected default constructor Reuben Bond> and to make the public methods virtual. Or extract a limited interface from it. Or (probably better) refactor that code to use a GrainId and expose GrainId

...well, making it in the public contract and making implementer of this contract is definitely not a very good thing. I'd say it OCP violation - system is closed for an easy and testable extension of it. But everyone can read principles differently. Short story - public contract is tightly coupled to Orleans runtime internals and implementation details so it's impossible to test pluggable component without plugging it in. And if it's plugged in - it's hard to simulate bad cases and see how component reacts on it.

Orleans may provide some aiding utils, but that doesn't removes the issue itself, it just eases the pain a bit. Which is great short term solution but bad in the long term - dull pain will never be sharp enough to say - "f... that, it stinks and must be changed".

Reuben Bond > Yeah, the point of that parameter is the grain id anyhow, so either we should provide that as a string or as a GrainId class I'm all for changing things, especially for 2.0.

The reality that the dependency on a GrainReference and GrainType (as a string) is to allow implementer uniquely identify the grain it's being called for and write\read data there. It's much better to have a specific interface with breadths of methods to retrieve various 'meta' about the grian.

while all these are nice - this doesn't help me to solve an actual issue I have - to test Storage provider in isolated and mocked environment, cause I can't construct a mandatory input for it. I don't have any other ways except take an existing KeyString and hardcode it into my tests so I can call GrainReference.FromGrainKey() . Not sure how any utils besides this method would help.


Any thoughts ? Ideas ? Opinions ?

ReubenBond commented 8 years ago

IGrainIdentity currently exists as:

public interface IGrainIdentity
{
  Guid PrimaryKey { get; }
  long PrimaryKeyLong { get; }
  string PrimaryKeyString { get; }
  string IdentityString { get; }
  long GetPrimaryKeyLong(out string keyExt);
  Guid GetPrimaryKey(out string keyExt);
}

Information about which of those properties will throw and which will not throw is unknowable with that interface alone - maybe we should change that and then refactor the provider interfaces to leverage that instead.

centur commented 8 years ago

Also to add a bit - it'd be nice if Storage provider can access the data to original state type so we can avoid coupling code with data (see my comment on #2105 for details).

centur commented 8 years ago

@ReubenBond this why it'd be nice to simplify it down to a single signature interface. String is pretty ubiquitos. As well as access to original Grain type is required - different grain types may use same state objects, especially if it's some kind of a generic structure

sergeybykov commented 8 years ago

I think your observation is correct - grain references aren't very test friendly. One complication is that they are expected to be produced by GrainFactory.GetGfain() that maps the requested grain interface to an implementation class and its type code, which is defined by the type map that client obtains from a silo gateway upon initialization.

I think to make this functionality properly testable we might need to make type resolution injectable and GrainClient initializable in isolation. But even that isn't enough - we need a way to produce a grain reference for a given grain interface pointing to a desired grain class implementing it.

centur commented 8 years ago

I'm not sure I understand how injectable GrainClient would help with this, can you please elaborate on this ? Why external provider should know about any of these injections or mock it for tests ?

Technically provider needs these details only to map a request to a single possible stored data entry, which is expected to be of a particular serialized type - this can be done by a very specific interface for storage contract - IGrainIdentity probably wasn't the best name I picked above, let's call it IStoredGrainIdentity to clearly separate it from an existing IGrainIdentity

sergeybykov commented 8 years ago

I meant that to test the current provider interface one needs to be able to construct correct grain references, which is only possible today via GrainFactory.GetGrain() and requires a type map to be initialized.

centur commented 8 years ago

Do we really need a correct GrainReference ? I'm curious to know how can it be used besides extracting Grain unique id from it (which is just a string produced by grainRefObj.ToKeyString()) Current storage provider is unaware about the context it's being run in - it receives info about unique grain, which is used to find a unique stored data and then it deserializes it as an object to a passed IGrainState.State. It can't even know what type it just created - this information is inside the serialization mechanism and the string data (which is not great also, but it's isolated from provider)

Actually this is the question about some plans for providers 2.0 - are they planned to be more knowledgeable about the context they are being run in (know the type of the state, be aware about concept of the grain etc) or they are going to delegate certain hooks to grains so we can have methods invoked by provider on the grain instance ?

veikkoeeva commented 8 years ago

@centur Some information quickly, I had this problem while working with AdoNetStorageProvider.

Specifically AdoNetStorage provider uses this algorithm to extract the primary key from GrainReference. The AdoNetStorageProvider is currently tested differently from the other providers. Specifically how to obtain GrainReferences for tests you can look at TestDataSets. It has utility function to generate various kinds of references, random or otherwise (here). There has been discussion about introducing a special kind of a testkit so tha providers could be tested independently of Orleas, discussion at https://github.com/dotnet/orleans/issues/2011.

As noted, the AdoNetProvider is tested currently differently than the others. A large part of it is that I tried to move to a direction that makes it easier to providers independent of Orleans, but also make testing storage providers easier in general. If you are interested, the higher level description on how I solved is here (look Testing). Specifically it might be possible you take take a module/subtree dependency on Orleans and use the test sets and utilities in the test project and do the following:

Copy-paste SqlServerStorageTests and rename appropriately. You don't have to inherit from RelationalStorageTests, but inherit the fixture. CommonFixture needs to be modified, it's the place to "ask for instances", a kind of poor-man's DI for XUnit.NET. When one asks for instance, the TestEnvironmentInvariant ensures the environment is set up or if not, a null connection will be eventually returned and the tests will be skipped. In your specific case it would use the class that tries to start an emulator or it could try a connection the actual Azure instance (there's an override for the settings via JSON). Now, some of this is half-way there, for this to work as-is, CommonFixture and TestEnvironmentInvariant would need to be modified. Also the way the grainType is generated could be just a string template that is stamped with desired information.

As I see it, the problem in the current test setup is that that test classes do not set up their own invariants only, but also that of the environment. It leads to tighter decoupling that should be needed and hence makes it more difficult to work on the direction of the testkit.

<edit: Writing while running, but just wanted to note I felt I got closer to independence, but not all the way. Ran out of time etc. The referenced AdoNetProvider thread has also some notes that could go to Storage Provider 2.0.

<edit 2: The SQL Server and MySQL tests use CommonStorageTests underneath, the rest is just XUnit.NET specific parts using theories and data sets, so the core tests are independent of the testing framework (or almost, some used in asserts, but not mandatory).

centur commented 8 years ago

@veikkoeeva I see you've abstracted it and inside your provider you operates only with your domain class, not using grainReference - this is cool, GrainTypeGenerator - it's what I hope to avoid - with the same result I can extract and copy implementation from GrainReference.ToKeyString and mimic string generation in the way that they will be parseable by FromKeyString() implementation. But this overal approach hinted me on a way how I can simplify this a little bit - by injecting (into ctor-for-testing yuck) some simple interface which will do the same for my code - IGrainRefConverter. And in tests I can inject special mocked version, but inside the Silo it will be initialized with a real proper GrainRefConverter. Still this GrainRefConverter is pretty untestable, although it's much smaller surface than I have now. Thanks for the hint, @veikkoeeva.

Your GrainTypeGenerator is great and this may be what can be done for improving unit testability of the existing StorageProviders (without chaos of breaking changes or 2.0) - make it public and good enough to cover corner cases and keep compartibility with FromKeyString implementation. Thus consumers can do something like this sut.ReadStateAsync(GrainGenerator.GrainReferenceFor<IMyRealGrain<MyState>>)

PS: My issue here is not in setting up the test harness - it's already there, with fixtures etc, but in a stable way to generate test input without getting into the implementation depths.

veikkoeeva commented 8 years ago

@centur We should start and issue on storage providers 2.0 and see what should go inside of them. Alas, I'm currently too thin on time to write a coherent and a thoughtful ticket (but I will if no one gets there first).

One part of discussion could be a set of XUnit.NET class generators and a "core tests", which after implementing one could be assured the provider is "Orleans certified". Perhaps even so that popular ones would be tested as part of Microsoft stress test suite (written in certain way, so could be plugged in and setup done in certain way, such as giving storage string in uniform way).

A second part is that it would be beneficial for the application consumer to have a hold on the concrete implementation either when creating one, inside bootstrapper or via the management interface (or all of them). This way storage specific features could be exploited.

A third part is that we could perhaps decouple (de)serialization logic from the storage providers as done already in the ADO.NET one. It decouples a lot of things already, not only (de)serialization. By introducing canonical interfaces we could, I think, decouple all serialization logic and have DI just to inject them on whatever basis. Perhaps this could be done with a more generalized interceptor pipeline that is applied to multiple points in Orleans codebase. I had to resort to what I did since there isn't DI yet. :)

centur commented 8 years ago

And DI there to instantiate it

veikkoeeva commented 8 years ago

@centur

And DI there to instantiate it

Just edited my previous one to make a note about DI. Constructor injection on storage provider would be handy. Though the Init probably exists because Orleans creates the providers internally.

<small>I just moved the energy inside me, work calls... </small>

sergeybykov commented 8 years ago

I see two mostly independent efforts here:

  1. How to make the current storage provider API more testable
  2. Better storage provider API

Do we really need a correct GrainReference ? I'm curious to know how can it be used besides extracting Grain unique id from it (which is just a string produced by grainRefObj.ToKeyString())

For 1, I believe we do need to test with correct grain references. Otherwise, how do we know the provider will handle them correctly? 2 is a different question where we probably want to pass a more explicit identity instead of a grain reference.

ghost commented 2 years ago

This issue has been marked stale for the past 30 and is being closed due to lack of activity.