Starcounter / Home

Starcounter is an in-memory database application engine.
https://starcounter.io
27 stars 1 forks source link

Deprecate transient properties of database classes #309

Closed miyconst closed 6 years ago

miyconst commented 7 years ago

Currently we support transient properties inside database classes as described in this issue - Preventing Fields From Becoming Database Columns.

There is a big problem with it.

Every read of a database reference returns a new CLR instance and this resets the transient values.

[Database]
public class Company {
    public string Name { get; set; }

    [Transient]
    public string Note { get; set; }
}

[Database]
public class Person {
    public string Name { get; set; }
    public Company WorksAt { get; set; }
}

public class Program() {
    public static void Main() {
        Person p = Db.FromId<Person>(1);

        p.Name = "Konstantin";
        p.WorksAt = new Company() { Name = "Starcounter" };
        p.WorksAt.Note = "until tomorrow.";

        // Prints "Konstantin works at Starcounter "
        // Every p.WorksAt access returns a new CLR instance
        Console.WriteLine($"{p.Name} works at {p.WorksAt.Name} {p.WorksAt.Note}");

        Company c = new Company();

        c.Name = "Starcounter";
        c.Note = "is a Swedish company";

        // Prints "Starcounter is a Swedish company"
        Console.WriteLine($"{c.Name} {c.Note}");
    }
}

We should either return the same CLR class on every reference read, or deprecate and eventually remove the transient properties inside the database class, otherwise it's going to hit us hard.

cc @warpech, @alemoi, @bigwad, @linkdata, @k-rus

linkdata commented 7 years ago

Never heard of that feature before. What's the use case and is anyone actually using it?

miyconst commented 7 years ago

I have never herd of someone actually using those as well, but PALMA tried to use it, and figured out this issue.

warpech commented 7 years ago

This functionality can be achieved with language features. such as Dictionary or ConditionalWeakTable? In my shallow understanding of why is this needed, I think this feature is redundant.

linkdata commented 7 years ago

Deprecate it. If noone complains after a few releases, remove it.

miyconst commented 7 years ago

This functionality can be achieved with language features. such as Dictionary or ConditionalWeakTable? In my shallow understanding of why is this needed, I think this feature is redundant.

@warpech you are absolutely right. There is no much extra value in this feature.

bigwad commented 7 years ago

There is no much extra value in this feature.

I think there is. The current implementation is, of course, inadequate, but I don't see it that trivial to implement such 'tagging' feature on a user level. Taking into consideration transaction isolation, conflict detection and garbage collection (I guess all these properties are actually expected by user code).

We still can deprecate it if we don't want to support it.

bigwad commented 7 years ago

I think @per-samuelsson might know design decisions behind it.

per-samuelsson commented 6 years ago

Every read of a database reference returns a new CLR instance and this resets the transient values.

Yes, that is by design. We don't want to cache proxies.

I think @per-samuelsson might know design decisions behind it.

It was by request years and years ago, and was used to do some computing and store values in instance variables if I remember it correctly. It has never been used to any great extent to my knowledge, and I'm fine with deprecating / removing it.

I can do it once someone take the decision what to do in what versions. I guess that's up to @miyconst. Like make it obsolete in 2.3.1 and dropping it in 2.4 or whatever. Just let me know.

miyconst commented 6 years ago

Decided by me to deprecate the feature in 2.3.1 and remove it in 2.4. @per-samuelsson please proceed.

erikvk commented 6 years ago

I occasionally use transient properties in my (Mopedo's) apps. They're, as @per-samuelsson mentioned, useful if instances of a database class are used for computation before being "dropped" to the database. During this computation, they may have other objects, possibly non-database objects, that are closely tied to them, that will not be relevant when the computation is over. It's commonly useful to assign these objects to transient properties of the database object.

However, I agree that the notion of transient properties is somewhat confusing and that they can introduce strange behavior.

per-samuelsson commented 6 years ago

Thanks for sharing @erikvk.

Also, I think Palma had some use-case - true @miyconst? Would be interesting to at least know what that was.

Anyway, I'm progressing as @miyconst decided if no one tell me otherwise.

eriksunsol commented 6 years ago

Some thoughts from a PALMA perspective...

(I would discourage using transient properties to store intermediate computation results, but depending on the application it might make sense. Usually I would consider the need for that a sign that the algorithms should be simplified or restructured to eliminate the need. Or wrap the objects in another non-database object during the calculations and keep intermediate results there.)

One practical use of [Starcounter.Transient] I can come up with is to cache things in the CLR instances, e.g. to address that each read of a property referencing a database object results in a new CLR instance:

    [Database]
    public class Foo
    {

    }

    [Database]
    public class Bar
    {
        public Foo _foo;

        [Starcounter.Transient]
        public Foo _fooCache;

        public Foo Foo
        {
            get
            {
                if (_fooCache == null)
                    _fooCache = _foo;

                return _fooCache;
            }
            set
            {
                _foo = value;
                _fooCache = value;
            }
        }
    }

This would be better to solve in the weaver though and avoid all that boiler plate in application code.

Combine that with the more obvious scenario when a getter runs at least one Db.SQL() statement, let’s say Foo is developed further:

    [Database]
    public class Foo
    {
        public int someValue;

        [Starcounter.Transient]
        private bool? _isMax;

        public bool IsMax
        {
            get
            {
                if (!_isMax.HasValue)
                    _isMax = !Db.SQL<int>("SELECT f.someValue FROM Foo f").Any(v => v > someValue);

                return _isMax.Value;
            }
        }
    }

Now if for some reason you want to access bar.Foo.IsMax inside a loop the performance gain from this kind of caching can be significant.

So, if the caching example from Foo can be implemented in the weaver, we do still get different CLR instances each time we run Db.SQL(), which I think is reasonably easy to accept. Then [Starcounter.Transient] is not needed for the Foo cache scenario, but it would still be needed for the Bar scenario. In a discussion last week between @miyconst, @dan31 and myself we talked about a more aspect oriented way to implement the Bar case. I'm not sure exactly what it would look like, but it would be a means to add application-defined caching of more complex aggregations possibly with many queries being run. Then [Starcounter.Transient] will not be needed to implement caching, but with all this in place, [Starcounter.Transient] would not be so confusing anymore, and the reason to deprecate it in the first place can be questioned.

Bottom line is, here's my suggested sequence of fixing things:

  1. (Don't deprecate [Starcounter.Transient].)
  2. Implement caching of object references in CLR instances in the weaver.
  3. Showcase use of [Starcounter.Transient] for caching query results and aggregations.
  4. Implement user-definable caching.
  5. Showcase other uses of [Starcounter.Transient].

I may be wrong, but implementing the first one I would guess is not a huge effort. I realize there may be other considerations against it, such as CLR objects using up more RAM. On the other hand given the use case described then RAM usage will actually be less because there will be fewer instances wasted on the heap. The second one is just documentation and examples. If you stop there the situation will be much better than what it is now, and there is no real hurry to implement 3 and 4.

3 is not specific to running a Starcounter database, and in fact I could argue that it is not even needed as there is a commonly used pattern for it anyway. Number 1 is different because currently the weaver modifies the behavior of standard fields and auto-properties in a way that no one would normally expect. What looks like C# should behave like C#!

miyconst commented 6 years ago

@eriksunsol you have a point, and you suggestion is valid, but unfortunately we currently don't have resources to implement it.

@per-samuelsson please proceed with deprecating transient database properties.

eriksunsol commented 6 years ago

@miyconst @per-samuelsson please stop and think about it just one more minute.

As is apparent from the discussion above, few people are even aware of [Starcounter.Transient]. Majority of those who are (outside Starcounter) probably found the documentation page (I did). That page does not currently explain the caveats, namely what the life cycle of CLR instances looks like, and since the life cycle is non-intuitive, people get confused. Add proper explanations of CLR instance life cycle to the docs and people will get less confused.

Not perfect but maybe good enough.

Also, if [Starcounter.Transient] is deprecated and then removed, it will be impossible to implement caching in user code. For PALMA this is something actually being considered. So my question is: What do I get instead that lets med do that? And if the answer is "nothing", then please put deprecation of [Starcounter.Transient] on hold until that question has a more satisfactory answer.

miyconst commented 6 years ago

@eriksunsol let's talk offline on Monday. @per-samuelsson please postpone this issue due to the high interest from @eriksunsol.

per-samuelsson commented 6 years ago

@eriksunsol let's talk offline on Monday.

Seem like a good plan.

@per-samuelsson please postpone this issue due to the high interest from @eriksunsol.

Will do. I'll keep a watch on this thread on how to, and when, to proceed.

miyconst commented 6 years ago

Today me and @eriksunsol has a discussion and agreed that:

Those, we do not deprecate it in 2.3.1, but may deprecate it later. Putting the issue on hold.

miyconst commented 6 years ago

After another discussion between @eriksunsol, @dan31 and me, it was decided not to deprecate the transient properties, but rather document it's behavior to avoid confusion.