EkaSe / Calculator


Collections: Common static methods #3

Closed acrm closed 7 years ago

acrm commented 7 years ago

To avoid duplicating the same query methods in every collection type, create a class MyEnumerableExtension that contains the following methods:

  1. public static IMyEnumerable<T> Where<T>(IMyEnumerable<T> collection, Func<T, bool> predicate)
  2. public static IMyEnumerable<T2> Select<T1, T2>(IMyEnumerable<T1> collection, Func<T1, T2> select)
  3. public static T[] ToArray<T>(IMyEnumerable<T> collection)
  4. public static MyList<T> ToList<T>(IMyEnumerable<T> collection)
  5. public static MyDictionary<K, V> ToDictionary<T, K, V>(IMyEnumerable<T> collection, Func<T, K> keySelector, Func<T, V> valueSelector) - for each item of type T, builds a key/value pair by applying the corresponding selectors to the item, and adds the pair to a new MyDictionary instance.
  6. public static T FirstOrDefault<T>(IMyEnumerable<T> collection)
  7. public static T FirstOrDefault<T>(IMyEnumerable<T> collection, Func<T, bool> predicate)

Cover the methods with tests that demonstrate different cases of their usage.

UPD: all methods should be static
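As a rough illustration of the shape such a class could take, here is a minimal sketch of three of the listed methods. The IMyEnumerable / IMyEnumerator members are an assumption inferred from the Enumerator, HasNext, Next() and Current members used in the code quoted later in this thread, and ArraySequence is a hypothetical helper collection that exists only to make the sketch runnable.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the custom interfaces (not confirmed by the repository).
public interface IMyEnumerator<T>
{
    bool HasNext { get; }   // true while at least one more item is available
    void Next();            // advances to the next item
    T Current { get; }      // item selected by the last Next() call
}

public interface IMyEnumerable<T>
{
    IMyEnumerator<T> Enumerator { get; }
}

// Hypothetical array-backed collection, used only for this sketch.
public class ArraySequence<T> : IMyEnumerable<T>
{
    private readonly T[] items;
    public ArraySequence(T[] items) { this.items = items; }
    public IMyEnumerator<T> Enumerator { get { return new ArrayEnumerator(items); } }

    private class ArrayEnumerator : IMyEnumerator<T>
    {
        private readonly T[] items;
        private int index = -1;
        public ArrayEnumerator(T[] items) { this.items = items; }
        public bool HasNext { get { return index + 1 < items.Length; } }
        public void Next() { index++; }
        public T Current { get { return items[index]; } }
    }
}

public static class MyEnumerableExtension
{
    // Returns the first item that satisfies the predicate, or default(T) if none does.
    public static T FirstOrDefault<T>(IMyEnumerable<T> collection, Func<T, bool> predicate)
    {
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            if (predicate(enumerator.Current))
                return enumerator.Current;
        }
        return default(T);
    }

    // Returns the first item of the collection, or default(T) if it is empty.
    public static T FirstOrDefault<T>(IMyEnumerable<T> collection)
    {
        return FirstOrDefault(collection, item => true);
    }

    // Copies all items into a new array, using a standard List<T> as a growable buffer.
    public static T[] ToArray<T>(IMyEnumerable<T> collection)
    {
        var buffer = new List<T>();
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            buffer.Add(enumerator.Current);
        }
        return buffer.ToArray();
    }
}
```

With these definitions, MyEnumerableExtension.FirstOrDefault(new ArraySequence<int>(new[] { 3, 8, 12 }), x => x > 5) returns 8, and the same call on an empty sequence returns 0, the default value of int.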

acrm commented 7 years ago

Additional task 1: Fluent style

Overview

It is good to have a class that contains all the query methods, which depend only on the IMyEnumerable interface and not on any specific collection implementation. But using such a class looks redundant, because the class name is repeated in every call and the intermediate result values are generally unnecessary:

var result0 = MyEnumerableExtension.ToDictionary<int, int, string>(collection, arg => arg, toBinAsString);
var result1 = MyEnumerableExtension.Select(result0, countOnesInValues);
var result2 = MyEnumerableExtension.Where<KeyValuePair<int, int>>(result1, arg => arg.Value < 3);
var result3 = MyEnumerableExtension.Select<KeyValuePair<int, int>, int>(result2, arg => arg.Key);
var result4 = MyEnumerableExtension.Where(result3, containsDigitSix);

This looks just like old-fashioned 'procedural programming', where data is passed to procedures as parameters and returned from them. The object-oriented concept of a 'method' was designed to move from the 'data-to-procedure' paradigm to the 'method-of-object' paradigm, to reduce the amount of data passed as parameters and to make code look more natural.

Fortunately, we can get the advantages of method style over procedure style without moving this logic back into the collection classes. C# supports this through extension methods. An extension method is a static method, declared in a static class, whose first parameter is marked with the this modifier. It can be called as if it were an instance method of that parameter's type, without moving its code into that type's definition. This helps to keep some logic separate from a class while still using it with class instances in a natural way. It can also be used to extend the functionality of a class without modifying the class at all.

OK, that removes the repeated class name. But what about the intermediate results? They can be avoided too, simply by chaining method calls. This is possible because the query methods all return a type that implements IMyEnumerable, which is also the type of the instances they can be called on. It looks like this:

var finalResult = collection.ToDictionary(...).Select(...).Where(...);

For convenience, the calls in such a chain are usually placed on separate lines, each starting with the dot operator:

var finalResult = collection
    .ToDictionary(...)
    .Select(...)
    .Where(...);

This style of method calls, and an interface design that supports it, is called a fluent interface.
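The following is a minimal sketch of how the this modifier turns the static methods into chainable ones. The interface members and the ArraySequence helper are assumptions made for illustration (the real IMyEnumerable and collections may differ), and Where and Select are kept eager here for brevity.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the custom interfaces (not confirmed by the repository).
public interface IMyEnumerator<T> { bool HasNext { get; } void Next(); T Current { get; } }
public interface IMyEnumerable<T> { IMyEnumerator<T> Enumerator { get; } }

// Hypothetical array-backed collection, used only for this sketch.
public class ArraySequence<T> : IMyEnumerable<T>
{
    private readonly T[] items;
    public ArraySequence(T[] items) { this.items = items; }
    public IMyEnumerator<T> Enumerator { get { return new ArrayEnumerator(items); } }

    private class ArrayEnumerator : IMyEnumerator<T>
    {
        private readonly T[] items;
        private int index = -1;
        public ArrayEnumerator(T[] items) { this.items = items; }
        public bool HasNext { get { return index + 1 < items.Length; } }
        public void Next() { index++; }
        public T Current { get { return items[index]; } }
    }
}

public static class MyEnumerableExtension
{
    // "this" on the first parameter allows the call syntax collection.Where(...).
    public static IMyEnumerable<T> Where<T>(this IMyEnumerable<T> collection, Func<T, bool> predicate)
    {
        var buffer = new List<T>();
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            if (predicate(enumerator.Current))
                buffer.Add(enumerator.Current);
        }
        return new ArraySequence<T>(buffer.ToArray());
    }

    public static IMyEnumerable<T2> Select<T1, T2>(this IMyEnumerable<T1> collection, Func<T1, T2> select)
    {
        var buffer = new List<T2>();
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            buffer.Add(select(enumerator.Current));
        }
        return new ArraySequence<T2>(buffer.ToArray());
    }

    public static T[] ToArray<T>(this IMyEnumerable<T> collection)
    {
        var buffer = new List<T>();
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            buffer.Add(enumerator.Current);
        }
        return buffer.ToArray();
    }
}

public static class FluentDemo
{
    // Keeps the even numbers, multiplies them by ten, and materializes the result.
    public static int[] Run()
    {
        IMyEnumerable<int> collection = new ArraySequence<int>(new[] { 1, 2, 3, 4, 5, 6 });
        return collection
            .Where(x => x % 2 == 0)
            .Select(x => x * 10)
            .ToArray();
    }
}
```

FluentDemo.Run() yields the array 20, 40, 60. The old call form MyEnumerableExtension.Where(collection, predicate) still compiles, so existing tests written against the static class do not have to change.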

Task

References

Extension methods
Fluent interface

acrm commented 7 years ago

Additional task 2: Extra extension methods

It also might be interesting to convert this code

int count = 0;
for (int i = 0; i < arg.Value.Length; i++) {
    if (arg.Value [i] == '1')
       count++;
}

return count;

into

return arg.Value
    .AsEnumerable()
    .Where(ch => ch == '1')
    .Aggregate(0, (accumulator, item) => accumulator + 1);

or even

return arg.Value
    .AsEnumerable()
    .Count(ch => ch == '1');

Try to understand the semantics of both new methods, implement them, and use them in tests.
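One possible reading of these semantics is sketched below: Aggregate folds a sequence into a single value by repeatedly combining an accumulator with the next item, and Count can then be expressed through it. The interface members, the StringSequence adapter, and the parameter order of the Aggregate callback (accumulator first, as in the standard LINQ method) are assumptions made for this sketch.

```csharp
using System;

// Assumed shape of the custom interfaces (not confirmed by the repository).
public interface IMyEnumerator<T> { bool HasNext { get; } void Next(); T Current { get; } }
public interface IMyEnumerable<T> { IMyEnumerator<T> Enumerator { get; } }

// Hypothetical adapter that exposes the characters of a string as IMyEnumerable<char>.
public class StringSequence : IMyEnumerable<char>
{
    private readonly string text;
    public StringSequence(string text) { this.text = text; }
    public IMyEnumerator<char> Enumerator { get { return new StringEnumerator(text); } }

    private class StringEnumerator : IMyEnumerator<char>
    {
        private readonly string text;
        private int index = -1;
        public StringEnumerator(string text) { this.text = text; }
        public bool HasNext { get { return index + 1 < text.Length; } }
        public void Next() { index++; }
        public char Current { get { return text[index]; } }
    }
}

public static class MyEnumerableExtension
{
    public static IMyEnumerable<char> AsEnumerable(this string text)
    {
        return new StringSequence(text);
    }

    // Starts from seed and replaces the accumulator with
    // func(accumulator, item) for every item of the sequence.
    public static TAcc Aggregate<T, TAcc>(this IMyEnumerable<T> collection, TAcc seed, Func<TAcc, T, TAcc> func)
    {
        TAcc accumulator = seed;
        var enumerator = collection.Enumerator;
        while (enumerator.HasNext)
        {
            enumerator.Next();
            accumulator = func(accumulator, enumerator.Current);
        }
        return accumulator;
    }

    // Counts the items that satisfy the predicate, expressed through Aggregate.
    public static int Count<T>(this IMyEnumerable<T> collection, Func<T, bool> predicate)
    {
        return collection.Aggregate(0, (accumulator, item) => predicate(item) ? accumulator + 1 : accumulator);
    }
}
```

For the string "10110", the call "10110".AsEnumerable().Count(ch => ch == '1') returns 3. Note that the callback has to return the new accumulator value; a post-increment such as accumulator++ would return the old value and leave the total at the seed.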

acrm commented 7 years ago

Additional task 3: Enumerating-on-demand instead of storing results

Overview

It is good that you tried to avoid using the MyList collection inside the extension method:

Type collectionType = collection.GetType ();
var result = Activator.CreateInstance (collectionType);
var enumerator = collection.Enumerator;
while (enumerator.HasNext) {
    enumerator.Next ();
    if (predicate (enumerator.Current))
        ((IMyEnumerable<T>) result).Add (enumerator.Current);
}
return (IMyEnumerable<T>) result;

/*var enumerator = collection.Enumerator;
MyList <T> result = new MyList<T> ();
while (enumerator.HasNext) {
    enumerator.Next ();
    if (predicate (enumerator.Current))
        result.Add (enumerator.Current);
}
return (IMyEnumerable <T>) result;*/

But even better than storing the result items in a collection of the same type is not storing anything in an extra collection at all. Such storage can double memory usage, and for large collections this may become a real problem. In fact, we do not need the result values to be stored anywhere at this step. We only need them to be enumerable on demand. We can achieve this by creating a custom wrapper class for each extension method. Each wrapper class implements the IMyEnumerable interface and returns a special enumerator class, which holds a reference to the source collection and processes the next item when its Next() method is called. That is, the next result item is generated only when the calling code requests it (e.g. while iterating over our results in foreach).
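A minimal sketch of such a wrapper for Where might look like the following. The interface members are assumed from the snippet above, and the names WhereEnumerable, WhereEnumerator and ArraySequence are hypothetical. Because HasNext has to know whether another matching item exists, this version looks ahead in the source and caches at most one item.

```csharp
using System;

// Assumed shape of the custom interfaces (not confirmed by the repository).
public interface IMyEnumerator<T> { bool HasNext { get; } void Next(); T Current { get; } }
public interface IMyEnumerable<T> { IMyEnumerator<T> Enumerator { get; } }

// Hypothetical array-backed collection, used only for this sketch.
public class ArraySequence<T> : IMyEnumerable<T>
{
    private readonly T[] items;
    public ArraySequence(T[] items) { this.items = items; }
    public IMyEnumerator<T> Enumerator { get { return new ArrayEnumerator(items); } }

    private class ArrayEnumerator : IMyEnumerator<T>
    {
        private readonly T[] items;
        private int index = -1;
        public ArrayEnumerator(T[] items) { this.items = items; }
        public bool HasNext { get { return index + 1 < items.Length; } }
        public void Next() { index++; }
        public T Current { get { return items[index]; } }
    }
}

// Wrapper: stores only the source and the predicate, never the results.
public class WhereEnumerable<T> : IMyEnumerable<T>
{
    private readonly IMyEnumerable<T> source;
    private readonly Func<T, bool> predicate;
    public WhereEnumerable(IMyEnumerable<T> source, Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }
    public IMyEnumerator<T> Enumerator
    {
        get { return new WhereEnumerator<T>(source.Enumerator, predicate); }
    }
}

public class WhereEnumerator<T> : IMyEnumerator<T>
{
    private readonly IMyEnumerator<T> source;
    private readonly Func<T, bool> predicate;
    private T lookahead;        // next matching item, valid only when hasLookahead is true
    private bool hasLookahead;
    private T current;

    public WhereEnumerator(IMyEnumerator<T> source, Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }

    public bool HasNext
    {
        get
        {
            // Pull from the source only until the next match is found.
            while (!hasLookahead && source.HasNext)
            {
                source.Next();
                if (predicate(source.Current))
                {
                    lookahead = source.Current;
                    hasLookahead = true;
                }
            }
            return hasLookahead;
        }
    }

    public void Next()
    {
        if (!HasNext) throw new InvalidOperationException("No more items.");
        current = lookahead;
        hasLookahead = false;
    }

    public T Current { get { return current; } }
}

public static class MyEnumerableExtension
{
    // Does no filtering itself; the work happens while the result is enumerated.
    public static IMyEnumerable<T> Where<T>(this IMyEnumerable<T> collection, Func<T, bool> predicate)
    {
        return new WhereEnumerable<T>(collection, predicate);
    }
}
```

With this version, calling Where on the sequence 1..6 with an "is even" predicate does not invoke the predicate at all. Requesting the first result invokes it twice (for 1 and 2) and stops there.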

Tasks

acrm commented 7 years ago

Additional task 4: Goodbye, IMyEnumerable!

Overview

Well, now we have a set of query methods for collections that can be chained into a data-processing pipeline in fluent style, and all processing is performed lazily: on demand, when results are requested. If we need only the first five elements of the result sequence, only those five elements will be calculated, even if the sequence is infinite. Instead of just creating a result from the source collection and delivering it to the consumer, we create a result generator and can use it in whatever context we need it. This is a higher level of abstraction, and sometimes it is useful.

But so far we have these lazy generators only for the Where() and Select() methods. Should we rewrite the other methods in the same manner? For methods that do not return an IMyEnumerable (like ToArray() or Count()) it makes no sense to do so. But if we add more methods that transform one IMyEnumerable into another (e.g. SkipElements(int count) or OrderAscending(Func<T, TKey> orderingKeySelector)), we will have to write special wrappers and enumerators again. That is not easy, and it makes the implementation code more complicated.

Fortunately, C# supports a special statement: yield. With it you can write your method as a simple loop, without extra code, and the compiler generates all the enumerator classes behind the scenes. But there is a constraint: the return type of the method must be IEnumerable or IEnumerable<T> (or the corresponding IEnumerator type). For us this means replacing every occurrence of our IMyEnumerable with the standard interface. It is not a big loss: the standard interface has exactly the same semantics, and we avoided it at first only to practice with our own interface.
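As a sketch of what the yield-based versions could look like, the block below rewrites Where and Select and adds the SkipElements method mentioned above, all over the standard IEnumerable<T>. The namespace MyCollections and the YieldDemo class are hypothetical. The namespace is there so that calls made inside it resolve to these methods rather than to the System.Linq ones of the same name, should that namespace also be imported.

```csharp
using System;
using System.Collections.Generic;

namespace MyCollections
{
    public static class MyEnumerableExtension
    {
        // The compiler turns each method into a hidden enumerator class;
        // the loop body runs only as far as the caller asks for items.
        public static IEnumerable<T> Where<T>(this IEnumerable<T> collection, Func<T, bool> predicate)
        {
            foreach (var item in collection)
                if (predicate(item))
                    yield return item;
        }

        public static IEnumerable<T2> Select<T1, T2>(this IEnumerable<T1> collection, Func<T1, T2> select)
        {
            foreach (var item in collection)
                yield return select(item);
        }

        // Drops the first count items and passes the rest through.
        public static IEnumerable<T> SkipElements<T>(this IEnumerable<T> collection, int count)
        {
            foreach (var item in collection)
            {
                if (count > 0)
                {
                    count--;
                    continue;
                }
                yield return item;
            }
        }
    }

    public static class YieldDemo
    {
        // Infinite sequence 1, 2, 3, ... produced one item at a time.
        public static IEnumerable<int> Naturals()
        {
            for (int i = 1; ; i++)
                yield return i;
        }

        // Takes only the first five results of a pipeline over an infinite source.
        public static List<int> FirstFiveEvenSquares()
        {
            var result = new List<int>();
            foreach (var value in Naturals().Where(x => x % 2 == 0).Select(x => x * x))
            {
                result.Add(value);
                if (result.Count == 5)
                    break;
            }
            return result;
        }
    }
}
```

YieldDemo.FirstFiveEvenSquares() returns 4, 16, 36, 64, 100 and terminates even though Naturals() never ends, because the foreach stops requesting items after the fifth result.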

Tasks

Optional tasks

References

MSDN: yield
Habrahabr: Yield: what, where, and why

acrm commented 7 years ago

You can merge branch as soon as you are ready.