acrm closed 7 years ago
It's good to have a class containing all the query methods, which depend only on the IMyEnumerable interface and not on a specific collection implementation. But using such a class looks redundant, because the class name is repeated again and again and the intermediate result values are generally unnecessary.
var result0 = MyEnumerableExtension.ToDictionary <int, int, string> (collection, arg => arg, toBinAsString);
var result1 = MyEnumerableExtension.Select (result0, countOnesInValues);
var result2 = MyEnumerableExtension.Where <KeyValuePair <int, int>> (result1, arg => arg.Value < 3);
var result3 = MyEnumerableExtension.Select <KeyValuePair <int, int>, int> (result2, arg => arg.Key);
var result4 = MyEnumerableExtension.Where (result3, containsDigitSix);
It looks just like old-fashioned 'procedural programming', where data is passed to procedures as parameters and returned from them. The object-oriented concept of a 'method' was designed to move from the 'data-to-procedure' paradigm to the 'method-of-object' paradigm, to reduce the amount of data passed as parameters and to make the code look more natural.
Fortunately, we can get the advantages of method style over procedure style without moving this logic back into the collection classes. C# provides this through extension methods. The feature is based on using the this keyword as a modifier on the first parameter of a static method (declared in a static class); it lets you call such a method as if it were an instance method of the specified type, without moving the method's code into that type's definition. It helps to separate some logic from a class while still using it with class instances in a natural way. It can also be used to extend the functionality of a class without modifying it at all.
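To make the mechanics concrete, here is a minimal sketch of an extension method. It is written against the standard IEnumerable<T> rather than the project's IMyEnumerable, and the names ExtensionSketch and KeepIf are made up for illustration only:

```csharp
using System;
using System.Collections.Generic;

// Extension methods must live in a static class.
public static class ExtensionSketch
{
    // The `this` modifier on the first parameter lets callers write
    // source.KeepIf(...) instead of ExtensionSketch.KeepIf(source, ...).
    // KeepIf is a hypothetical name, not part of the project.
    public static List<T> KeepIf<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var result = new List<T>();
        foreach (var item in source)
        {
            if (predicate(item))
                result.Add(item);
        }
        return result;
    }
}
```

Both call forms compile to the same thing: `ExtensionSketch.KeepIf(numbers, n => n % 2 == 0)` and `numbers.KeepIf(n => n % 2 == 0)`.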
OK, that helps to avoid repeating the class name. But what about the intermediate results? They can be avoided too, simply by chaining method calls. This is possible because the methods return the same IMyEnumerable type as the type of the instances they can be called on. It looks like this: var finalResult = collection.ToDictionary(...).Select(...).Where(...);
For convenience, the calls in such a chain are usually put on separate lines, each starting with the dot operator:
var finalResult = collection
.ToDictionary(...)
.Select(...)
.Where(...);
This style of method calls, and the interface design that supports it, is called a fluent interface.
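A small sketch of why chaining works: each method below takes and returns the same interface type, so the result of one call is a valid receiver for the next. The standard IEnumerable<T> stands in for IMyEnumerable here, and Filter and Map are hypothetical names chosen for this example:

```csharp
using System;
using System.Collections.Generic;

public static class FluentSketch
{
    // Returns the same interface type it extends, so calls can be chained.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var result = new List<T>();
        foreach (var item in source)
        {
            if (predicate(item))
                result.Add(item);
        }
        return result;
    }

    // Transforms every item; again the return type is IEnumerable<...>.
    public static IEnumerable<TResult> Map<T, TResult>(this IEnumerable<T> source, Func<T, TResult> selector)
    {
        var result = new List<TResult>();
        foreach (var item in source)
            result.Add(selector(item));
        return result;
    }
}
```

With these in place, `new[] { 1, 2, 3, 4, 5, 6 }.Filter(n => n % 2 == 0).Map(n => n * 10)` needs no named intermediate variables.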
It also might be interesting to convert this code
int count = 0;
for (int i = 0; i < arg.Value.Length; i++) {
if (arg.Value [i] == '1')
count++;
}
return count;
into
return arg.Value
.AsEnumerable()
.Where(ch => ch == '1')
.Aggregate(0, (accumulator, item) => accumulator + 1);
or even
return arg.Value
.AsEnumerable()
.Count(ch => ch == '1');
Try to understand the semantics of both new methods, implement them, and use them in tests.
It is good that you tried to avoid using the MyList collection inside the extension method:
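For reference while implementing your own versions, here is how the three variants look with the standard System.Linq methods (not the project's IMyEnumerable ones). In standard LINQ the function passed to Aggregate receives the accumulator first and the current item second, and returns the new accumulator value. The class and method names are illustrative only:

```csharp
using System;
using System.Linq;

public static class OnesCounter
{
    // Plain loop, as in the original code.
    public static int WithLoop(string bits)
    {
        int count = 0;
        for (int i = 0; i < bits.Length; i++)
        {
            if (bits[i] == '1')
                count++;
        }
        return count;
    }

    // Filter first, then fold: start from 0 and add 1 for every remaining item.
    public static int WithAggregate(string bits)
    {
        return bits
            .Where(ch => ch == '1')
            .Aggregate(0, (accumulator, item) => accumulator + 1);
    }

    // Count with a predicate does the filtering and counting in one step.
    public static int WithCount(string bits)
    {
        return bits.Count(ch => ch == '1');
    }
}
```

All three return the same number for the same input, which makes them easy to cross-check in a test.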
Type collectionType = collection.GetType ();
var result = Activator.CreateInstance (collectionType);
var enumerator = collection.Enumerator;
while (enumerator.HasNext) {
enumerator.Next ();
if (predicate (enumerator.Current))
((IMyEnumerable<T>) result).Add (enumerator.Current);
}
return (IMyEnumerable<T>) result;
/*var enumerator = collection.Enumerator;
MyList <T> result = new MyList<T> ();
while (enumerator.HasNext) {
enumerator.Next ();
if (predicate (enumerator.Current))
result.Add (enumerator.Current);
}
return (IMyEnumerable <T>) result;*/
But even better than storing the result items in a collection of the same type is to avoid storing anything in extra collections at all, because that storage can double memory usage, which may become a real problem for large collections. Actually, we do not need the result values to be stored anywhere at this step; we only need them to be enumerable on demand. We can get that by creating a custom wrapper class for each extension method. These wrapper classes implement the IMyEnumerable interface and return a special enumerator class, which holds a reference to the source collection and processes the next item when the Next() method is called. I.e. the next result item is generated only when the calling code requests it (e.g. when iterating over our results in foreach).
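One possible shape of such a wrapper for Where is sketched below. The interfaces are a guessed, minimal version of the project's IMyEnumerable and enumerator (only Enumerator, HasNext, Next() and Current, as used in the snippet above); ArraySource, WhereWrapper, WhereEnumerator and LazyWhere are hypothetical names for this example:

```csharp
using System;

// Assumed minimal interfaces; the real project interfaces may differ.
public interface IMyEnumerator<T>
{
    bool HasNext { get; }
    void Next();
    T Current { get; }
}

public interface IMyEnumerable<T>
{
    IMyEnumerator<T> Enumerator { get; }
}

// Tiny array-backed source, only here to make the sketch self-contained.
public sealed class ArraySource<T> : IMyEnumerable<T>
{
    private readonly T[] items;
    public ArraySource(params T[] items) { this.items = items; }
    public IMyEnumerator<T> Enumerator { get { return new ArrayEnumerator(items); } }

    private sealed class ArrayEnumerator : IMyEnumerator<T>
    {
        private readonly T[] items;
        private int index = -1;
        public ArrayEnumerator(T[] items) { this.items = items; }
        public bool HasNext { get { return index + 1 < items.Length; } }
        public void Next() { index++; }
        public T Current { get { return items[index]; } }
    }
}

// The wrapper stores no items, only the source and the predicate.
public sealed class WhereWrapper<T> : IMyEnumerable<T>
{
    private readonly IMyEnumerable<T> source;
    private readonly Func<T, bool> predicate;
    public WhereWrapper(IMyEnumerable<T> source, Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }
    public IMyEnumerator<T> Enumerator
    {
        get { return new WhereEnumerator<T>(source.Enumerator, predicate); }
    }
}

public sealed class WhereEnumerator<T> : IMyEnumerator<T>
{
    private readonly IMyEnumerator<T> source;
    private readonly Func<T, bool> predicate;
    private T buffered;        // next matching item, found ahead of time
    private bool hasBuffered;
    private T current;

    public WhereEnumerator(IMyEnumerator<T> source, Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }

    // Pulls from the source only until the next match is found.
    public bool HasNext
    {
        get
        {
            while (!hasBuffered && source.HasNext)
            {
                source.Next();
                if (predicate(source.Current))
                {
                    buffered = source.Current;
                    hasBuffered = true;
                }
            }
            return hasBuffered;
        }
    }

    public void Next()
    {
        if (!HasNext) throw new InvalidOperationException("No more items.");
        current = buffered;
        hasBuffered = false;
    }

    public T Current { get { return current; } }
}

public static class LazyWhereSketch
{
    public static IMyEnumerable<T> LazyWhere<T>(this IMyEnumerable<T> source, Func<T, bool> predicate)
    {
        return new WhereWrapper<T>(source, predicate);
    }
}
```

Because HasNext has to know whether another matching item exists, the enumerator looks ahead by one match and buffers it; nothing beyond that is read from the source until the caller asks for more.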
Well, now we've got a set of query methods for collections that can be chained into a data-processing pipeline in fluent style, and all the processing is performed lazily: on demand, when results are requested. If we need only the first five elements of the result sequence, only those five elements will be calculated, even if the sequence is infinite. Instead of just building the result from the source collection and delivering it to the consumer, we create a result generator and can use it in whatever context we need it. It is a higher level of abstraction, and sometimes it is useful.
But so far we've got these lazy generators only for the Where() and Select() methods. Should we rewrite the other methods in the same manner? For methods that return non-IMyEnumerable results (like ToArray() or Count()) there is no sense in doing that. But if we add more methods that transform one IMyEnumerable into another IMyEnumerable (e.g. SkipElements(int count) or OrderAscending(Func<T, TKey> orderingKeySelector)), we will have to write special wrappers and enumerators again. That's not easy, and it makes the implementation code look more complicated.
Fortunately, C# supports a special statement: yield. Using it you can write your method as a simple loop, without extra code, and all the enumerator classes will be generated behind the scenes by the compiler. But there is a constraint: the method's return type must be IEnumerable (or IEnumerator, or one of their generic versions). For us this means that we should replace all occurrences of our IMyEnumerable with this interface. It's not a big loss: the standard interface has essentially the same semantics; we just avoided it at first to practice with our own interface.
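A minimal sketch of the yield-based style, assuming the switch to the standard IEnumerable<T> has been made. SkipElements follows the name suggested above; WhereYield, NaturalNumbers and TakeFirst are hypothetical helpers added only to show that nothing is computed until it is requested:

```csharp
using System;
using System.Collections.Generic;

public static class YieldSketch
{
    // The compiler generates the enumerator class; the body runs step by step,
    // pausing at each yield return until the caller asks for the next item.
    public static IEnumerable<T> WhereYield<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    public static IEnumerable<T> SkipElements<T>(this IEnumerable<T> source, int count)
    {
        int skipped = 0;
        foreach (var item in source)
        {
            if (skipped < count)
            {
                skipped++;
                continue;
            }
            yield return item;
        }
    }

    // An infinite sequence: 0, 1, 2, ... It is safe only because it is lazy.
    public static IEnumerable<int> NaturalNumbers()
    {
        int n = 0;
        while (true)
            yield return n++;
    }

    // Eagerly collects at most `count` items and then stops pulling from the source.
    public static List<T> TakeFirst<T>(this IEnumerable<T> source, int count)
    {
        var result = new List<T>();
        if (count <= 0)
            return result;
        foreach (var item in source)
        {
            result.Add(item);
            if (result.Count == count)
                break;
        }
        return result;
    }
}
```

For example, `YieldSketch.NaturalNumbers().WhereYield(n => n % 2 == 0).SkipElements(1).TakeFirst(3)` terminates even though the source never ends, because only as many source items are produced as the three requested results require.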
You can merge the branch as soon as you are ready.
To avoid writing the same query methods in each collection type, create a class MyEnumerableExtension that contains the following methods:
Cover the methods with tests that demonstrate different cases of their usage.
UPD: all methods should be static