Open leandromoh opened 6 years ago
TraverseDepthFirst can help with this to a large extent. For example, your recursive types example can be written like this:
var example1 =
    from c in categories
    where c.Parent == 0
    select c into root
    from e in
        MoreEnumerable.TraverseDepthFirst(
            (Category: root, Depth: 0),
            p => from c in categories
                 where c.Parent == p.Category.Id
                 select (Category: c, Depth: p.Depth + 1))
    select new string('-', e.Depth * 2) + e.Category.Name;

foreach (var e in example1)
    Console.WriteLine(e);
And the non-recursive one like so:
var example2 =
    from f in fligths
    where f.Departure == "A"
    select f into a
    from e in
        MoreEnumerable.TraverseDepthFirst(
            (Flight: a, Depth: 0),
            p => from f in fligths
                 where f.Departure == p.Flight.Arrival
                 select (Flight: f, Depth: p.Depth + 1))
    select new string('-', e.Depth * 2) + e.Flight.Departure;

foreach (var e in example2)
    Console.WriteLine(e);
So I'd argue that TraverseDepthFirst is a workaround, though I recognise that I'm cheating here somewhat by only looking at the printed output of each example (which is a flat list of lines again).
To go from a flat list back to hierarchy will require more thinking and work. The signature:
static IEnumerable<object> UnFlatten<T, TResult>(
    this IEnumerable<T> source,
    Func<T, bool> rootSelector,
    Func<T, IEnumerable<T>> childrenSelector,
    Func<T, IEnumerable<object>, TResult> resultSelector)
has several problems:
- rootSelector is really a predicate so should be called rootPredicate but this is just a minor naming issue.
- The childrenSelector predicate says that T should already know about its children but it doesn't. In your examples you re-traverse the source sequence but since it's not available to the function, you rely on lambda closures to provide it as context. The signature should really be Func<T, IEnumerable<T>, IEnumerable<T>>, although this requires a full scan on each call to find the children and so will be inefficient.
- Why return a sequence of objects?
- Likewise, why is the second argument (tree?) to resultSelector a sequence of objects?
I'd also like to add that this method isn't the opposite of Flatten even if it feels so. While Flatten traverses a nested sequence in depth-first order and lays them out flat, what I see being proposed here is dereferencing of a flat list linked via references into a tree. Correct me if I'm misreading.
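The full-scan concern raised for childrenSelector can be sidestepped by indexing the source once before any child lookup. The following is only a minimal sketch of that idea, not part of the proposal: the Category record, the LookupSketch class and the Render method are hypothetical names, and the sketch assumes that roots have Parent == 0 and that the data contains no cycles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Category type used in the examples above.
record Category(int Id, int Parent, string Name);

static class LookupSketch
{
    // Returns one indented line per category, in depth-first order.
    public static IEnumerable<string> Render(IEnumerable<Category> categories)
    {
        // Single pass over the source: group every category under its parent id.
        var byParent = categories.ToLookup(c => c.Parent);

        // A lookup returns an empty sequence for a missing key,
        // so leaf categories end the recursion naturally.
        IEnumerable<string> Walk(int parentId, int depth) =>
            byParent[parentId].SelectMany(c =>
                Walk(c.Id, depth + 1)
                    .Prepend(new string('-', depth * 2) + c.Name));

        // Assumption: roots are the categories whose Parent is 0.
        return Walk(0, 0);
    }
}
```

With this shape each child lookup is a keyed access into the index rather than a re-traversal of the whole source, at the cost of materializing the index up front.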
I'd also like to add that this method isn't the opposite of Flatten even if it feels so. While Flatten traverses a nested sequence in depth-first order and lays them out flat, what I see being proposed here is dereferencing of a flat list linked via references into a tree. Correct me if I'm misreading.
Dereferencing a flat list linked via references into a tree is just one specific usage of this operator. Here is an example that looks more like Unflatten.
var numbers = MoreEnumerable.Sequence(2, 5, 2)
                            .Select(x => (decimal) x);

IEnumerable result =
    numbers.UnFlatten(_ => true,
        n =>
            n.ToString().Length == 1 || n.ToString().Split(',')[1].Length < 3
                ? new[] { n - (n / 2), n + (n / 2) }
                : Enumerable.Empty<decimal>()
        ,
        (parent, children) => children.Prepend(parent));

PrintObject(result, 0);

static void PrintObject(IEnumerable source, int deep)
{
    foreach (object i in source)
    {
        if (i is IEnumerable)
            PrintObject((IEnumerable) i, deep + 1);
        else
            Console.WriteLine(new string('-', deep * 2) + i);
    }
}
outputs
--2
----1
------0,5
--------0,25
----------0,125
----------0,375
--------0,75
----------0,375
----------1,125
------1,5
--------0,75
----------0,375
----------1,125
--------2,25
----------1,125
----------3,375
----3
------1,5
--------0,75
----------0,375
----------1,125
--------2,25
----------1,125
----------3,375
------4,5
--------2,25
----------1,125
----------3,375
--------6,75
----------3,375
----------10,125
--4
----2
------1
--------0,5
----------0,25
------------0,125
------------0,375
----------0,75
------------0,375
------------1,125
--------1,5
----------0,75
------------0,375
------------1,125
----------2,25
------------1,125
------------3,375
------3
--------1,5
----------0,75
------------0,375
------------1,125
----------2,25
------------1,125
------------3,375
--------4,5
----------2,25
------------1,125
------------3,375
----------6,75
------------3,375
------------10,125
----6
------3
--------1,5
----------0,75
------------0,375
------------1,125
----------2,25
------------1,125
------------3,375
--------4,5
----------2,25
------------1,125
------------3,375
----------6,75
------------3,375
------------10,125
------9
--------4,5
----------2,25
------------1,125
------------3,375
----------6,75
------------3,375
------------10,125
--------13,5
----------6,75
------------3,375
------------10,125
----------20,25
------------10,125
------------30,375
To go from a flat list back to hierarchy will require more thinking and work. The signature has several problems:
- rootSelector is really a predicate so should be called rootPredicate but this is just a minor naming issue.
- The childrenSelector predicate says that T should already know about its children but it doesn't. In your examples you re-traverse the source sequence but since it's not available to the function, you rely on lambda closures to provide it as context. The signature should really be Func<T, IEnumerable<T>, IEnumerable<T>>, although this requires a full scan on each call to find the children and so will be inefficient.
- Why return a sequence of objects?
- Likewise, why is the second argument (tree?) to resultSelector a sequence of objects?
point 1: okay, makes sense.
point 2: I see. There are cases where T does not know about its children (as you pointed out) and cases where it does (as in my example with numbers, where we generate the children). In any case, the user can materialize the source before calling Unflatten to avoid iterating it many times if that is considered expensive.
point 3: I tried
static IEnumerable<TResult> UnFlatten<T, TResult>(
    this IEnumerable<T> source,
    Func<T, bool> rootSelector,
    Func<T, IEnumerable<T>> childrenSelector,
    Func<T, IEnumerable<TResult>, TResult> resultSelector)
but except when the recursion is inside the type (like Category), TResult will become recursive. In my example with non-recursive types, if I had used a class named Flight instead of an anonymous type, the type signature would be:
IEnumerable<Tuple<Flight, IEnumerable<Tuple<Flight, IEnumerable<Tuple<Flight, ...>>>>>>
and is therefore unwritable. Just as Flatten can handle nested sequences of arbitrary depth, so can Unflatten. Also, the compiler cannot infer the TResult type.
point 4: this parameter has the same type as the return type of Unflatten, since the operator is recursive.
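One possible way around the recursive TResult described in point 3 is to put the recursion into a small named node type, so the result type stays writable even when T itself is not recursive. This is only a sketch under assumptions: Node<T>, NodeSketch and UnflattenToNodes are hypothetical names that do not exist in MoreLINQ, and the sketch drops resultSelector entirely by always building nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type: the recursion lives here instead of in nested tuples.
sealed record Node<T>(T Value, IReadOnlyList<Node<T>> Children);

static class NodeSketch
{
    // Hypothetical variant of the proposed operator that always returns Node<T>.
    public static IEnumerable<Node<T>> UnflattenToNodes<T>(
        this IEnumerable<T> source,
        Func<T, bool> rootPredicate,
        Func<T, IEnumerable<T>> childrenSelector)
    {
        // Builds the node for one item and, recursively, for all its descendants.
        // Assumes childrenSelector never leads back to an ancestor (no cycles).
        Node<T> Build(T item) =>
            new Node<T>(item, childrenSelector(item).Select(c => Build(c)).ToList());

        return source.Where(rootPredicate).Select(r => Build(r));
    }
}
```

With a Flight type as T, the result would simply be IEnumerable<Node<Flight>>, and the compiler has no TResult left to infer. The trade-off is that callers can no longer shape the result themselves.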
Can I go ahead with a PR considering points 1 and 2?
I am afraid we have not fleshed this out enough. I still have to come back to you with some comments on your last reply.
Here is my Unflatten for a 2-level hierarchy.
public static IEnumerable<(T? Key, T[] Children)> Unflatten<T>(this IEnumerable<T> source, Func<T, bool> predicate) {
    var key = default( T );
    var hasKey = false;
    var children = new List<T>();
    foreach (var item in source) {
        if (!predicate( item )) {
            children.Add( item );
        } else {
            if (hasKey || children.Any()) yield return (key, children.ToArray());
            key = item;
            hasKey = true;
            children.Clear();
        }
    }
    if (hasKey || children.Any()) yield return (key, children.ToArray());
}
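To make the behaviour of this 2-level version concrete, here is a small usage sketch. It repeats the method so the block compiles by itself; the TwoLevelSketch class, the Demo method and the sample data (items starting with "#" act as keys) are hypothetical and only serve as illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

static class TwoLevelSketch
{
    // Copy of the operator above: items matching the predicate start a new group,
    // all following non-matching items become that group's children.
    public static IEnumerable<(T? Key, T[] Children)> Unflatten<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var key = default(T);
        var hasKey = false;
        var children = new List<T>();
        foreach (var item in source)
        {
            if (!predicate(item))
            {
                children.Add(item);
            }
            else
            {
                // Emit the previous group before starting a new one.
                if (hasKey || children.Any()) yield return (key, children.ToArray());
                key = item;
                hasKey = true;
                children.Clear();
            }
        }
        // Emit the last group, if any.
        if (hasKey || children.Any()) yield return (key, children.ToArray());
    }

    // Hypothetical demo: formats each group as "key: child, child".
    public static List<string> Demo()
    {
        var items = new[] { "#fruit", "apple", "pear", "#veg", "leek" };
        return items
            .Unflatten(s => s.StartsWith("#", StringComparison.Ordinal))
            .Select(g => $"{g.Key}: {string.Join(", ", g.Children)}")
            .ToList();
    }
}
```

Demo() yields two groups, "#fruit: apple, pear" and "#veg: leek". Items that appear before the first key are emitted as a group whose Key is default(T).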
The Flatten operator takes a sequence containing arbitrarily nested sequences and returns a flattened one, but sometimes we need the inverse operation: given a flattened sequence, we need to create branches from the elements. So I propose the UnFlatten operator.
Signature:
Examples:
1) With Recursive Types
outputs:
2) With Non-Recursive Types
outputs:
Workaround
None.
Prototype
I would like to submit a PR for this. What do you think?