fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License
941 stars 197 forks source link

appendN is buggy #139

Closed adamklein closed 10 years ago

adamklein commented 10 years ago

@tpetricek @hmansell

There is a bug in createVectorListDispatcher (which was created for appendN functionality), and I’m struggling to fix it. I believe I need to build a Linq expression to dynamically convert an IVector list to an IVector<T> list.

You can see the problem by running the test AppendN works on non-primitives in my branch https://github.com/BlueMountainCapital/Deedle/tree/ak.fixAppendN

I had used the function createVectorDispatcher from VectorHelper.fs as a template, and I tried to follow the pattern and simply add use of list type constructor, but it yields the error message:

System.InvalidOperationException : No coercion operator is defined between types 'Microsoft.FSharp.Collections.FSharpList`1[Deedle.IVector]' and 'Microsoft.FSharp.Collections.FSharpList`1[Deedle.IVector`1[System.Decimal]]'.

The problem is Expression.Convert. What I believe I need to do is construct a different linq expression to map the dynamically generated IVector -> IVector<’T> conversion expression over the IVector list argument, to yield a IVector<’T> list. Any ideas how to do this?

This is critical for unnest to work better, as right now it appends all the nested frames via a pairwise reduce – as right now, it’s O(N^2).

adamklein commented 10 years ago

After some blood, sweat, tears - and looking into internal representation of stuff - I think this is working! I'm going to create a PR (#140) for review.

adamklein commented 10 years ago

For a benchmark - it used to take 1m7s to unnest 173 frames with 35 columns to 7274 rows, now takes closer to 35s. This is still abominable... need to get this faster!

tpetricek commented 10 years ago

The stuff with dispatchers is nasty :-( I feel guilty for adding that to the codebase! I think there might be a better way of dealing with it - but did not have much time to look into this yet. I'll see if I can have a look when 1) I get back home 2) I get my laptop fixed (currently does not work much..) Though that might not really help with the fancier dispatchers like the one needed here.