fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License

SampleInto with c# complex object #476

Closed GitHubAre closed 5 years ago

GitHubAre commented 5 years ago

Hello,

I have been reading the documentation back and forth and haven't managed to figure out how to use SampleInto on a complex object. I am trying to compute a coarser ticker from a finer one, e.g. a 4-hour ticker from a 1-hour ticker, by merging the rows that fall within each 4-hour TimeSpan. I have an object that is persisted to the database with these properties:

    public class TimeserieRecord : PersistentObject
    {
        public TimeserieRecord()
        { }

        public int TsTradingPairId { get; set; }

        public TsTradingPair TsTradingPair { get; set; }

        public string TimeFrameName { get; set; }

        public decimal Open { get; set; }

        public decimal High { get; set; }

        public decimal Low { get; set; }

        public decimal Close { get; set; }

        public decimal Volume { get; set; }

        public DateTimeOffset TimeStamp { get; set; }
    }

I have created a Frame like this:

            var frame = Frame
                .FromRecords(dbRecords)
                .IndexRows<DateTimeOffset>(nameof(TimeserieRecord.TimeStamp))
                .SortRowsByKey();

But then I get lost. I have a frame with columns, and each column needs a different aggregation: Sum() on Volume, Min() on Low, First() on Open, and so on.

Help highly appreciated!

zyzhu commented 5 years ago

Code like this should work:

    Console.WriteLine("Sum of Column Volume: {0}", frame.GetColumn<Decimal>("Volume").Sum());
    Console.WriteLine("Min of Column Low: {0}", frame.GetColumn<Decimal>("Low").Min());
    Console.WriteLine("First of Column Open: {0}", frame.GetColumn<Decimal>("Open").FirstValue());

GitHubAre commented 5 years ago

Thanks for the quick response. Unfortunately, this would produce a single row that aggregates the whole Frame. What I actually need is to resample one Frame into a higher (time-wise) ticker, e.g. 1 hour into 4 hours, merging every 4 rows into one and producing a new frame that contains the grouped rows.

I was hoping to use the SampleInto() method.
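
For reference, the aggregation being asked for can be sketched without Deedle, in plain LINQ. This only illustrates the intended result and is not Deedle's API: the `Candle` record and the `CandleResampler` class are hypothetical names, `Candle` is a reduced stand-in for `TimeserieRecord`, and buckets are assumed to align to midnight in each timestamp's own offset. It needs C# 9 or later for records and top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Eight hourly candles starting at midnight (made-up sample values).
var start = new DateTimeOffset(2020, 1, 1, 0, 0, 0, TimeSpan.Zero);
var hourly = Enumerable.Range(0, 8)
    .Select(i => new Candle(start.AddHours(i), 100 + i, 105 + i, 95 + i, 101 + i, 10m))
    .ToList();

var fourHour = CandleResampler.Resample(hourly, 4);
foreach (var c in fourHour)
    Console.WriteLine($"{c.TimeStamp:HH} O={c.Open} H={c.High} L={c.Low} C={c.Close} V={c.Volume}");
// Two rows come out: one for hours 0-3 and one for hours 4-7.

// Hypothetical stand-in for TimeserieRecord, reduced to the OHLCV fields.
public record Candle(DateTimeOffset TimeStamp, decimal Open, decimal High,
                     decimal Low, decimal Close, decimal Volume);

public static class CandleResampler
{
    // Merges candles into buckets of hoursPerBucket hours, aligned to midnight.
    public static List<Candle> Resample(IEnumerable<Candle> candles, int hoursPerBucket)
    {
        return candles
            .OrderBy(c => c.TimeStamp)
            // Bucket key: calendar day plus the index of the bucket within that day.
            .GroupBy(c => (c.TimeStamp.Date, Bucket: c.TimeStamp.Hour / hoursPerBucket))
            .Select(g => new Candle(
                TimeStamp: g.First().TimeStamp,   // first timestamp present in the bucket
                Open: g.First().Open,             // First()
                High: g.Max(c => c.High),         // Max()
                Low: g.Min(c => c.Low),           // Min()
                Close: g.Last().Close,            // Last()
                Volume: g.Sum(c => c.Volume)))    // Sum()
            .ToList();
    }
}
```

The bucket key here is the same (Date, Hour / 4) idea that a Deedle-based grouping would use; only the aggregation step differs.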

zyzhu commented 5 years ago

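The F# version looks like this:

    let frameGroupBy = frame |> Frame.groupRowsByIndex (fun x -> x.Date, x.Hour / 4)
    // Volume: sum within each 4-hour group
    frameGroupBy?Volume |> Stats.levelSum Pair.get1Of2
    // Low: minimum within each 4-hour group
    frameGroupBy?Low |> Series.applyLevel Pair.get1Of2 (fun s -> Seq.min s.Values)
    // Open: first value within each 4-hour group
    frameGroupBy?Open |> Series.applyLevel Pair.get1Of2 Series.firstValue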

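The C# version is quite verbose:

    // Assumes the frame is indexed by DateTime; with the DateTimeOffset index
    // from the question, the key type would need to be DateTimeOffset instead.
    Func<DateTime, Tuple<DateTime, int>> everyFourHours =
        timeStamp => new Tuple<DateTime, int>(timeStamp.Date, timeStamp.Hour / 4);
    var frameGroupBy = frame.GroupRowsByIndex(everyFourHours);
    var columnVolume = frameGroupBy.GetColumn<Decimal>("Volume");
    var columnLow = frameGroupBy.GetColumn<Decimal>("Low");
    // Rows are now keyed by (group key, original timestamp); pick the group key.
    Func<Tuple<Tuple<DateTime, int>, DateTime>, Tuple<DateTime, int>>
        pickFirst = x => x.Item1;
    Console.WriteLine("Sum of Column Volume: {0}", columnVolume.SumLevel(pickFirst));
    // The question asks for the minimum of Low; SumLevel only shows the call pattern here.
    Console.WriteLine("Sum of Column Low: {0}", columnLow.SumLevel(pickFirst));
    Console.ReadLine();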

However, I couldn't find the ApplyLevel function that is supposed to be exposed. Let me dig into it as it might be a bug.

zyzhu commented 5 years ago

Alternatively, you can achieve the result with the Resample function.

F# version:

    let keys =
      frame.RowKeys
      |> Seq.groupBy (fun x -> x.Date, x.Hour / 4)
      |> Seq.map (fun (_, timeStamps) -> timeStamps |> Seq.head)
    let sumVolume =
      frame?Volume |> Series.resampleInto keys Direction.Forward (fun _ s -> Stats.sum s)
    let minLow =
      frame?Low |> Series.resampleInto keys Direction.Forward (fun _ s -> Stats.min s)
    let firstOpen =
      frame?Open |> Series.resampleInto keys Direction.Forward (fun _ s -> Series.firstValue s)

C# version:

    var keys = frame.RowKeys.GroupBy(timeStamp =>
        new Tuple<DateTime, int>(timeStamp.Date, timeStamp.Hour / 4));
    var keyHeads = keys.Select(key => key.First());
    Func<DateTime, Series<DateTime, Decimal>, Decimal> minFunc = (date, series) => series.Min();
    Func<DateTime, Series<DateTime, Decimal>, Double> sumFunc = (date, series) => series.Sum();
    Func<DateTime, Series<DateTime, Decimal>, Decimal> firstFunc = (date, series) => series.FirstValue();
    var sumVolume = frame.GetColumn<Decimal>("Volume").Resample(keyHeads, Direction.Forward, sumFunc);
    var minLow = frame.GetColumn<Decimal>("Low").Resample(keyHeads, Direction.Forward, minFunc);
    var firstOpen = frame.GetColumn<Decimal>("Open").Resample(keyHeads, Direction.Forward, firstFunc);
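
To see which keys the grouping step hands to Resample, here is a small sketch of the key computation on its own, without Deedle. The `BucketHeads` class and the sample timestamps are made up for illustration. The point it shows is that each key is the first timestamp actually present in a bucket, not the aligned 4-hour boundary, so data that starts mid-bucket yields a key such as 22:00 rather than 20:00. As far as I understand Deedle's resampling, Direction.Forward then treats each key as the start of its chunk, but that part is an assumption and is not exercised here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Six hourly timestamps that start mid-bucket: 22:00 on Jan 1 through 03:00 on Jan 2.
var first = new DateTime(2020, 1, 1, 22, 0, 0);
var rowKeys = Enumerable.Range(0, 6).Select(i => first.AddHours(i));

foreach (var head in BucketHeads.Of(rowKeys))
    Console.WriteLine(head.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture));
// Prints:
// 2020-01-01 22:00
// 2020-01-02 00:00

public static class BucketHeads
{
    // Returns the first timestamp found in each (Date, Hour / 4) bucket, in input order.
    public static List<DateTime> Of(IEnumerable<DateTime> timeStamps) =>
        timeStamps
            .GroupBy(t => Tuple.Create(t.Date, t.Hour / 4))
            .Select(g => g.First())
            .ToList();
}
```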