dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License
8.94k stars 1.86k forks source link

Improve performance of column cloning inside DataFrame arithmetics #6814

Closed asmirnov82 closed 9 months ago

asmirnov82 commented 10 months ago

The goal of this PR is to perform Arithmetics operation on columns with the same underlying data type approximately 3 times faster.

Detail of changes:

1) Fix PrimitiveColumnContainer Clone() method to use memory block coping for internal buffer instead of appending values one by one (with memory reallocation on each buffer resizing cycle). Do similar changes for CloneNullBitMapBuffers() method

2) Improve BinaryOperation.Implementation methods for all Arithmetic operations that happen not in place (default behavior). Before the change autogenerated code looked like this:

public partial class SingleDataFrameColumn
{
    internal SingleDataFrameColumn AddImplementation(SingleDataFrameColumn column, bool inPlace = false)
    {
        if (column.Length != Length)
        {
            throw new ArgumentException(Strings.MismatchedColumnLengths, nameof(column));
        }
        SingleDataFrameColumn newColumn = inPlace ? this : CloneAsSingleColumn();
        newColumn.ColumnContainer.Add(column.ColumnContainer);
        return newColumn;
    }
}

After PR https://github.com/dotnet/machinelearning/pull/6677 CloneAsSingleColumn can be changed to just this.Clone(). This allow to avoid unnecessary type conversion, that happens inside CloneAs... method and use fast Clone() method with bulk memory copy for internal buffers. For example. for Single:

internal PrimitiveColumnContainer<float> CloneAsFloatContainer()
{
 ...
    for (int i = 0; i < span.Length; i++)
   {
          newBuffer.Append(SingleConverter<T>.Instance.GetSingle(span[i]);
   }
}

3) Fix DataFrameBuffer constructor. DataFrameBuffer overrides parent ReadOnlyDataFrameBuffer ReadOnlyBuffer to return own new field _memory instead of parent _readOnlyBuffer (after this parent _readonlybuffer is ignored and never used). However in constructor _memory is not created, instead base constructor is called to allocate _readonlybuffer (which is ignored). So after creating Capacity of such buffer is always 0 (ignoring the actual parameter passed to the constructor) and additional memory is allocated

4) After 3 is fixed, changed code to use DataFrameBuffer constructor with capacity instead of creating empty dataframe buffer and than reallocating memory by calling EnsureCapacity


Result:

Simple tests for 1 million of rows:

[GlobalSetup]
public void SetUp()
{
    var values = Enumerable.Range(0, ItemsCount);
    _column1 = new Int32DataFrameColumn("Column1", values);
    _column2 = new Int32DataFrameColumn("Column2", values);
}

[Benchmark]
public void Sum()
{
    var column = _column1 + _column2;
}
Before PR:
| Method |    Mean |     Error |   StdDev |
|    Sum | 18.02 ms | 0.090 ms | 0.080 ms |

After PR:
| Method |     Mean |     Error |    StdDev |
|    Sum | 6.896 ms | 0.1363 ms | 0.1673 ms |

Part of #6824 issue

codecov[bot] commented 10 months ago

Codecov Report

Merging #6814 (1eb8a35) into main (d692751) will increase coverage by 0.01%. Report is 2 commits behind head on main. The diff coverage is 57.29%.

@@            Coverage Diff             @@
##             main    #6814      +/-   ##
==========================================
+ Coverage   68.99%   69.01%   +0.01%     
==========================================
  Files        1237     1237              
  Lines      253558   253556       -2     
  Branches    26542    26540       -2     
==========================================
+ Hits       174944   174984      +40     
+ Misses      71663    71616      -47     
- Partials     6951     6956       +5     
Flag Coverage Δ
Debug 69.01% <57.29%> (+0.01%) :arrow_up:
production 63.57% <48.75%> (+0.01%) :arrow_up:
test 88.86% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...rosoft.Data.Analysis/ArrowStringDataFrameColumn.cs 63.54% <100.00%> (ø)
...lysis/PrimitiveDataFrameColumn.BinaryOperations.cs 42.50% <100.00%> (+0.50%) :arrow_up:
test/Microsoft.Data.Analysis.Tests/BufferTests.cs 100.00% <100.00%> (ø)
...st/Microsoft.Data.Analysis.Tests/DataFrameTests.cs 99.31% <100.00%> (+<0.01%) :arrow_up:
...est/Microsoft.ML.Fairlearn.Tests/GridSearchTest.cs 100.00% <100.00%> (ø)
...icrosoft.Data.Analysis/PrimitiveColumnContainer.cs 86.13% <94.73%> (-0.41%) :arrow_down:
...Microsoft.Data.Analysis/ReadOnlyDataFrameBuffer.cs 48.71% <66.66%> (+1.34%) :arrow_up:
src/Microsoft.Data.Analysis/DataFrameBuffer.cs 85.18% <80.00%> (+2.20%) :arrow_up:
...eColumn.BinaryOperationImplementations.Exploded.cs 52.27% <0.00%> (ø)

... and 12 files with indirect coverage changes

asmirnov82 commented 10 months ago

Using Benchmarks from #6826

image

asmirnov82 commented 10 months ago

@JakeRadMSFT could you please take a look