dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Improve Nullable support during dataframe arithmetic operations #6825

Closed asmirnov82 closed 9 months ago

asmirnov82 commented 10 months ago

During arithmetic operations, the dataframe clones the left-side column into the result in order to get a validity bitmap, and then checks the right-side validity bitmap for NULL values.

For example, Multiply clones the column when the inPlace parameter is set to false (the default behavior):

PrimitiveDataFrameColumn<U> newColumn = inPlace ? primitiveColumn : primitiveColumn.Clone();
newColumn._columnContainer.Multiply(column._columnContainer);

and inside the container we check validity for each value:

 for (int i = 0; i < span.Length; i++)
 {
     if (BitmapHelper.IsValid(right.NullBitMapBuffers[b].ReadOnlySpan, i))
     {
         span[i] = (double)(span[i] * otherSpan[i]);
     }
     else
     {
         left[index] = null;
     }

     index++;
 }

The per-element validity check is a very slow operation. It's possible to calculate the raw values first and then compute the validity bitmap with a bitwise AND, one whole byte (8 values) at a time:

//calculate raw values
for (int i = 0; i < span.Length; i++)
{                
    resultSpan[i] =  (double)(span[i] * otherSpan[i]);
}

//Calculate validity (nulls)
resultValidityBitmap = Bitmap.ElementWiseAnd(validityBitmap, otherValidityBitmap);

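To make the idea concrete, here is a minimal self-contained sketch of the two-pass approach. It is an illustration only, not the code from the PR: the class `NullableMultiplySketch`, its method signatures, and the plain `byte[]` bitmap layout (bit i set means value i is valid, least significant bit first) are assumptions, and `ElementWiseAnd` here is a hypothetical stand-in for the helper named above.

```csharp
using System;

// Hypothetical sketch; names and layout are assumptions, not the DataFrame API.
public static class NullableMultiplySketch
{
    // Combine two validity bitmaps with a bitwise AND, one byte (8 values) at a time.
    public static byte[] ElementWiseAnd(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Bitmaps must have the same length.");

        var result = new byte[left.Length];
        for (int i = 0; i < left.Length; i++)
        {
            result[i] = (byte)(left[i] & right[i]);
        }
        return result;
    }

    // Assumed layout: bit (index % 8) of byte (index / 8) is set when the value is valid.
    public static bool IsValid(ReadOnlySpan<byte> bitmap, int index)
        => (bitmap[index / 8] & (1 << (index % 8))) != 0;

    public static (double[] Values, byte[] Validity) Multiply(
        ReadOnlySpan<double> left, ReadOnlySpan<byte> leftValidity,
        ReadOnlySpan<double> right, ReadOnlySpan<byte> rightValidity)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Columns must have the same length.");

        // Pass 1: compute raw values for every slot, with no validity branch in the loop.
        var values = new double[left.Length];
        for (int i = 0; i < left.Length; i++)
        {
            values[i] = left[i] * right[i];
        }

        // Pass 2: a result slot is valid only if both inputs were valid.
        return (values, ElementWiseAnd(leftValidity, rightValidity));
    }
}
```

Note that the first pass also computes values for slots that end up null; those results are simply ignored because the combined bitmap marks them invalid. That is harmless for floating-point multiplication, but an operation that can throw on garbage input (integer division, for instance) would presumably need extra care.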
asmirnov82 commented 10 months ago

It's also possible to get rid of cloning the left side and create a new empty column for the results instead. However, investigation shows that avoiding the clone does not give any dramatic performance improvement. On the other hand, it requires quite a lot of code changes in both PrimitiveDataFrameColumn.BinaryOperations.tt and PrimitiveDataFrameColumn.BinaryOperationImplementations.Exploded.tt (the current implementation of DataFrame provides two different implementations of arithmetic calculations: one for PrimitiveDataFrameColumn and another for its inheritors, such as Int32DataFrameColumn). So I decided to postpone removing the left-side cloning until the template files are simplified and the duplicate implementations are removed.

Here are the results of my experiments (the first column is the speed with only the enhanced nullable handling, the second column is enhanced nullable handling plus avoiding cloning): image

asmirnov82 commented 9 months ago

Final results after the PR was implemented:

image