Closed NightOwl888 closed 2 years ago
I took a look at this and how changing Lucene.Net.Search.FieldDoc to be generic would affect other types, and have come up with a possible way we can eliminate boxing/unboxing, but it does require some breaking changes across public APIs.
FieldDoc // Make FieldDoc<T>
FieldDoc.Fields // Change from object[] to T[]
IndexSearcher.SearchAfter(ScoreDoc after, Query query, int n, Sort sort) // Change ScoreDoc to FieldDoc<T>
IndexSearcher.SearchAfter(ScoreDoc after, Query query, Filter filter, int n, Sort sort) // Change ScoreDoc to FieldDoc<T>
IndexSearcher.SearchAfter(ScoreDoc after, Query query, Filter filter, int n, Sort sort, bool doDocScores, bool doMaxScore) // Change ScoreDoc to FieldDoc<T>
IndexSearcher.Search(Weight weight, FieldDoc after, int nDocs, Sort sort, bool fillFields, bool doDocScores, bool doMaxScore) // Change FieldDoc to FieldDoc<T>
TopFieldCollector.Create(Sort sort, int numHits, FieldDoc after, bool fillFields, bool trackDocScores, bool trackMaxScore, bool docsScoredInOrder) // Change FieldDoc to FieldDoc<T>
TopFieldCollector.PagingFieldCollector // Change to PagingFieldCollector<T> (this class is private)
TopFieldCollector.PagingFieldCollector(FieldValueHitQueue<Entry> queue, FieldDoc after, int numHits, bool fillFields, bool trackDocScores, bool trackMaxScore) // ctor: Change FieldDoc to FieldDoc<T>
FieldComparer.SetTopValue(object value) // Change to SetTopValue<TValue>(TValue value)
FieldComparer<T>.SetTopValue(object value) // Change to SetTopValue(T value) and overload it with SetTopValue<TValue>(TValue value).
ToParentBlockJoinFieldComparer : FieldComparer<object> // Rely on SetTopValue<T>(T value) rather than SetTopValue(object value)? The class cannot be made generic easily due to its usage in ToParentBlockJoinSortField.GetComparer(int numHits, int sortPos).
NOTE: There are also several subclasses of FieldComparer<T> that can remove the casting, especially in cases where the cast is currently from object to a numeric type.
FieldComparer<T> was only generic in Lucene, but had to be widened in C# because there is no common type between different closing types in C# (FieldComparer<long> is a different type than FieldComparer<float> and cannot be put into a strongly typed array together). So, another abstraction was added, FieldComparer, to be able to pass the type without knowing the closing type.
The FieldComparer<T> class would define a default implementation of SetTopValue<TValue>(TValue value) by casting from TValue to T using an intermediate cast to object, but can be overridden to eliminate boxing in subclasses where boxing applies.
    public abstract void SetTopValue(T value);

    public override void SetTopValue<TValue>(TValue value) // Added for .NET to eliminate boxing
    {
        // Default: cast TValue to T through object; subclasses can override to avoid the box.
        SetTopValue((T)(object)value);
    }
This is a little weird because the base type uses a generic method and the subclass is a generic closing type, but given the leaky nature of generics and the fact that some types don't have a reasonable way to pass the generic without also becoming generic this seems like a reasonable compromise. I am open to suggestions if someone can come up with something that seems more natural than this.
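To make the shape of this compromise easier to discuss, here is a minimal, self-contained sketch of the hierarchy. The type names (FieldComparerSketch, SingleComparerSketch) and the TopValue property are hypothetical stand-ins, not the actual Lucene.NET types, and the real classes carry many more members.

```csharp
using System;

// Non-generic base (hypothetical name) so comparers with different T
// can be held in one strongly typed array.
public abstract class FieldComparerSketch
{
    // Generic method: callers pass a value without knowing the closing type.
    public abstract void SetTopValue<TValue>(TValue value);
}

public abstract class FieldComparerSketch<T> : FieldComparerSketch
{
    public abstract void SetTopValue(T value);

    // Default implementation: cast TValue to T through object.
    // Throws InvalidCastException if TValue is not compatible with T.
    public override void SetTopValue<TValue>(TValue value)
    {
        SetTopValue((T)(object)value);
    }
}

// Hypothetical float comparer; only the top-value plumbing is shown.
public sealed class SingleComparerSketch : FieldComparerSketch<float>
{
    public float TopValue { get; private set; }

    public override void SetTopValue(float value)
    {
        TopValue = value;
    }
}
```

A caller that only holds the non-generic base can write `comparer.SetTopValue(1.5f)` and the value is routed to the typed overload. A subclass that wants to avoid the intermediate cast to object would override the generic method as well.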
NOTE: FieldComparer.CompareValues(object first, object second) also casts the generic type in the generic overload (the object overload was added in .NET because there is no common type between generic closing types in C#). Perhaps it would be better to make it FieldComparer.CompareValues<TValue>(TValue first, TValue second) on the base class as well. This leaks into Lucene.Net.Grouping types: ISearchGroup.SortValues and MergedGroup.TopValues are also declared as type object[] and may contain numeric types. I am mentioning this here because it is another method using the same generic type on FieldComparer.
A potential problem with the above approach is that it is difficult to cast TValue to a value type, such as float. There are some potential solutions we could use here: https://stackoverflow.com/questions/3343551/how-to-cast-a-value-of-generic-type-t-to-double-without-boxing. Since the whole point of these changes is to eliminate boxing, it is probably worth sticking one of those solutions (whichever benchmarks fastest) in the Support/Util folder so we can use it throughout the solution for all of the subclasses of FieldComparer<T>.
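As one possible shape for such a helper, here is a minimal sketch. The class name GenericCast is hypothetical, and the sketch assumes System.Runtime.CompilerServices.Unsafe is available (built into recent .NET versions, a NuGet package on older targets). It is only one candidate and has not been benchmarked against the other options.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, not an existing Lucene.NET type.
public static class GenericCast
{
    public static TTo Cast<TFrom, TTo>(TFrom value)
    {
        // When both type parameters are the same type, reinterpret the
        // value in place; no box is allocated.
        if (typeof(TFrom) == typeof(TTo))
        {
            return Unsafe.As<TFrom, TTo>(ref value);
        }

        // Fallback: cast through object. This boxes value types and throws
        // InvalidCastException when the types are not compatible.
        return (TTo)(object)value;
    }
}
```

With something like this, an override of SetTopValue<TValue>(TValue value) in a float comparer could call `GenericCast.Cast<TValue, float>(value)` instead of casting through object.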
So, @rclabo and I have been analyzing this in more detail and it is clear that FieldDoc.Fields cannot be made generic because it contains a mixed bag of field types.
As an alternative, some design proposals to FieldComparer<T> have been benchmarked in BenchmarkFieldComparer to see what the impact of such changes might be.
Clearly, allocations used in the FieldValueHitQueue<T>.FillFields() method and when moving the numeric data out of the fields can be dramatically reduced.
The leading contender, BoxingWrappedReferenceScenario, takes us a step back toward the Lucene design and, although the details are not worked out, here are some of the basics:
FieldComparer<T> gets a generic constraint to disallow value types.
NumericComparer<T> is changed to NumericComparer<TValue, TWrapper>. TWrapper is the type that is used in its base class FieldComparer<T> and TValue is the value type that is exposed on the public API.
[…] J2N.
[…] IConvertible and IComparable<T>.
[…] TWrapper type in NumericComparer<TValue, TWrapper>.
[…] Field to the FillFields() method without having to unwrap and rewrap the numeric value in a reference type.
Of course, none of this is set in stone, but it is clear that fixing this problem will likely involve breaking some public APIs at least a little, so I am moving this to the 4.8.0-beta00015 milestone.
The Lucene.Net.Search.FieldDoc.Fields property returns object[]. However, according to the documentation, it is designed to be used with string, int, or float.
In Java, this was not an issue because there are numeric reference types that primitive numbers are wrapped in when passed around. But, since it would not be a good user experience to require wrapper classes for numeric types in .NET, we should solve the boxing issue another way.
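For anyone less familiar with the cost being discussed, here is a small illustration of what storing numeric values in an object[] implies in .NET. The BoxingDemo class and its method are hypothetical and only mimic the shape of a Fields array; they are not Lucene.NET code.

```csharp
using System;

// Hypothetical demo type, not part of Lucene.NET.
public static class BoxingDemo
{
    // Reads a float out of a mixed object[] such as { "title", 42, 1.5f }.
    // Placing 42 and 1.5f in an object[] boxes each one into its own
    // heap-allocated object.
    public static float ReadSingle(object[] fields, int index)
    {
        // Unboxing needs a cast to the exact boxed type. If the element is
        // a boxed int (or anything other than float), this throws
        // InvalidCastException.
        return (float)fields[index];
    }
}
```

Each numeric element costs an allocation when written and a type-checked unbox when read, which is the overhead the proposals above try to remove.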