Closed GoogleCodeExporter closed 8 years ago
2nd improvement:
In thrust\detail\device\cuda\block\set_intersection.inl:
RandomAccessIterator4 set_intersection(RandomAccessIterator1 first1,
RandomAccessIterator1 last1,
RandomAccessIterator2 first2,
RandomAccessIterator2 last2,
RandomAccessIterator3 temporary,
RandomAccessIterator4 result,
StrictWeakOrdering comp):
Replace:
// count the number of previous occurrances of x the first range
difference1 rank = x - thrust::detail::device::generic::scalar::lower_bound(first1,x,dereference(x),comp);
// count the number of equivalent elements of x in the second range
thrust::pair<RandomAccessIterator2,RandomAccessIterator2> matches =
thrust::detail::device::generic::scalar::equal_range(first2,last2,dereference(x),comp);
difference2 num_matches = matches.second - matches.first;
// the element needs output if its rank is less than the number of matches
needs_output = rank < num_matches;
with:
// count the number of previous occurrances of x the first range
difference1 rank = x - thrust::detail::device::generic::scalar::lower_bound(first1,x,dereference(x),comp);
// count the number of equivalent elements of x in the second range
// unpack equal_range because we need less information than it gives us
RandomAccessIterator1 lb = thrust::detail::device::generic::scalar::lower_bound(first2,last2,dereference(x),comp);
// only need to know if (lb - ub) is larger than rank, not exactly how many elements of dereference(x) there are. This should often -
// if dereference(x) is not repeated and thus rank == 1 - gives us a much smaller search range
RandomAccessIterator1 ub = thrust::detail::device::generic::scalar::upper_bound(lb,min(lb+rank+1,last2),dereference(x),comp);
difference2 num_matches = ub - lb;
// the element needs output if its rank is less than the number of matches
needs_output = rank < num_matches;
saves you a binary search in the common case.
dt
Original comment by jaredhoberock
on 15 Jan 2011 at 12:39
This issue was closed by revision 0b8588e91d.
Original comment by jaredhoberock
on 8 Feb 2011 at 5:04
Original issue reported on code.google.com by
jaredhoberock
on 13 Jan 2011 at 11:29