Open winterland1989 opened 8 years ago
Since we have to construct a new ByteString to foldCase the original, we can't avoid asking for pinned memory.
What we could do is add an instance FoldCase ShortByteString. Care to write a PR?
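For illustration, the body of such an instance could look roughly like the sketch below. It is written as a standalone function so it compiles without the case-insensitive package. The names `foldCaseShort` and `toLowerAscii` are hypothetical, and folding only ASCII letters is an assumption made for brevity; the library's own ByteString folding may cover more bytes. Going through a list of bytes keeps the sketch simple and avoids allocating an intermediate pinned ByteString.

```haskell
import qualified Data.ByteString.Short as SBS
import Data.Word (Word8)

-- Hypothetical helper: lower-case the ASCII letters 'A'..'Z' and
-- leave every other byte unchanged (an assumption for this sketch).
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- Hypothetical stand-in for the method body of
-- `instance FoldCase ShortByteString`. It unpacks to a list of bytes,
-- folds each byte, and packs the result into a new ShortByteString.
foldCaseShort :: SBS.ShortByteString -> SBS.ShortByteString
foldCaseShort = SBS.pack . map toLowerAscii . SBS.unpack
```

With that in place, the instance itself would presumably be a one-liner, `foldCase = foldCaseShort`.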
OK, I'll send one. Please reopen to track this.
BTW, what's the purpose of this rewrite rule?
{-# RULES "foldCase/ByteString" foldCase = foldCaseBS #-}
For some reason that RULE made the benchmark faster.
What if we implemented CI using a type family? Then we could keep the original ByteString slice and do a more efficient copy to FoldedCase ByteString. I think this is the best option, but it has some compatibility issues. What do you think?
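{-# LANGUAGE TypeFamilies #-}
import qualified Data.ByteString       as B
import qualified Data.ByteString.Lazy  as BL
import qualified Data.ByteString.Short as Short
import qualified Data.Text             as T
import qualified Data.Text.Lazy        as TL

type family FoldedCase a where
    FoldedCase B.ByteString  = Short.ShortByteString
    FoldedCase BL.ByteString = [Short.ShortByteString]
    FoldedCase T.Text        = T.Text
    FoldedCase TL.Text       = TL.Text

data CI s = CI { original   :: !s               -- ^ Retrieve the original string-like value.
               , foldedCase :: !(FoldedCase s)  -- ^ Retrieve the case folded string-like value.
                                                -- (Also see 'foldCase').
               }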
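To make the idea concrete, here is a minimal sketch of how a value could be built under this design, restricted to strict ByteString so it needs only the bytestring package. The class `FoldCaseTo`, its method `foldCaseTo`, the helper `toLowerAscii`, and the constructor function `mk` as written here are all hypothetical names for illustration, and ASCII-only folding is an assumption. The sketch goes through a list of bytes rather than the more efficient direct copy mentioned above.

```haskell
{-# LANGUAGE TypeFamilies #-}
import qualified Data.ByteString       as B
import qualified Data.ByteString.Short as Short
import Data.Word (Word8)

-- Only the strict ByteString equation is reproduced in this sketch.
type family FoldedCase a where
  FoldedCase B.ByteString = Short.ShortByteString

data CI s = CI
  { original   :: !s               -- the value as supplied by the caller
  , foldedCase :: !(FoldedCase s)  -- the case-folded key
  }

-- Hypothetical class: fold a value into its FoldedCase representation.
class FoldCaseTo s where
  foldCaseTo :: s -> FoldedCase s

-- Hypothetical helper: lower-case ASCII 'A'..'Z' only (an assumption).
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- The folded copy is a ShortByteString; no second ByteString is built.
instance FoldCaseTo B.ByteString where
  foldCaseTo = Short.pack . map toLowerAscii . B.unpack

-- Keep the original slice untouched and store the folded key beside it.
mk :: FoldCaseTo s => s -> CI s
mk s = CI s (foldCaseTo s)
```

The compatibility cost shows up in the type of `foldedCase`: existing callers that expect it to return the same type as `original` would no longer type-check.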
Another reason I propose this solution is that the documentation of ShortByteString says "It is suitable for use as an internal representation for code that needs to keep many short strings in memory, but it should not be used as an interchange type."
Another approach is to provide a Data.CaseInsensitive.ByteString module which exports a specialized CIByteString type using ShortByteString internally. So, counting the ShortByteString instance, we have three options here.
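A rough sketch of what the specialized type could look like follows. The field names `ciOriginal` and `ciFolded`, the function `mkCI`, and the ASCII-only helper `toLowerAscii` are hypothetical and not part of any existing module. Equality and ordering look only at the folded key.

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Short as Short
import Data.Function (on)
import Data.Word (Word8)

-- Hypothetical specialized type: the original stays a ByteString,
-- while the folded key is stored as a ShortByteString.
data CIByteString = CIByteString
  { ciOriginal :: !B.ByteString
  , ciFolded   :: !Short.ShortByteString
  }

-- Hypothetical helper: lower-case ASCII 'A'..'Z' only (an assumption).
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- Build the folded key once, at construction time.
mkCI :: B.ByteString -> CIByteString
mkCI bs = CIByteString bs (Short.pack (map toLowerAscii (B.unpack bs)))

-- Comparisons ignore the original and use the folded key only.
instance Eq CIByteString where
  (==) = (==) `on` ciFolded

instance Ord CIByteString where
  compare = compare `on` ciFolded
```

Unlike the type family option, this leaves the existing CI type alone, at the price of a second, monomorphic API.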
Constructing a CI ByteString will ask for pinned memory, but usually the ByteString is short, so this behavior not only adds overhead but also contributes to heap fragmentation. I think we can do better here. Any ideas?