Closed frasertweedale closed 6 years ago
Interning is not a win. There is a substantial amount of additional processing and allocation just for managing the pointers that let us hash the memory without copying it.
The hs-notmuch-tag-count
program was used for testing, on a database with > 160,000 mails
and > 180,000 tags:
[T470s:~/dev/hs-notmuch] [ tag-fu● ] ftweedal% cabal run hs-notmuch-tag-count -- ~/mail.bak '*' +RTS -p
Preprocessing library notmuch-0.1.0.0...
Preprocessing executable 'hs-notmuch-tag-count' for notmuch-0.1.0.0...
Running hs-notmuch-tag-count...
fromList [("unread",58),("replied",2537),("attachment",11203),("encrypted",1),("flagged",102),("signed",8004),("inbox",161474)]
Prior to interning (using a custom Tag representation) the profile summary is:
total time = 8.34 secs (8335 ticks @ 1000 us, 1 processor)
total alloc = 244,189,264 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
withMessageHandle Notmuch.Binding src/Notmuch/Binding.chs:74:1-68 50.9 5.8
tagsToList Notmuch.Binding src/Notmuch/Binding.chs:(571,1)-(576,16) 26.4 3.9
withMessages Notmuch.Binding src/Notmuch/Binding.chs:73:1-58 16.1 2.6
detachPtr Notmuch.Talloc src/Notmuch/Talloc.chs:(29,1)-(30,61) 4.2 0.0
go Main tools/TagCount.hs:(43,1)-(49,54) 0.5 19.6
accumTag Main tools/TagCount.hs:52:1-69 0.5 35.8
tagFromCString Notmuch.Tag src/Notmuch/Tag.hs:(81,1)-(83,35) 0.2 6.6
messagesToList Notmuch.Binding src/Notmuch/Binding.chs:(587,1)-(592,80) 0.2 21.7
message_get_tags Notmuch.Binding src/Notmuch/Binding.chs:(438,1)-(439,46) 0.0 2.6
hash Data.HashMap.Base Data/HashMap/Base.hs:143:1-28 0.0 1.2
After interning it is
total time = 8.37 secs (8372 ticks @ 1000 us, 1 processor)
total alloc = 304,709,328 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
withMessageHandle Notmuch.Binding src/Notmuch/Binding.chs:74:1-68 51.3 4.7
tagsToList Notmuch.Binding src/Notmuch/Binding.chs:(564,1)-(569,31) 25.4 4.6
withMessages Notmuch.Binding src/Notmuch/Binding.chs:73:1-58 16.6 2.1
detachPtr Notmuch.Talloc src/Notmuch/Talloc.chs:(29,1)-(30,61) 4.0 0.0
intern Data.Interned.Internal Data/Interned/Internal.hs:(78,1)-(87,33) 0.5 16.9
go Main tools/TagCount.hs:(43,1)-(49,54) 0.4 15.7
accumTag Main tools/TagCount.hs:52:1-69 0.4 28.7
internTagFromCString Notmuch.Tag src/Notmuch/Tag.hs:(145,1)-(147,45) 0.2 4.8
messagesToList Notmuch.Binding src/Notmuch/Binding.chs:(580,1)-(585,80) 0.2 17.4
message_get_tags Notmuch.Binding src/Notmuch/Binding.chs:(431,1)-(432,46) 0.0 2.1
hashPtrWithSalt Data.Hashable.Class Data/Hashable/Class.hs:(714,1)-(716,23) 0.0 1.9
tagFromCString
is where the copying to pinned memory occurs. Its cost centre report goes from:
tagFromCString Notmuch.Tag src/Notmuch/Tag.hs:(81,1)-(83,35) 236 183379 0.2 6.6 0.2 6.6
to
tagFromCString Notmuch.Tag src/Notmuch/Tag.hs:(140,1)-(142,35) 249 7 0.0 0.0 0.0 0.0
Which is exactly what we expect: 183379 allocations down to 7 - one for each tag. But the overhead is much higher and is not a performance win, but it is an abject complexity loss.
Was a fun experiment. Committed to https://github.com/purebred-mua/hs-notmuch/tree/feature/intern-tags / https://github.com/purebred-mua/hs-notmuch/commit/2fda75caa1d904ed1333b985963edf5cdebfa4c2 for this historical record.
When we get a list of tags, we do a string copy of every tag. But there will be a relatively small number of tags in a database. Investigate whether interning them is a win.
See also the http://hackage.haskell.org/package/intern package.