msgpack / msgpack-cli

MessagePack implementation for Common Language Infrastructure / msgpack.org[C#]
http://msgpack.org
Apache License 2.0

Serialization / deserialization speed optimizations #357

Open KeithVinson opened 1 year ago

KeithVinson commented 1 year ago

Hello All, I am wondering how best to configure MessagePack to maximize the speed at which large objects can be read from and written to disk. In my specific case the files on disk are approaching 10 GB in size and contain fairly large Sorted Dictionaries. Loading them into memory can take 6-8 minutes. The process runs on a very beefy server with dual CPUs totaling 32 cores and 256 GB of memory. While reading from disk I see periods of up to 750 MB / second transfer speeds, but there are also periods where no disk transfers occur even though the load has not completed. The disks are NVMe in a RAID 5 array, and I see a very curious "ringdown" pattern in the disk transfers as the load proceeds. I am wondering if there are some options I should select for the MessagePack serializer that could improve performance. Any help you might suggest would be welcome.

Here is a sample of the code used to load the Dictionaries:

    public static T ReadMsgPkFile<T>(string filePath)
    {
        using (Stream stream = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            return (T)MessagePackSerializer.Deserialize<T>(stream);
        }
    }

yfakariya commented 1 year ago

MessagePack for CLI simply reads from and writes to a System.IO.Stream, so you may be able to improve performance on the I/O side. For example, you can set the buffer size of your FileStream through its constructor parameter so that reads and writes happen in larger chunks.
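As a rough illustration of the buffer-size suggestion, here is a minimal sketch that opens the file through the FileStream constructor with a larger buffer instead of File.Open. The class name `BufferedFileReader`, the 1 MiB default, and the `deserialize` delegate are assumptions made for this example and are not part of the library. The delegate stands in for whatever deserialization call you already use, so the sketch does not depend on a particular serializer API. Whether a larger buffer helps depends on where the time is actually spent, so it is worth measuring before and after.

```csharp
using System;
using System.IO;

public static class BufferedFileReader
{
    // Hypothetical default for this sketch: 1 MiB, versus FileStream's
    // documented default of 4096 bytes. Tune it by measurement.
    public const int DefaultBufferSize = 1 << 20;

    // Opens filePath for reading with the requested buffer size and hands
    // the stream to the caller-supplied deserializer.
    public static T ReadFile<T>(
        string filePath,
        Func<Stream, T> deserialize,
        int bufferSize = DefaultBufferSize)
    {
        using (var stream = new FileStream(
            filePath,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read,
            bufferSize,
            FileOptions.SequentialScan)) // hint that the file is read front to back
        {
            return deserialize(stream);
        }
    }
}
```

With the loader from the original question, a call might look like `BufferedFileReader.ReadFile(path, s => MessagePackSerializer.Deserialize<T>(s))`, where `T` is the dictionary type being loaded. If the pauses in disk activity come from CPU-bound deserialization rather than from I/O, a larger buffer alone may change little.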

KeithVinson commented 1 year ago

Thank you, I will check it out!


(Sent by email in reply to Yusuke Fujiwara's comment of Saturday, May 20, 2023, quoted above.)