adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

SharpCompress uses too much memory and it's slower #109

Closed albracko closed 8 years ago

albracko commented 8 years ago

Hi,

with the latest version, when I open a RAR archive that is 900 MB (7.5 GB uncompressed) and contains about 33k files, and try to extract a single file, memory consumption goes up by about 3 GB.

With the older version I used before, 0.10.3.0, memory consumption went up by only about 200 MB, so memory usage is roughly 15 times higher.

There is also a big difference in the time it takes to find the file you are looking for: old version 34792 ms, new version 247800 ms, roughly 7 times slower.

adamhathcock commented 8 years ago

Wow that sucks. Can you post a snippet of how you're extracting?

albracko commented 8 years ago

Hey,

Yeah, sure I can. Give me 15-30 minutes so I can get back to my machine.

Cheers

albracko commented 8 years ago

Hey,

Below is a snippet of the extraction. One difference between the old code and the new code is that the Entry object no longer has the FilePath property for checking the file name; I have to use the Key property instead. I don't know if that's the reason for the difference, though.

  foreach (var rar in rars)
  {
    // Random-access check: does this archive contain the file at all?
    using (var archive = ArchiveFactory.Open(rar))
    {
      if (archive.Entries.Any(x => x.Key.Contains(id)))
      {
        // Forward-only pass to extract the matching entry.
        using (Stream stream = File.OpenRead(rar))
        using (var reader = ReaderFactory.Open(stream))
        {
          while (reader.MoveToNextEntry())
          {
            if (!reader.Entry.IsDirectory && reader.Entry.Key.Contains(id))
            {
              reader.WriteEntryToDirectory(@"C:\", ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
              break;
            }
          }
        }
      }
    }
  }

Cheers, Alex

adamhathcock commented 8 years ago

I'll still investigate a new memory issue, but I probably won't have time until the new year. However, it is weird that you're using the random-access archive alongside the reader interface.

You might be better off skipping the reader part. If you find an entry you want just call OpenEntryStream on it.
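Roughly along these lines. This is only a sketch: `rar`, `id`, and the output path are the placeholders from your snippet, and it assumes the System.IO, System.Linq and SharpCompress.Archive namespaces are imported.

  using (var archive = ArchiveFactory.Open(rar))
  {
    // Find the matching entry via random access instead of a second forward scan.
    var entry = archive.Entries.FirstOrDefault(x => !x.IsDirectory && x.Key.Contains(id));
    if (entry != null)
    {
      // OpenEntryStream decompresses just this entry; copy it wherever it needs to go.
      using (var entryStream = entry.OpenEntryStream())
      using (var output = File.Create(Path.Combine(@"C:\", Path.GetFileName(entry.Key))))
      {
        entryStream.CopyTo(output);
      }
    }
  }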

albracko commented 8 years ago

Ok, thx for the hint. Will do that....

adamhathcock commented 8 years ago

I think this is related to this closed PR: https://github.com/adamhathcock/sharpcompress/pull/107

adamhathcock commented 8 years ago

Try the new release and see what works. I definitely recommend using Reader over Archive for performance if you don't need random access.
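For reference, a Reader-only pass (no Archive open at all) would look roughly like the inner loop you already have; a sketch assuming the same `rar` and `id` placeholders:

  using (Stream stream = File.OpenRead(rar))
  using (var reader = ReaderFactory.Open(stream))
  {
    // Single forward pass; nothing is buffered beyond the current entry.
    while (reader.MoveToNextEntry())
    {
      if (!reader.Entry.IsDirectory && reader.Entry.Key.Contains(id))
      {
        reader.WriteEntryToDirectory(@"C:\", ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
        break;
      }
    }
  }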

albracko commented 8 years ago

Hey,

So, I tried out the new version. It uses less memory, so I'd say that's fixed; speed-wise it's about the same. ☺

I also tried OpenEntryStream() as you suggested, but it doesn't work: I get an "Object reference not set" exception.

Here’s the stack:

04.01.16 08:40:01.011 0001E 00000 ERROR System.NullReferenceException: Object reference not set to an instance of an object.
   at SharpCompress.Compressor.Rar.Unpack.copyString(Int32 length, Int32 distance)
   at SharpCompress.Compressor.Rar.Unpack.unpack29(Boolean solid)
   at SharpCompress.Compressor.Rar.Unpack.doUnpack()
   at SharpCompress.Compressor.Rar.Unpack.doUnpack(FileHeader fileHeader, Stream readStream, Stream writeStream)
   at SharpCompress.Compressor.Rar.RarStream..ctor(Unpack unpack, FileHeader fileHeader, Stream readStream)
   at SharpCompress.Archive.Rar.RarArchiveEntry.OpenEntryStream()

Cheers, Alex
