microsoft / msix-packaging

MSIX SDK
MIT License
963 stars 163 forks source link

[BUG] Memory consumption is very high during the extraction of files in MSIX #592

Closed guptashwGit closed 4 months ago

guptashwGit commented 11 months ago

Project MSIX SDK

Describe the bug When extracting the files in the package, The memory consumption is very high

To Reproduce

  1. Use https://github.com/microsoft/msix-packaging/tree/MSIX-Core-1.2-release
  2. Run the ExtractContentsSample executable with MSIX package with significant number of files (For e.g we have tried the files only with MSIX Package which contains 7913 files and the size of VFS folder is 718MB)
  3. When the executable is running - its memory size is getting increased (Please, refer to the snapshot wrt the memory profiling of ExtractPayloadFiles method at different stages. The memory has been increased from 64MB to 554MB

Expected behavior The memory should be released once when it is not required. Wrt the implementation - the input file stream created here: https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/sample/ExtractContentsSample/ExtractContentsSample.cpp#L223 should be removed after the exit of this method as input file stream memory is not required after the completion of this method. However, it is not cleaning up as the reference count of it is not 0 (probably, because the input stream pointer is associated with each AppxFile object: https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/src/inc/internal/AppxFile.hpp#L90 and all these file objects are part of the entire payload files vector: https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/sample/ExtractContentsSample/ExtractContentsSample.cpp#L269 We should release the memory of input as well as output stream after the execution of [ExtractFile](https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/sample/ExtractContentsSample/ExtractContentsSample.cpp#L203) method.

Screenshots image

Platform Windows 10 64 bit (32 GB RAM and AMD EPYC 7713 64 bit Core Processor 2.00 GHZ wth 2 processors)

Additional context We are also observing one weird thing - the system has significant amount of memory but we are getting out of memory error for our MSIX package under SYSTEM account with no user logged in (using SCCM)( the MSIX package installation is failing with ERROR_OUTOFMEMORY once it is capturing Memory ~540 MB). The same issue is not happening when some user is loggedinto the machine and the installation is triggered in SYSTEM context only. Please, note that we are using the MSIX SDK to enable the installation of files through MSIX in actual physical location.

msftrubengu commented 11 months ago

The memory of the input streams are not released because the owner is the IAppxPackageReader object (by the time the stream is retrieve in ExtractFile the ref count should be 3 or so). The output stream is released once the ExtractFile call is completed.

What is happening here is that one the stream gets copied into the output stream, if the stream hasn't been inflated we will allocate some buffers. Those allocations are what you are seeing between the two breakpoints in the screen shot.

Now, I agree that this is problematic for big packages. The easiest way to fix it would be to clear those buffers once the inflate is done which is basically calling

                self->m_compressedBuffer.get()->clear();
                self->m_compressedBuffer.get()->shrink_to_fit();
                self->m_inflateWindow.get()->clear();
                self->m_inflateWindow.get()->shrink_to_fit();

in this block https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/src/msix/unpack/InflateStream.cpp#L116

If you can easily repro can you try it?

As of the system account issue, I'm not that really familiar so I can't assure that there's a memory limit where there are no users logged in.

guptashwGit commented 10 months ago

@msftrubengu Thanks a lot for your reply. We have tried it and it is working. I have one query wrt parallel installation of multiple MSIX packages via MSIX SDK. The ExtractFile is using IStream class which is inherited from ISequentialStream. Is it right to invoke fileStream->CopyTo from different threads simultaneously ? We have observed some crashes in the past wrt parallel extraction of MSIX packages and hence, we want to understand whether the parallel extraction using ISequentialStream is supported or not.

msftrubengu commented 10 months ago

Sadly the MSIX SDK doesn't support parallel unpack, but I don't see how just calling CopyTo in a different thread after the IAppxFile was obtained via IAppxFilesEnumerator::GetCurrent will fail.

You just need to make sure that the IAppxFilesEnumerator calls are made in the main thread and being careful about the ref count for it.

I might be wrong, but it is worth trying.

guptashwGit commented 9 months ago

@msftrubengu Just want to confirm wrt parallel unpacking of MSIX packages using MSIX-SDK Is it ok to perform unpack operation for different MSIX packages together ? So, we can always perform the unpack operation of a single MSIX package in a single thread but it is completely alright to invoke the unpack of multiple MSIX packages together in separate threads.

As, per your above comment: You just need to make sure that the IAppxFilesEnumerator calls are made in the main thread Currently, we are keeping a global mutex here: https://github.com/microsoft/msix-packaging/blob/fae9d57e627b4331d362a6f3430fdf1f7e31df9d/MsixCore/msixmgr/Extractor.cpp#L92 (as in our case Multiple threads can invoke Extractor::ExtractPayloadFiles() together and hence, using a global static mutex we are controlling the execution of this method's logic for one MSIX package at a time)

Due, to this change - we are facing performance issues and hence, I am exploring the option to keep the unpacking of multiple MSIX packages in separate threads at the same time.

guptashwGit commented 8 months ago

@mtodd @tpope @sverrejoh @msftrubengu : I am still waiting for the reply on my above query.

msftrubengu commented 4 months ago

As long is in a different thread calling unpack should work.