ArsenalRecon / Arsenal-Image-Mounter

Arsenal Image Mounter mounts the contents of disk images as complete disks in Microsoft Windows.
https://ArsenalRecon.com/weapons/image-mounter

Write cache Major problem on Arsenal #45

Closed rimaxr closed 6 months ago

rimaxr commented 6 months ago

Hello,

I am encountering an issue across all versions of Arsenal. The problem arises when you enable write cache on a local disk, and it gets worse if the local disk has a higher write speed (SSD/NVMe). You can reproduce the issue by mounting a VHD image from a Linux Samba share or an iSCSI drive and enabling write cache. Initially, it starts writing the differential file, but after a while, the write speed drops to 0 and the computer hangs. It appears to be losing access to the virtual drive. We tried on Windows 10, both fully updated and not, and on Windows 11, with the same behavior. Here are the commands I use, to help you reproduce:

  1. VHD: c:\ArsenalImageMounter\aim_cli.exe /mount /filename="\\x.x.x.x\Name.vhd" /writeoverlay=C:\LocalDiffs\Name.vhd.diff

  2. iSCSI: c:\ArsenalImageMounter\aim_cli.exe /mount /readonly /provider=DiscUtils /filename=\\?\PhysicalDrive2 /writeoverlay=C:\LocalDiffs\Name.diff

Originally posted by @rimaxr in https://github.com/ArsenalRecon/Arsenal-Image-Mounter/issues/42#issuecomment-1963950046

LTRData commented 6 months ago

Thanks for your report.

Could you explain a bit what your intention is with aim_cli.exe /mount /readonly /provider=DiscUtils /filename=\\?\PhysicalDrive2 /writeoverlay=C:\LocalDiffs\Name.diff? It looks a bit strange to me. A raw disk should not have DiscUtils libraries as provider. Also, have you made sure that PhysicalDrive2 is offline before doing this? Otherwise, updates may occur on it while the disk is mounted as another disk by AIM, and that could lead to strange file system corruption that could cause behavior like this.

But I'll do some experiments too and see if I can reproduce this behavior!

rimaxr commented 6 months ago

Hello,

Don't worry about mounting physical disks; conduct your tests on a VHD image as it exhibits the same behavior. It could be an empty VHD; it doesn't matter. When you enable write cache on a local disk and attempt to copy large files inside it, the system hangs. I tried copying 4x 4GB files.

I conducted the test with the VHD locally on the same disk, not via the network, and encountered the same results (system hang).

Regarding the question about mounting the physical disk, it behaves the same whether the provider is set to "none" or the physical disk is offline.
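For anyone who wants to script the copy test described above instead of dragging files in Explorer, here is a minimal sketch in Python. It writes a few large files into a target directory and prints the write throughput about once per second, so the moment the speed drops to 0 can be timestamped. The function name `write_large_files`, the file names, and the defaults are made up for illustration; none of this is part of Arsenal Image Mounter, and it writes generated data rather than copying existing files, so it only approximates the original test.

```python
import os
import time


def write_large_files(target_dir, file_count=4, file_size=4 * 1024**3,
                      chunk_size=4 * 1024**2, report_every=1.0):
    """Write file_count files of file_size bytes each into target_dir.

    Returns a list of (seconds_since_start, bytes_per_second) samples.
    Writes go through the normal OS file cache, like an ordinary copy.
    """
    chunk = os.urandom(chunk_size)  # incompressible data, reused for every write
    samples = []
    start = last = time.monotonic()
    written_since_last = 0
    for index in range(file_count):
        # Hypothetical file naming, chosen only for this sketch.
        path = os.path.join(target_dir, f"testfile_{index}.bin")
        remaining = file_size
        with open(path, "wb") as handle:
            while remaining > 0:
                piece = chunk[:min(chunk_size, remaining)]
                handle.write(piece)
                remaining -= len(piece)
                written_since_last += len(piece)
                now = time.monotonic()
                elapsed = now - last
                if elapsed >= report_every and elapsed > 0:
                    rate = written_since_last / elapsed
                    samples.append((now - start, rate))
                    # If the machine hangs, the last printed line shows
                    # roughly when the writes stopped making progress.
                    print(f"{now - start:8.1f}s  {rate / 1024**2:10.1f} MiB/s",
                          flush=True)
                    last, written_since_last = now, 0
    return samples
```

It would be called with the path of the mounted drive, for example `write_large_files("E:\\")`, where `E:` is a hypothetical drive letter for the AIM disk.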

LTRData commented 6 months ago

I have tried something similar now a few times, but I cannot say I see anything like this issue.

Here, for example, I copied 4 x 6 GB ISO images. The VHD was mounted from a Samba share, with the diff on a local disk, which is a very fast NVMe in this case. [image]

When the computer hangs, how "serious" is the hang? Is it possible to open any applications? To close open applications? If you have Task Manager with the performance tab open before this happens and keep an eye on the graph and metrics for the local disk when the hang happens, does it seem to continue with lots of disk I/O on the local disk while it hangs? How is the RAM usage? Does it increase a lot before the hang and then slowly drop while the OS is hung? There could be other cache-related issues going on here.

rimaxr commented 6 months ago

Let's start by saying that I want to thank you for the quick responses.

If you use --autodelete on your mount, you will notice that the differential file doesn't reflect what you've copied (it doesn't grow at all); it retains the size of the initial differential file that was created when you mounted it.

With --autodelete, it appears that it utilizes the RAM for caching, and upon unmounting, the local differential file becomes empty again!

LTRData commented 6 months ago

Thanks for this explanation too. I actually tried both with and without autodelete, but I could not see any noticeable difference in the behavior. The only real difference is that with autodelete, it is not easy to follow the true file size because it is not possible to query file properties (the file has a pending disposal in the file system and cannot be opened again). It is possible to see the file size cached in the directory listing though, but it always stays at zero and is not really useful.

rimaxr commented 6 months ago

Without the autodelete option, the program retains all changes in the differential file. This is how the local differential should behave. Whenever I mount the image again, I have my additional files inside (the diff is merged with the original image), until I manually delete the diff file to return to my original files.

However, with the autodelete option, every time I dismount, I lose my diff files, as they are temporary.

LTRData commented 6 months ago

Yes absolutely, that is how it is designed to work. I just meant that I could not see any difference in file copying speed, hangs etc with or without that switch. There was a small difference in RAM usage, but not very big.

rimaxr commented 6 months ago

Hello again, I recorded the behavior and am attaching the file as a zip. The tests give the same result on 3 different PCs with Windows 10. In the video you can see that when the speed goes to 0, the mounted drive is no longer readable and many other things cannot be opened, until the PC is useless and needs a restart. [Uploading My Movie.zip…]() I put it on Streamable too: https://streamable.com/b54tae

LTRData commented 6 months ago

I have tried a few times more on different Windows 10 and 11 systems and I cannot see anything similar. I do see an increase in RAM usage when copying large files so it fills up a file system cache, but over time it flushes down to the AIM drive and the differencing image file. But in your case it seems it just uses up all available RAM cache space and the flush to disk never happens, or gets stuck somewhere. Could you try this on a freshly installed system with no other applications installed and see if you see anything similar?

Oneirot commented 6 months ago

Hey Olof, I'm working with rimaxr. I also want to thank you as well for the swift responses and the help you have been providing.

We did some further testing and got some interesting results. We did a fresh install on a new computer with Windows 11, a Gen4 NVMe, and 32 GB of RAM, and the problem was gone. Then we did a fresh install of Windows 10 on the same computer, and it still worked without hanging.

We then went back to our previous testing rig, swapping out the SSD we had been using for the NVMe and installing Windows 11 fresh. It still worked.

For our next test we swapped the original SSD back in with a fresh Windows 11 install, and sure enough, the problem came back. Then we tested on a second computer with a different SSD and also got it to crash.

It seems that if both the files being read and the diff are on an SSD, we get this weird behavior. We even tried to mix and match: with Windows 11 and the data to copy on the NVMe and the diff on the SSD, we got very inconsistent copies, regularly dipping to 0 and maxing out the RAM, with the computer getting freezy while it tried to empty the RAM afterwards. We got it to hang once, but generally it was kind of working, albeit with frequent freezes and a lot of waiting.
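To compare runs like these more objectively, the dips to 0 can be pulled out of a throughput log. The sketch below assumes a list of (seconds since start, bytes per second) samples collected by whatever logging is at hand. The function name `find_stalls` and the thresholds are illustrative choices, not anything provided by AIM.

```python
def find_stalls(samples, threshold=1024 * 1024, min_duration=5.0):
    """Return (start, end) pairs where throughput stayed below threshold.

    samples: iterable of (seconds_since_start, bytes_per_second) in time order.
    threshold: rates below this many bytes per second count as stalled.
    min_duration: shorter dips than this many seconds are ignored.
    """
    stalls = []
    stall_start = None
    last_time = None
    for timestamp, rate in samples:
        if rate < threshold:
            if stall_start is None:
                stall_start = timestamp  # first slow sample of a possible stall
        else:
            if stall_start is not None and timestamp - stall_start >= min_duration:
                stalls.append((stall_start, timestamp))
            stall_start = None
        last_time = timestamp
    # A stall still in progress when the log ends (for example, a hard hang).
    if stall_start is not None and last_time - stall_start >= min_duration:
        stalls.append((stall_start, last_time))
    return stalls
```

A run that never dips returns an empty list, which makes it easy to tell a clean NVMe run from a mixed NVMe/SSD run at a glance.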

LTRData commented 6 months ago

Thanks for testing! I never tried with source data on the same drive as the diff. I could do some tests like that too and see what happens. I am really curious now. If I could get this to happen on a virtual machine running under a kernel debugger, it would be a lot easier to analyze the hang.

LTRData commented 6 months ago

I think I have seen the same behavior here now. I tried to copy large files from local disk instead of from network location and it looks very similar to your video. I'll try to run it in a debugger and see what I can find and what we could do about it!

LTRData commented 6 months ago

Found and fixed! Thanks again for reporting this issue and for your help with debugging it!