Open msedi opened 3 years ago
Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.
| Author: | msedi |
|---|---|
| Assignees: | - |
| Labels: | `api-suggestion`, `area-System.IO`, `untriaged` |
| Milestone: | - |
Our team would be very interested in further discussion of this topic. Would there be a chance to do so? We could even help improve this, but we would need some agreement and further discussion first. Since there is some interest in this topic, I'll list the related issues (incomplete) here for reference: #59606, #57330, #37227, #59405, #62768, #69365, #48793, #941, #24990, #24805. Many of them are still open; many others have been closed but not solved.
I can see that memory mapped files might be a niche topic, but I think there is interest.
@jeffhandley Would it be possible to get your insight as an area owner?
@msedi Would it be useful to have something for `madvise` "won't need", even if it was a no-op on Windows? `DiscardVirtualMemory` has different semantics.
Background and motivation
Currently, the MemoryMappedFile API is great, but it doesn't expose some features that are available in the WinAPI. While I understand that other platforms are also supported, it would be nice to find a common way to extend the API.
The suggestions are:
Enable FileOptions: Currently, MemoryMappedFile.CreateFromFile does not accept a FileOptions argument. Some of the FileOptions values may be useless here, others not (e.g. FileOptions.DeleteOnClose). There is already a MemoryMappedFileOptions enum that could be extended, so that the options are abstracted rather than using FileOptions directly.
Large Page Support: It is not possible to use Large_Pages. From the WinAPI I can see that this is only possible for mappings backed by the system paging file and not for file-backed mappings, but I don't have much experience with that.
More control over memory: It's currently not possible to invalidate memory (DiscardVirtualMemory) so that the memory manager can ignore these areas and not write them back to the paging file. Additionally, it is not possible to mark pages as "not in use" (VirtualUnlock) so that the memory manager can page them out earlier. It is also not possible to prefetch pages (PrefetchVirtualMemory).
Better backing file control: Some options seem to allow for better performance, although I do not have good benchmarks. SetFileValidData and sparse file support are the keywords. In my current tests, using sparse files reduced the run time from 150 s to 100 s.
Flush areas: FlushViewOfFile allows more control over which areas are flushed, and this should be available in MemoryMappedViewAccessor.Flush.
Do not flush the view on disposal: Currently the file stream is flushed when the memory-mapped structures are disposed, which causes an enormous performance drop when the file is opened with FileOptions.DeleteOnClose. Since the file is deleted on disposal, it doesn't really make sense to flush it to the backing file.
Control over the working set: While working with memory-mapped files I produced a tremendous amount of memory and very quickly came to a point where the memory was exhausted and the memory manager started to page and offload data to the pagefile. While this behavior is OK and depends on the OS (I was told that Linux handles memory-mapped files better?), it started way too late, so I built my own "unmanaged GC" that watches over the memory-mapped files in a thread. The problem, though, was that even when I unlocked the memory region (VirtualUnlock) the working set was still high, so I needed to enforce a flush of the working set. All of the Virtual* methods are non-blocking, though, and I haven't found a good way to wait until the working set reaches a better condition. Calling EmptyWorkingSet is maybe not the best solution here (and it's also non-blocking). Some other approaches would be welcome. I also assume that putting EmptyWorkingSet in the API proposal would cause unwanted effects, but I have put it here for discussion. It would be good if the API methods were abstracted somehow and the true API not exposed. It is of course welcome to discuss this proposal.
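To make the "wait until the working set reaches a better condition" problem concrete, here is a minimal polling sketch. It is not part of the proposal and not what my "unmanaged GC" does internally. The `WorkingSetWatcher` class, the method name, and the threshold, timeout, and poll interval parameters are all hypothetical. It only uses `Process.Refresh` and `Process.WorkingSet64`, and it assumes that polling is acceptable for the caller.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, for illustration only.
static class WorkingSetWatcher
{
    // Polls the current process until its working set drops below
    // thresholdBytes. Returns true on success, false if the timeout
    // elapsed first.
    public static bool WaitUntilBelow(long thresholdBytes, TimeSpan timeout, TimeSpan pollInterval)
    {
        using var process = Process.GetCurrentProcess();
        var clock = Stopwatch.StartNew();

        while (true)
        {
            // Process caches its counters, so refresh before every read.
            process.Refresh();
            if (process.WorkingSet64 < thresholdBytes)
                return true;

            if (clock.Elapsed >= timeout)
                return false;

            Thread.Sleep(pollInterval);
        }
    }
}
```

A call such as `WorkingSetWatcher.WaitUntilBelow(8L << 30, TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(50))` could follow the unlock step. Polling is only a workaround, since it does not tell the memory manager anything, which is why a real API for this would be welcome.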
Also, I'm not too advanced in memory mapped files.
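For reference on the FileOptions suggestion above, this is a minimal sketch of how FileOptions.DeleteOnClose can be combined with a mapping today, assuming the `CreateFromFile` overload that takes a `FileStream` is used instead of the path-based one. The class name, method name, and buffer size are hypothetical and only for illustration. This is also the setup in which the flush on disposal becomes noticeable.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical helper, for illustration only.
static class DeleteOnCloseMapping
{
    // Maps a temporary file that is removed when the stream is closed,
    // writes one value through the view, and reads it back.
    public static long WriteAndReadBack(string path, long capacity, long value)
    {
        // FileOptions can only be passed via the FileStream, not via
        // MemoryMappedFile.CreateFromFile(path, ...).
        using var stream = new FileStream(
            path, FileMode.Create, FileAccess.ReadWrite, FileShare.None,
            4096, FileOptions.DeleteOnClose);

        // leaveOpen: true keeps ownership of the stream with the caller;
        // the file is extended to 'capacity' bytes.
        using var mmf = MemoryMappedFile.CreateFromFile(
            stream, mapName: null, capacity,
            MemoryMappedFileAccess.ReadWrite, HandleInheritability.None,
            leaveOpen: true);

        using var view = mmf.CreateViewAccessor(0, capacity);
        view.Write(0, value);
        return view.ReadInt64(0);
    }
}
```

The using declarations dispose the view first, then the mapping, then the stream, so the file is deleted only after the mapping is gone. An extended MemoryMappedFileOptions would make this detour through `FileStream` unnecessary.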
API Proposal
API Usage