Open Largo opened 1 year ago
Test report with Ruby 3.2.0-1
on Windows 11.
with Ruby 3.1.3-1
on Windows 11.
After a lot of learning, I've gotten the debugger working. I'll send a merge request to ruby-loco that will explain how to compile with the debug symbols and use gdb. With the VSCode "gdb-debugger beyond" extension, I could now even set breakpoints. I hope this will help to get more people to help out with the windows version of ruby.
~~In the attached csv above you can see that for instance QueryNameInformationFile is called on C:\tmp (the base path) 692 times, which must be a waste, since it wouldn't change so often. Out of the 39557 syscalls there must be quite a few duplicates. I think it's mostly the winnt_stat function that should run less often. Maybe we can cache the result?~~
I now believe it's also due to the fact, that ALL gem folders are being searched for files and a lot of syscalls seem to be just duplicates. The wint_stat function is called a bit too often as well. I'm kind of out of ideas at this point. Concurrent-ruby does have a bit too many files for windows to handle quickly, I suppose.
Strange example paths where require searches:
ruby\gems\3.3.0+0\gems\benchmark-0.2.1\lib\tzinfo.so
ruby\gems\abbrev-0.1.1\lib\concurrent\concurrent_ruby_ext.so
ruby\gems\3.3.0+0\gems\base64-0.1.1\lib\tzinfo.rb
Compile Ruby on Windows and Debug with GDB
Edit:
Found this related effort
So after learning a bit more about ruby gems, I'd say its rubygems and it's overwriting of Kernel.require that loads a lot of files. There is Shopify/bootsnap (doesnt work with windows) and the faster_require (no benefit)gem, which have interesting approaches. I've tried a lot with yomikomu. Yomikomu makes it a bit faster, if you compile the instructions and put them into a database. The problem is, that it still does all of those slow syscalls to the files.
The most obvious approach is to merge all files of a gem into one file. I can't get this to work because I can't use Kernel.require to load from a string instead of a file... There are reasons for this of course. Kernel.eval doesn't seem to work for this purpose either.
learnings: Kernel.require adds the path to $" / $LOADED_FEATURES, if it is there it won't load it again. You can start ruby without rubygems enabled
Edit: Why can I File.read all the files for gtk3 on the loadpath in 0.1 seconds but with require takes 4 seconds?
I filed a bug upstream: https://bugs.ruby-lang.org/issues/19378
Another report in my recent work:
load
a file with 20ms
load
the same file with only 2ms
The influence is really huge.
Besides, I'm trying to accelerate the speed of gem bin command on Windows, but met a huge astonishment.
On Windows
On Linux
For your information, the process of ruby -e ""
even didn't load rubygems
.
OT: Happy New Year to all contributors and thank you so much for all the work you do for us windows rubyists! RubyInstaller is a fantastic project and really important for the community. I'd donate, but I cannot find any donation information. I live in Japan and if there's anything I could do, to help this project, for instance try to get funding from the Ruby Association, let me know!
Call for Feedback
For now, I'm most interested to know, if the rubyinstaller devs are aware of this and have already tried to improve this. And if there are any ideas on how to improve this and in which project (rubyinstaller, ruby, msys, windows itself). It's not a huge problem and so it's okay as it is, but it would be helpful for ruby scripts that are run in sequence.
What problems are you experiencing?
Require take xx times longer than on linux. If the gem has a lot of files like ruby-concurrent (dependecy of tzinfo), it takes up to 2 seconds on my machine.
From what I've researched, I assume it's the File IO, which is making it slow and not ruby itself.
I heard stat is slow on windows, maybe it is something like that and we could replace the syscall for windows?
Steps to reproduce
Tested on various different Windows 10-11 systems, with ruby 2.x to 3.2.
Sysinternal Process Monitor shows Events like QueryAllInformationFile and QueryInformationVolume, which might not be necessary.
What's the output from
ridk version
?related slow ruby require concurent Optimize for faster loading time and win32/file.c refactoring
Update: read the source code of win32.c and file.c for a bit and made my own build using ruby-loco. Sadly, not my area of expertise, but I found the following previous efforts:
Optimize for faster loading time and win32/file.c refactoring
The Ruby Concurrent Gem consists of a lot of files that are all loaded just to use Concurrent::Map in tzinfo. Loading files on Windows is a lot slower. There are 39558 API calls after require tzinfo: CreateFile, ReadFile, CloseFile, CreateFileMapping, LoadImage, UnlockSingleFile ,LockFile, QueryAllInformationFile, QueryIdInformation, QueryNameInformationFile, QueryNormalizedNameInformationFile, QueryInformationVolume. Some of those probably for a statcall. Multiple just to check if a directory is there (can be optimized?)
I recompiled Ruby to load all files with the FILE_FLAG_SEQUENTIAL_SCAN (O_SEQUENTIAL) Flag, but it didn't make it (much) faster.
I've read that NtCreateFile might be faster. but couldn't test that.
Logfile.CSV
Update 2023-01-12: Maybe we can put the File Close call after each require into a separate thread? https://gregoryszorc.com/blog/2015/10/22/append-i/o-performance-on-windows/