mattmartini opened this issue 9 months ago
Thanks. It seems that writing to the file system needs to be limited as well (besides limiting the number of open network connections).
On Linux systems, `ulimit -n` shows the number of open files allowed (see https://ss64.com/bash/ulimit.html). Often, the default is 1024. Could you check what the value is on your OS?
$ ulimit -n
256
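As a workaround on macOS, the soft limit can be raised for the current shell session before running bogrep (this is a sketch, not a fix in bogrep itself; the target value of 1024 is an assumption and is capped by the hard limit):

```shell
# Show the current soft limit for open file descriptors
ulimit -Sn
# Show the hard limit, i.e. the ceiling the soft limit may be raised to
ulimit -Hn
# Raise the soft limit for this shell session only (macOS often defaults to 256)
ulimit -n 1024
ulimit -n
```

The change only affects the current shell and its children, so bogrep must be started from the same session.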
The above fix did not solve the problem.
$ bogrep init
Imported 8007 bookmarks from 2 sources: /Users/USERNAME/Library/Application Support/Firefox/Profiles/8cycwilz.Everyday_Usage/bookmarkbackups/bookmarks-2023-11-29_9410_DCFJZCTgp91yEKP+oZsoDA==.jsonlz4, /Users/USERNAME/Library/Application Support/Google/Chrome/Default/Bookmarks
Error: Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/97d40634-be8c-4332-987b-6bb9f576d2e4.txt: Too many open files (os error 24)
$ (197/8007)
Sorry, it's not fixed yet. I will let you know when it's ready.
Fixed on main branch. Will prepare release 0.6.0.
I set the default for `max_idle_connections_per_host` in `settings.json` from 100 to the more sensible 10. Idle connections were stuck in the connection pool. Idle connections will be removed after 5 seconds (see `idle_connections_timeout` in `settings.json`).
See the documentation: https://docs.rs/bogrep/latest/bogrep/struct.Settings.html#fields
Please remove your bogrep config folder with the old `settings.json` before running bogrep.
If you still get a "Too many open files" error, try to decrease `max_idle_connections_per_host` to 5, and I will update the default accordingly.
"max_concurrent_requests": 10,
"max_idle_connections_per_host": 5,
Seems to work (slowly). However, I got this error (I believe this is a Chrome bookmark):
Error: Can't get host for url: javascript:void(location.href='http://tinyurl.com/create.php?url='+encodeURIComponent(location.href))
Pushed an improvement to main branch: https://github.com/quambene/bogrep/pull/63
Fetching will not be aborted any more if an expected error occurs, so you should be able to finish processing (but with a few warnings instead).
You can try to set `max_concurrent_requests` to 100 again.
There is still something odd with macOS which I have to investigate. On Ubuntu, I'm able to fetch 500 concurrent requests, and it's finished quickly.
Bumped `max_concurrent_requests` back up to 100. Still getting many "Too many open files" errors, some for creating the cache file and some for fetching a website.
Only 2783 cache files were created out of 8007 bookmarks. This seems very low; there are probably many dead links in the bookmarks, but not 71%.
[2023-11-30T23:37:34Z WARN bogrep::cmd::fetch] Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/720450d1-9108-4e76-a2ff-1b1431cca0c5.txt: Too many open files (os error 24)
[2023-11-30T23:37:34Z WARN bogrep::cmd::fetch] Can't fetch website: error sending request for url (http://www.j.nurick.dial.pipex.com/Code/Perl/index.htm): error trying to connect: dns error: proto error: io error: Too many open files (os error 24)
[2023-11-30T23:36:44Z WARN bogrep::cmd::fetch] Can't fetch website: error sending request for url (http://support.apple.com/kb/HT1159#mac_pro): error trying to connect: dns error: proto error: io error: Too many open files (os error 24)
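To confirm that descriptors are actually being exhausted during a fetch, the open descriptor count of the bogrep process can be checked with `lsof` (a diagnostic sketch; the current shell's PID is used below as a placeholder, and the `pgrep` pattern in the comment is an assumption):

```shell
# Count open file descriptors for a PID. While a fetch is running,
# substitute bogrep's PID, e.g. pid=$(pgrep -n bogrep).
pid=$$
lsof -p "$pid" 2>/dev/null | tail -n +2 | wc -l
```

If the count approaches the `ulimit -n` value (256 here), the "os error 24" failures are expected.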
Dropped `max_concurrent_requests` to 20. Much slower (15:58 min to try and fetch 8007 URLs), but only got warnings for `javascript:` bookmarks.
[2023-11-30T23:50:11Z WARN bogrep::cmd::fetch] Can't get host for url: javascript:void(location.href='http://tinyurl.com/create.php?url='+encodeURIComponent(location.href))
Retrieved 5447 out of 8007 bookmarks.
On another note, you should do point releases like v0.6.1 ;-)
Replaced package `bogrep v0.6.0 (/Users/USERNAME/Projects/BookMarks/bogrep)` with `bogrep v0.6.0 (/Users/USERNAME/Projects/BookMarks/bogrep)` (executable `bogrep`)
Thanks for checking!
A 71% error rate is indeed too much and is explained by the "Too many open files" errors, which prevent fetching and caching.
I would have expected that 100 concurrent requests would work without issues though.
For example, on Ubuntu I was fetching 500 concurrent requests successfully, which is explained by:
500 open files + 500 open connections = 1000 open file descriptors < 1024
where 1024 is the default limit for open files on Ubuntu.
The same calculation doesn't seem to work for macOS, where we have:
100 open files + 100 open connections = 200 open file descriptors < 256
I will dig a bit more into why the expected 100 for `max_concurrent_requests` is not working on macOS.
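The back-of-the-envelope check above can be scripted against the local limit (the factor of two descriptors per request, one cache file plus one connection, is an assumption taken from this thread's reasoning):

```shell
# Estimate descriptor demand for a given concurrency level and compare it
# to the shell's open-file limit. Assumes each in-flight request holds
# roughly one cache file handle plus one socket.
max_concurrent=100
needed=$((2 * max_concurrent))
limit=$(ulimit -n)
if [ "$needed" -lt "$limit" ]; then
  echo "ok: need ~$needed of $limit descriptors"
else
  echo "risk: need ~$needed but limit is $limit"
fi
```

On a stock macOS shell with a limit of 256, this prints "ok" for 100 concurrent requests, which is why the observed failures there are surprising.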
Unfortunately, most releases include breaking changes; that's why I'm increasing the minor version. The next release includes a bugfix without breaking changes, though, and will be v0.6.1 :)
When trying to do a `bogrep fetch`, I am getting the error below.

$ bogrep fetch
Error: Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/78aa542f-52c1-4b5e-b475-15293854996a.txt: Too many open files (os error 24)
$ (140/8005)
I tried setting
"max_concurrent_requests": 50,
and still get this issue.

OS: Darwin 23.1.0 - macOS 14.1.1 (Sonoma)
Version: bogrep 0.5.0