Closed rbugajewski closed 1 year ago
The Fever API limits the maximum number of articles to 50 per request (the Miniflux origin API has a similar limitation), so elfeed-protocol's curl calls fetch only 50 articles per update operation. This can be a performance bottleneck if there is a huge number of unread or starred articles. However, if your network is OK, I don't think fetching 1000 articles will take more than 10 minutes.
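For scale, a quick back-of-the-envelope for the batching described above (50 is the Fever per-request cap; the 1000-article count is just an example):

```shell
# With the Fever API's 50-items-per-request cap, syncing N articles
# needs ceil(N / 50) separate curl requests.
articles=1000
batch=50
requests=$(( (articles + batch - 1) / batch ))
echo "$requests requests for $articles articles"   # -> 20 requests for 1000 articles
```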
Besides, if you want to fetch a huge number of articles after the first sync operation, the following code may be useful:
(setq my-elfeed-update-timer
      ;; decrease the seconds depending on your network performance
      (run-at-time 15 15
                   (lambda ()
                     (when (= elfeed-curl-queue-active 0)
                       ;; use elfeed-protocol-ttrss-update-older instead for older articles
                       (elfeed-protocol-ttrss-update "ttrss+https://user@host")))))
(cancel-timer my-elfeed-update-timer)
Thanks for your fast response.
However, if your network is OK, I don't think fetch 1000 articles will take more than 10 minutes.
This speed would be perfectly acceptable, as I could run the initial sync overnight; the incremental syncs will be much faster. I can't achieve such speeds yet, but I added your hints to my configuration, which now looks like this, and I will run it overnight once again:
(use-package elfeed-protocol
  :after elfeed
  :config
  (setq elfeed-use-curl t)
  (defadvice elfeed (after configure-elfeed-feeds activate)
    "Make elfeed-org autotags rules work with elfeed-protocol."
    (setq elfeed-protocol-tags elfeed-feeds)
    (setq elfeed-feeds (list
                        (list "fever+http://admin@news.example.com"
                              :api-url "http://news.example.com/fever/"
                              :password "p4s5w0rd"
                              :autotags elfeed-protocol-tags))))
  (setq my-elfeed-update-timer
        ;; decrease the seconds depending on your network performance
        (run-at-time 15 15
                     (lambda ()
                       (when (= elfeed-curl-queue-active 0)
                         ;; use elfeed-protocol-ttrss-update-older instead for older articles
                         (elfeed-protocol-fever-update "fever+http://admin@news.example.com")))))
  (cancel-timer my-elfeed-update-timer)
  (elfeed-protocol-enable))
I’m not sure what the bottleneck is, because two other clients I have tested can sync much faster. The sync takes place over a 1 Gbps connection, and I verified that the network bandwidth isn’t saturated. The server’s load is way below its limit. Emacs on the client, however, takes 100% CPU during elfeed-update.
I’ve tried to check how fast the sync is on my machine and I don’t think that it would finish overnight with this speed:
➜ ~/.elfeed/data for i in {1..6}; do find . -type f | wc -l; sleep 10; done
1284
1284
1284
1284
1325
1325
This is about 50 entries per minute.
This runs on a 2.9 GHz Quad-Core Intel Core i7 (Turbo Boost up to 3.8 GHz). Is there something else I can try to make the sync faster?
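For reference, the quoted rate can be derived from the two distinct counts in the sampling loop above (a sketch; five 10-second sleeps between the first and last snapshot give a roughly 50-second window):

```shell
# Two snapshots of the ~/.elfeed/data file count, ~50 seconds apart.
start=1284
end=1325
seconds=50
echo "$(( (end - start) * 60 / seconds )) entries per minute"   # -> 49 entries per minute
```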
Oh, please ignore the last line in my example code; the update timer only works for the incremental syncs.
(cancel-timer my-elfeed-update-timer)
This is about 50 entries per minute.
It looks like your network bandwidth and computer performance are both fine. Maybe the resulting JSON buffer is too large, so Emacs parses it slowly and takes 100% CPU the whole time. You could capture the curl command line (just ignore the prefix args --disable --compressed --silent --location -w) and debug it in the console:
ps aux | grep curl
I now have a much faster machine and wanted to retry my elfeed-protocol setup, but now I’m getting errors on the initial sync:
[error]: http://news.example.com/fever/?api&items&with_ids=260685,266085,324987,266082,266083,266086,266084,339460,283184,189692,264061,276098,270801,324989,294909,292525,308143,294544,262102,266087,284374,308409,325956,263106,266088,275414,299249,294833,308065,266089,278032,292528,275307,276075,262354,262604,266094,266093,266095,300231,271718,266092,262945,263295,264418,266096,264640,266097,262231,262232: "(126) Unknown curl error!"
[error]: http://news.example.com/fever/?api&items&with_ids=271721,271720,356356,278034,270728,278036,270727,278035,275053,287347,270867,260904,261329,325216,294837,260850,261251,330763,261202,278233,262610,262952,264724,266255,261437,271507,289059,262592,262589,260674,238585,284278,260792,260152,263186,287349,287348,261813,276345,286386,265802,263435,259634,264702,264908,268874,289060,263437,313130,287353: "(126) Unknown curl error!"
How can I debug this?
I'm not sure, how about you run curl directly in bash:
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&feeds'
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&saved_item_ids'
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&unread_item_ids'
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&items&with_ids=260685,266085'
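The api_key in those requests is simply the hex MD5 digest of "username:password"; it can be computed on its own to rule out shell-quoting problems (a sketch, using the example credentials from above):

```shell
# Fever's api_key is md5("username:password") as lowercase hex.
# Note: printf avoids the trailing newline that echo would add,
# which would change the digest.
printf '%s' 'user:pass' | md5sum | awk '{print $1}'
```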
I'm not sure, how about you run curl directly in bash:
Thanks for the hint! I’ve cut the responses after some characters to make the comment more readable:
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&feeds'
{"api_version":3,"auth":1,"last_refreshed_on_time":1618387652,"feeds":[{"id":8,"favicon_id":0…
Response looks good.
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&saved_item_ids'
{"api_version":3,"auth":1,"last_refreshed_on_time":1618387667,"saved_item_ids":"12274"}
LGTM
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&unread_item_ids'
{"api_version":3,"auth":1,"last_refreshed_on_time":1618387679,"unread_item_ids":"260685,266085,324987…
LGTM
curl -H'User-Agent: Emacs Elfeed 3.3.0' -XPOST -d api_key=$(echo -n 'user:pass' | md5sum | awk '{print $1}') 'https://news.example.com/fever/?api&items&with_ids=260685,266085'
{"api_version":3,"auth":1,"last_refreshed_on_time":1618387710,"items":[{"id":260685,"feed_id":287,"title":…
This looks also like a correct, authenticated response.
Is there maybe a verbose option so that all curl commands get displayed in the log buffer? These curl commands don’t seem to be the root of my issue, so I’d like to debug further.
Yes, all the curl commands executed correctly. curl error 126 means “command not executable”, which is strange, because elfeed-protocol fetched the feeds and unread ids with curl just now, so curl clearly is executable. So I guess your elfeed-protocol-fever-feeds is not empty. You could execute the following code to check whether elfeed-curl-enqueue is executable:
(elfeed-curl-enqueue "http://news.example.com" (lambda (status) (message "%s" (buffer-string))))
Besides, you could show more elfeed-protocol logs with:
(setq elfeed-protocol-log-trace t)
(setq elfeed-protocol-fever-maxsize 5)
I don't use macOS, so you will have to debug it yourself. For example, you could check whether elfeed works without elfeed-protocol, and append debug code under elfeed-curl-retrieve. Good luck:
(elfeed-log 'debug "curl args: %s" args)
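If editing elfeed's source is inconvenient, a hypothetical alternative is to attach the logging via advice instead (a sketch; it assumes elfeed builds its final argument list in the internal helper elfeed-curl--args, so adjust the function name to match your elfeed version):

```elisp
;; Hypothetical: log the curl argument list before each request,
;; without modifying elfeed itself. Assumes elfeed-curl--args
;; returns the argument list passed to the curl process.
(advice-add 'elfeed-curl--args :filter-return
            (lambda (args)
              (elfeed-log 'debug "curl args: %S" args)
              args))
```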
@fasheng Was there some update I’m unaware of that took care of this issue, or what is the reason why this issue was closed as completed without comment?
I have no idea about this issue; miniflux+fever+curl works fine for me every day. If there is no activity for a long time and I cannot reproduce the problem on my machine, I close the issue. If you find anything new in the debugging work mentioned before, just append a comment.
I'm afraid you must debug it yourself:
emacs -Q
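A hypothetical minimal setup for such an emacs -Q run, so that personal configuration can be ruled out as the cause (a sketch; the file name and load paths are placeholders for your install locations):

```elisp
;; ~/minimal-elfeed.el -- load only elfeed and elfeed-protocol.
(add-to-list 'load-path "~/path/to/elfeed")
(add-to-list 'load-path "~/path/to/elfeed-protocol")
(require 'elfeed)
(require 'elfeed-protocol)
(elfeed-protocol-enable)
;; then run: M-x elfeed, followed by M-x elfeed-update
```

Started with `emacs -Q -l ~/minimal-elfeed.el`, any remaining slowness cannot come from the rest of your init file.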
OK, thanks for the clarification, and the input. I appreciate it.
Environment:
- macOS Big Sur 11.0.1
- Emacs 27.1
- elfeed 3.3.0
- elfeed-protocol 0.8.0
- curl 7.64.1
elfeed-protocol-xxx-feeds empty: Probably not, but how can I check this when Emacs does not react to user interaction once elfeed-protocol is running?

elfeed-log: It is very hard to collect any logs, because Emacs does not react during sync, but from what I can see the Fever API is handled correctly. get entries is called repeatedly with different entry IDs. Then at some point retrieve is called, which hangs for several minutes. I had to cut off the log output, because GitHub doesn’t allow such large comments.
error backtrace: No error is present; this is a sync performance issue. As far as I understand, elfeed keeps one file for each entry. After running elfeed-update with elfeed-protocol integration for the whole night, then doing a wc on ~/.elfeed/data, there were about 6000 files/entries.

Miniflux is an RSS client that focuses on performance. It can easily handle tens of thousands of entries. A typical client syncs my setup in minutes. I expect that a first sync with 100,000 entries should take no longer than 30 minutes.
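For scale, a back-of-the-envelope using the ~50 entries per minute measured earlier in this thread:

```shell
# At the observed rate, how long would a full first sync take?
rate=50            # entries per minute, as measured above
entries=100000
minutes=$(( entries / rate ))
echo "$minutes minutes (about $(( minutes / 60 )) hours)"   # -> 2000 minutes (about 33 hours)
```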