Closed: vosian closed 2 years ago
I tried another site and got the same problem: morss only grabs one "article". Command used: morss --items "//*[class=bz_comment]" "https://bugzilla.kernel.org/show_bug.cgi?id=60824"
Maybe try setting up caching (sqlite will probably do if you have a small installation) and/or increasing MAX_TIME/MAX_ITEM? See https://git.pictuga.com/pictuga/morss#environment-variables
Adding CACHE=sqlite MAX_ITEM=10 MAX_TIME=120 changed nothing; I'm still getting a single article.
I added DEBUG=1 as well (CACHE=sqlite MAX_ITEM=10 MAX_TIME=120 DEBUG=1 morss --items "//*[class=bz_comment]" "https://bugzilla.kernel.org/show_bug.cgi?id=60824") and got the following:
error.txt
It caught my attention that there are 171 lines of "dropped". I checked the site, and by running document.getElementsByClassName("bz_comment") in the browser I could see that there are 172 comments in total, so I figure each "dropped" represents an "article" that's being ignored for some reason.
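The comparison above can be repeated without counting by hand. This is only a minimal sketch: it assumes the debug output was saved to a file (error.txt is the attachment name used here) and that each dropped item produces exactly one line containing the word "dropped". The helper name count_dropped is made up for illustration and is not part of morss.

```shell
#!/bin/sh
# Hypothetical helper, not part of morss: count the lines of a saved
# debug log that mention "dropped".
# Assumption: one "dropped" line per dropped item.
count_dropped() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches (it still prints 0), so "|| true" keeps the helper
    # from reporting a failure in that case.
    grep -c 'dropped' "$1" || true
}

# Usage, with the log file from this thread:
#   count_dropped error.txt
```

If the printed number plus the items that made it into the feed equals the element count reported by the browser, that would support the guess that every "dropped" line corresponds to one ignored item.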
morss --items "/html/body/div[4]/div/main/div[2]/div/div[2]/div/div/div/div/h4/a" "https://github.com/pictuga/morss/tags"
This also gives a single result. I don't know if I'm messing something up on my side, but I don't understand why morss is dropping all but one entry for me; as far as I can see, no setting should force it to take only one.
While I'm aware that the following site has an RSS feed, I tried morss directly on the site to test the issue I described. Here as well I'm getting a single article and several "dropped" notices.
morss --items "/html/body/div[1]/div[5]/div/div/div[1]/div/div[2]/div/div/div/div/div[1]/div/div[2]/div[1]/h2" https://pcsx2.net/
It might be worth noting that every single site I've tested morss on has returned only a single article, and I'm at my wits' end trying to find out what I'm doing wrong.
Have you tried with SQLITE_PATH? The default path is in-memory and is therefore cleared every time.
Also, have you checked what happens when adding --proxy?
SQLITE_PATH has no effect on entries being dropped. However, when using --proxy it seems no entries are dropped, so in the case of https://pcsx2.net/ it picks up 5 articles.
Maybe after a certain character limit it's dropping everything else you throw at it? Just a baseless guess.
Trying this again after a long time (I did update morss), the articles seem to be fetched correctly, so this can be considered fixed.
Hello, and sorry if it's a mistake on my part, but when trying to make an RSS feed for https://shonumi.github.io/articles.html, morss only ends up grabbing the first article. I'm using the following command:
morss --items "//*[class=inner_text_large]" https://shonumi.github.io/articles.html
Using the website and selecting that element selects 5 articles. For some reason the CLI version is stopping at the first one.
My version is current as I installed morss today.